I have implemented the Chambolle-Pock algorithm, i.e., for a problem of the form
$$\min_x \, \|Kx\| + \delta_H (x),$$
where $H = \{x: Ax = b\}$. The iteration scheme requires computing the proximal mapping of $\delta_H$, which is just the projection onto the affine set $H$. This in turn requires the pseudoinverse of $A$, i.e., computing $(AA^T)^{-1}$. That poses an issue, since this computation is far more expensive than every other step in my algorithm. I thought about computing $\operatorname{Prox}_{\delta_H}$ via the Moreau decomposition, but that just replaces the projection onto $H$ with the projection onto the complementary subspace, which is just as expensive and doesn't help at all. My question is: is there a way to avoid this issue, either by computing the projection more efficiently or by avoiding it completely?
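For concreteness, here is a sketch of the expensive step I mean (`A`, `b` are illustrative dense NumPy arrays; in my actual problem the linear solve with $AA^T$ dominates everything else):

```python
import numpy as np

def proj_H(x, A, b):
    """Projection onto H = {x : Ax = b}:
    P_H(x) = x - A^T (A A^T)^{-1} (A x - b).
    The linear solve with A A^T is the costly step."""
    return x - A.T @ np.linalg.solve(A @ A.T, A @ x - b)
```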
Thanks in advance!
You are right: you do not need to project explicitly onto $H$.
We can rewrite your problem as $$\min_x f(Kx) + g(Ax),$$ where $f(u)= \|u\|$ and $g(v) = \delta_{\{b\}}(v)$. Let's now use the Fenchel-Legendre transformation to obtain a saddle point problem: $$\min_x \max_{y_1, y_2} \langle Kx, y_1\rangle - f^*(y_1) + \langle Ax, y_2\rangle - g^*(y_2). $$ For simplicity we introduce the notation $y = (y_1^T, y_2^T)^T$, $B = [K^T, A^T]^T$, and $h(y) = f^*(y_1)+g^*(y_2)$. Then the above problem can be cast as $$\min_x \max_y \langle Bx,y\rangle - h(y).$$ Notice that $\text{Prox}_{h} y = (\text{Prox}_{f^*}y_1, \text{Prox}_{g^*}y_2)$, and you can use the aforementioned Moreau decomposition to obtain an explicit formula. Now the implementation of the Chambolle-Pock algorithm should be straightforward.
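For instance, assuming $\|\cdot\|$ is the Euclidean norm (so $f^*$ is the indicator of the unit ball and $g^*(y_2) = \langle b, y_2\rangle$), the blockwise prox of $h$ can be sketched as:

```python
import numpy as np

def prox_h(y1, y2, sigma, b):
    """Blockwise prox of h(y) = f*(y1) + g*(y2)."""
    # f* is the indicator of the unit ball, so Prox_{sigma f*}
    # is the projection onto that ball (independent of sigma).
    y1 = y1 / max(1.0, np.linalg.norm(y1))
    # g*(y2) = <b, y2>, hence Prox_{sigma g*}(y2) = y2 - sigma*b.
    y2 = y2 - sigma * b
    return y1, y2
```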
It will need four matrix-vector multiplications per iteration: two with $K$ and two with $A$. The algorithm takes the form: \begin{align*}x^{n+1} & = x^n - \tau B^T y^n\\ y^{n+1} & = \text{Prox}_{\sigma h} (y^n+\sigma B(2x^{n+1}-x^n)), \end{align*} where the latter equation can be written more explicitly as \begin{align*} y^{n+1}_1 & = \text{Prox}_{\sigma f^*}( y^n_1 +\sigma K(2x^{n+1}-x^n)) \\ y^{n+1}_2 & = y^n_2+\sigma A(2x^{n+1}-x^n) -\sigma b. \end{align*} Here we used that $\text{Prox}_{\sigma g^*}v =v-\sigma b$. Hence, the only thing you have to do now is use the Moreau decomposition to compute $\text{Prox}_{\sigma f^*}$. Of course, to ensure convergence you have to choose the steps $\tau$ and $\sigma$ properly: $\tau \sigma \|B\|^2 <1$. Alternatively, you can use an extension of the Chambolle-Pock algorithm which does not require computing $\|B\|$.
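Putting the pieces together, a minimal sketch of the full scheme (assuming $\|\cdot\|$ is the Euclidean norm; `K`, `A`, `b` are dense NumPy arrays, and $B$ is never formed explicitly) could look like:

```python
import numpy as np

def chambolle_pock(K, A, b, n_iter=5000):
    """Sketch of the scheme above for min_x ||Kx|| + delta_{Ax=b}.
    Only matvecs with K, A and their transposes are used."""
    m1, n = K.shape
    m2 = A.shape[0]
    # Step sizes: tau*sigma*||B||^2 < 1, with ||B||^2 <= ||K||^2 + ||A||^2.
    L2 = np.linalg.norm(K, 2) ** 2 + np.linalg.norm(A, 2) ** 2
    tau = sigma = 0.99 / np.sqrt(L2)
    x = np.zeros(n)
    y1 = np.zeros(m1)
    y2 = np.zeros(m2)
    for _ in range(n_iter):
        x_new = x - tau * (K.T @ y1 + A.T @ y2)   # primal step
        xbar = 2 * x_new - x                       # extrapolation
        y1 = y1 + sigma * (K @ xbar)
        y1 = y1 / max(1.0, np.linalg.norm(y1))     # Prox_{sigma f*}
        y2 = y2 + sigma * (A @ xbar) - sigma * b   # Prox_{sigma g*}
        x = x_new
    return x
```

On a tiny test problem, e.g. $K = I$, $A = (1, 1)$, $b = 2$ (the minimum-norm point on $x_1 + x_2 = 2$), the iterates approach $(1, 1)$.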