I am reading this paper, where the PDHG algorithm is described in the following way:
Let $f,g,h$ be closed, proper and convex functions, and further let $x\in \mathbb{R}^n$, $y\in\mathbb{R}^m$, $z\in\mathbb{R}^l$, $A\in\mathbb{R}^{l\times n}$, $B\in\mathbb{R}^{l\times m}$ and $c\in\mathbb{R}^l$. These appear in a Lagrangian of the form $$ L(x,y,z) = f(x) + g(y) + \langle Ax+By -c,z\rangle - h(z). $$ In the paper the step sizes $\mu,\nu,\tau>0$ are chosen such that $$ 1 > \tau \mu \lambda_{\max}(A^TA) + \tau\nu\lambda_{\max}(B^TB) $$ holds, which is said to ensure convergence to a saddle point. The iteration scheme is $$ x^{k+1} = \underset{x}{\arg\min}\ L(x,y^k,z^k) + \frac{1}{2\mu} ||x-x^k||_2^2, \\ y^{k+1}=\underset{y}{\arg\min}\ L(x^k,y,z^k) + \frac{1}{2\nu} ||y-y^k||_2^2, \\ z^{k+1} = \underset{z}{\arg\max}\ L(2x^{k+1}-x^{k}, 2y^{k+1}-y^{k}, z) - \frac{1}{2\tau}||z - z^k||_2^2. $$
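To make the scheme concrete, here is a minimal numerical sketch of the iteration above (my own toy instance, not from the paper): I take $f(x)=\frac{1}{2}||x||_2^2$, $g(y)=\frac{1}{2}||y||_2^2$ and $h=0$, so all three subproblems have closed-form solutions, and pick step sizes satisfying the stated condition.

```python
import numpy as np

# Toy instance (my own choice): f(x) = ||x||^2/2, g(y) = ||y||^2/2, h = 0,
# so every subproblem of the scheme has a closed-form solution.
rng = np.random.default_rng(0)
n, m, l = 4, 3, 2
A = rng.standard_normal((l, n))
B = rng.standard_normal((l, m))
c = rng.standard_normal(l)

# Step sizes chosen to satisfy 1 > tau*mu*lmax(A^T A) + tau*nu*lmax(B^T B).
lA = np.linalg.eigvalsh(A.T @ A).max()
lB = np.linalg.eigvalsh(B.T @ B).max()
mu = nu = 0.5
tau = 0.9 / (mu * lA + nu * lB)
assert tau * mu * lA + tau * nu * lB < 1

x, y, z = np.zeros(n), np.zeros(m), np.zeros(l)
for _ in range(20000):
    # x-step: argmin_x ||x||^2/2 + <A x, z^k> + ||x - x^k||^2/(2 mu)
    x_new = (x - mu * A.T @ z) / (1 + mu)
    # y-step: argmin_y ||y||^2/2 + <B y, z^k> + ||y - y^k||^2/(2 nu)
    y_new = (y - nu * B.T @ z) / (1 + nu)
    # z-step at the extrapolated points; with h = 0 it is a plain ascent step
    xb, yb = 2 * x_new - x, 2 * y_new - y
    z = z + tau * (A @ xb + B @ yb - c)
    x, y = x_new, y_new

# At a saddle point: x = -A^T z, y = -B^T z, and A x + B y = c.
print(np.linalg.norm(A @ x + B @ y - c))  # feasibility residual, ~0 at convergence
```

The $z$-update uses the extrapolated points $2x^{k+1}-x^k$ and $2y^{k+1}-y^k$, while the $x$- and $y$-updates are Jacobi-style (both use the previous iterates), exactly as in the scheme above.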
What baffles me is that the algorithm is usually presented in a simpler form, i.e. without a second matrix $B$ and with only two step sizes $\tau,\mu > 0$, for the simpler Lagrangian
$$ L(x,y,z) = f(x) + g(y) + \langle Ax-y,z\rangle - h(z), $$ with the simpler step-size condition $$1 > \tau \mu \lambda_{\max}(A^TA).$$
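Note that the simpler form corresponds to the special case $B=-I$, $c=0$ of the general constraint $Ax+By=c$. Since $\lambda_{\max}(B^TB)=1$ for $B=-I$, the general bound then reads $1 > \tau\mu\lambda_{\max}(A^TA) + \tau\nu$, which admits strictly smaller $\tau$ than the two-block bound for any $\nu>0$; a quick numerical comparison (my own, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 5))
lA = np.linalg.eigvalsh(A.T @ A).max()  # lmax(A^T A)

# Largest tau allowed (up to a safety factor) by the two-block bound:
# tau * mu * lmax(A^T A) < 1.
mu = 0.5
tau_simple = 0.99 / (mu * lA)

# Three-block bound specialized to B = -I, where lmax(B^T B) = 1:
# tau * (mu * lmax(A^T A) + nu) < 1.
nu = 0.5
tau_general = 0.99 / (mu * lA + nu)

print(tau_simple, tau_general)  # the general bound is more restrictive
```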
I would like to know how the step-size condition for the general case arises; I could not find a derivation of it anywhere.