Solving $\mathbf{Bx}=\mathbf{z}$ with $\mathbf{B^T B} = \mathbf{A^T A} + \lambda \mathbf{I}$ for large matrix $\mathbf{A}$

60 Views Asked by At

I come across a problem to computationally solve $\mathbf{x}$ from $\mathbf{Bx}=\mathbf{z}$ with $\mathbf{B^T B} = \mathbf{A^T A} + \lambda \mathbf{I}$ for large matrix $\mathbf{A}$. The variable $\mathbf{z}$ is a known $n \times 1$ vector, $\mathbf{A}$ is an $n \times n$ matrix and $\lambda$ is a real number. The problem is that the matrix $\mathbf{A}$ is very large so it is almost impossible to save every elements of $\mathbf{A}$, but we can compute $\mathbf{Ax}$ and $\mathbf{A^T x}$ for every $n\times 1$ vector of $\mathbf{x}$.

If $\lambda = 0$ (such that $\mathbf{B} = \mathbf{A}$), I can solve it by minimizing $\mathcal{L}(\mathbf{x}) = \frac{1}{2} \| \mathbf{z} - \mathbf{Ax} \|^2$ using the family of gradient descent algorithms, with gradient $\nabla_{\mathbf{X}} \mathcal{L} = -\mathbf{A^T} (\mathbf{z} - \mathbf{Ax})$. I tried applying the same method with $\lambda \neq 0$, and obtain the gradient, $\nabla_{\mathbf{X}} \mathcal{L} = -\mathbf{B^T} \mathbf{z} + \mathbf{B^T Bx}$. I can calculate the second term on the right hand side of the equation, but I got stuck on calculating the $\mathbf{B^T} \mathbf{z}$ term. Any ideas on solving this problem?