I'm trying to prove that for the least squares problem $x^* = \arg\min_{x \in \mathbb{R}^n} \frac{1}{2} \| Ax - b \|_2^2$, gradient descent with the update rule $x^{t+1} = x^t - \frac{1}{\sigma_{\max}(A)^2} \nabla f(x^t)$ converges as $\|x^{t+1} - x^*\|_2 \le \Big(1- \frac{\sigma_{\min}(A)^2}{\sigma_{\max}(A)^2}\Big) \|x^t - x^* \|_2$, where $\sigma_{\max}$ and $\sigma_{\min}$ denote the largest and smallest singular values of the argument matrix and $\|\cdot\|_2$ is the Euclidean norm.
I already computed $\nabla f(x) = A^tAx -b^tA$ and $\sigma_{max}(I - \alpha A^tA) = 1- \alpha \sigma_{min}(A)^2$.
I'm stuck with the term $\|x^{t+1} - x^*\| = \|(I-\alpha A^t A) x - b A - A^{-1}b\|$. How do I get rid of the constant terms that are being subtracted?
Your gradient should be $A^T A x - A^T b$. From there your expression becomes
$$ \| x^{t+1} - x^{\ast} \| = \| (I - \alpha A^T A) x^t - x^{\ast} + \alpha A^T b\| $$
Note that the optimality condition $\nabla f(x^{\ast}) = 0$ gives $A^T A x^{\ast} = A^T b$ (the normal equations), so you can actually collect
$$ - x^{\ast} + \alpha A^T b = -x^{\ast} + \alpha A^T A x^{\ast} = (\alpha A^{T} A - I) x^{\ast}. $$
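Putting the pieces together, $x^{t+1} - x^{\ast} = (I - \alpha A^T A)(x^t - x^{\ast})$, and with $\alpha = 1/\sigma_{\max}(A)^2$ you have $\|I - \alpha A^T A\|_2 = 1 - \sigma_{\min}(A)^2/\sigma_{\max}(A)^2$, which yields the claimed bound. If it helps, here is a quick numerical sanity check of the contraction (a sketch using NumPy; the matrix shape, seed, and iteration count are arbitrary choices, and $A$ is assumed to have full column rank):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)

# Least-squares solution x*, i.e. the solution of the normal equations A^T A x* = A^T b
x_star = np.linalg.lstsq(A, b, rcond=None)[0]

sigma = np.linalg.svd(A, compute_uv=False)      # singular values, descending
alpha = 1.0 / sigma[0] ** 2                     # step size 1 / sigma_max(A)^2
rate = 1.0 - sigma[-1] ** 2 / sigma[0] ** 2     # contraction factor 1 - sigma_min^2 / sigma_max^2

x = np.zeros(A.shape[1])
for _ in range(300):
    err_before = np.linalg.norm(x - x_star)
    x = x - alpha * (A.T @ A @ x - A.T @ b)     # gradient step with gradient A^T A x - A^T b
    err_after = np.linalg.norm(x - x_star)
    # per-step contraction, with a tiny slack for floating-point rounding
    assert err_after <= rate * err_before + 1e-12
```

Each iteration shrinks the error by at least the factor $1 - \sigma_{\min}(A)^2/\sigma_{\max}(A)^2$, so the iterates converge linearly to $x^{\ast}$ whenever $A$ has full column rank ($\sigma_{\min}(A) > 0$).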