Minimize quadratic function with gradient descent method


Could anyone help me with this question, please?

This came up on an exam, and I didn't know how to solve it.

$A \in \mathbb{R}^{n \times n}$ is a symmetric positive definite matrix, $b \in \mathbb{R}^{n\times 1}$, and we have $f(x) = \dfrac{1}{2}x^TAx -b^Tx$. Given a point $x^0 \in \mathbb{R}^{n \times 1}$, show that the gradient descent method given by $$ x^{k+1}=x^k -\left(||A||_{2}\right)^{-1} \nabla f(x^k)$$ converges to the global minimum.

Here $|| \cdot||_2$ is the matrix norm induced by the Euclidean norm (the spectral norm).

Here is what I tried:

$|| A||_2 = || A|| = \sup\{||Ax||:||x|| =1 \}$. Since $A$ is symmetric positive definite, it has positive real eigenvalues $0 < \lambda_1 \leq \lambda_2 \leq \cdots\leq \lambda_n$ and there is an orthonormal basis of $\mathbb{R}^n$ formed by eigenvectors $v_1, v_2, \ldots, v_n$ of $A$ associated with the respective eigenvalues $\lambda_i$. Since $||Av_n|| = ||\lambda_n v_n|| = |\lambda_n| \;||v_n|| = \lambda_n$, I get $||A|| = \lambda_n =: \lambda$, and so $||A||^{-1}=\dfrac{1}{\lambda}$. The iteration then becomes $$ x^{k+1}=x^k -\dfrac{1}{\lambda} \nabla f(x^k). $$
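To sanity-check this iteration numerically (my own illustration, not part of the exam; the matrix $A$, vector $b$, and starting point below are made up), one can take a small SPD matrix, use the step size $1/\lambda_n = 1/\Vert A \Vert_2$, and watch the iterates approach $A^{-1}b$:

```python
import numpy as np

# A small symmetric positive definite example (hypothetical data).
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

# For an SPD matrix, ||A||_2 equals its largest eigenvalue.
# eigvalsh returns eigenvalues in ascending order.
lam_max = np.linalg.eigvalsh(A)[-1]

x = np.zeros(2)              # starting point x^0
for _ in range(500):
    grad = A @ x - b         # ∇f(x) = Ax - b
    x = x - grad / lam_max   # x^{k+1} = x^k - (1/λ) ∇f(x^k)

x_star = np.linalg.solve(A, b)    # the unique minimizer A^{-1} b
print(np.allclose(x, x_star))     # the iterates converge to x*
```

After 500 steps the iterate agrees with $A^{-1}b$ to machine precision, consistent with the claim to be proved.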

I argued that $f$ has a unique local minimum, which is therefore global, and that the minimizer is $x^{*} =A^{-1}b$ (since $\nabla f(x) = Ax - b$).

But I could not show that the sequence above converges to $x^{*}$.

Then I thought I might not need to show convergence to $x^{*}$ directly: it would be enough to show that the sequence converges at all, since the uniqueness of the (global) minimizer would then identify the limit.

But I could not make this work either. It reminded me of a contraction argument. I also tried telescoping the recurrence: $$ x^{k+1} - x^0 = -\dfrac{1}{\lambda}\left( \nabla f(x^0) + \nabla f(x^1) + \cdots +\nabla f(x^k) \right), $$ but I could not move forward.

I also considered $||x^{k+1}-x^{k} || = ||-\frac{1}{\lambda} \nabla f(x^k) || = \frac{1}{\lambda} || \nabla f(x^k) ||$, which again suggested a contraction, but I could not finish that argument either; it would only help if the constant were less than $1$, and I could not guarantee that.

Could you help me? The exam is already over, but I would like to see the solution.

PS: I found this, but I can't understand it: https://en.wikipedia.org/wiki/Modified_Richardson_iteration

There is 1 answer below.

Taylor expanding $ f\big(x^{k+1} \big) = f \Big(x^k - \Vert A \Vert_2^{-1} \nabla f\big(x^k\big) \Big) $ up to second order gives an exact representation (recall that $f$ is quadratic, so the expansion has no remainder): \begin{align} f\big(x^{k+1} \big) &= f \Big(x^k - \Vert A \Vert_2^{-1} \nabla f\big(x^k\big) \Big) \\ &= f\big(x^k \big) + \nabla f\big(x^k \big)^T \Big[ - \Vert A \Vert_2^{-1} \nabla f\big(x^k\big)\Big] + \frac12 \Vert A \Vert_2^{-2} \nabla f\big(x^k\big)^T A \nabla f\big(x^k\big) \\ \Vert A \Vert_2 \Big[f\big(x^{k+1} \big) - f\big(x^k \big) \Big] &= \nabla f\big(x^k \big)^T \Big[ - I \nabla f\big(x^k\big)\Big] + \frac12 \Vert A \Vert_2^{-1} \nabla f\big(x^k\big)^T A \nabla f\big(x^k\big) \\ \Vert A \Vert_2 \Big[f\big(x^{k+1} \big) - f\big(x^k \big) \Big] &= \nabla f\big(x^k \big)^T \bigg[ \frac12 \Vert A \Vert_2^{-1} A - I \bigg] \nabla f\big(x^k\big). \end{align}
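Since $f$ is quadratic, this identity is exact, not just an approximation, which can be checked numerically (my own sketch; the matrix $A$, vector $b$, and iterate $x^k$ below are made up):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])        # hypothetical SPD matrix
b = np.array([1.0, 2.0])
f = lambda x: 0.5 * x @ A @ x - b @ x
grad = lambda x: A @ x - b        # ∇f(x) = Ax - b

norm_A = np.linalg.norm(A, 2)     # spectral norm ||A||_2
xk = np.array([0.7, -1.2])        # an arbitrary iterate x^k
xk1 = xk - grad(xk) / norm_A      # x^{k+1}

# Left- and right-hand sides of the derived identity.
lhs = norm_A * (f(xk1) - f(xk))
rhs = grad(xk) @ (0.5 * A / norm_A - np.eye(2)) @ grad(xk)
print(np.isclose(lhs, rhs))       # agreement to machine precision
```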

Now recall that for $\omega \in \mathbb R$, the matrix $\omega A - I$ has eigenvalues $\omega \lambda - 1$ for $\lambda \in \sigma(A)$. Thus in our case, $\frac12 \Vert A \Vert_2^{-1} A - I$ has eigenvalues $$\frac{\lambda}{2\Vert A \Vert_2} -1 = \frac{\lambda}{2 \lambda_\max } -1 \leq \frac12 - 1 = -\frac12. $$ Since all eigenvalues $\lambda$ of $A$ are real and positive ($A$ is s.p.d.), we now have that all eigenvalues of $\frac12 \Vert A \Vert_2^{-1} A - I$ are negative. Consequently, $\frac12 \Vert A \Vert_2^{-1} A - I$ is negative definite and $$ \nabla f\big(x^k \big)^T \bigg[ \frac12 \Vert A \Vert_2^{-1} A - I \bigg] \nabla f\big(x^k\big) < 0 \: \forall \: \nabla f\big(x^k \big) \neq 0. $$

Thus, $$ f\big(x^{k+1} \big) - f\big(x^k \big) < 0 \: \forall \: k \text{ with } \nabla f\big(x^k \big) \neq 0.$$ In other words, the function value strictly decreases unless the gradient is zero, which happens exactly at the unique minimizer. To conclude convergence, note that all eigenvalues of $\frac12 \Vert A \Vert_2^{-1} A - I$ are at most $-\frac12$, so $$ f\big(x^{k+1} \big) - f\big(x^k \big) \leq -\frac{1}{2\Vert A \Vert_2} \big\Vert \nabla f\big(x^k\big) \big\Vert^2. $$ Since $f$ is bounded below (it attains its minimum at $x^*$), these decreases sum to a finite quantity, forcing $\nabla f\big(x^k\big) = Ax^k - b \to 0$ and hence $x^k \to A^{-1}b = x^*$.
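The strict decrease of $f$ and the convergence to $x^* = A^{-1}b$ can both be observed numerically; a minimal sketch, with an assumed SPD matrix $A$, vector $b$, and starting point of my own choosing:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])          # assumed SPD example
b = np.array([1.0, 2.0])
f = lambda x: 0.5 * x @ A @ x - b @ x

lam_max = np.linalg.eigvalsh(A)[-1]  # ||A||_2 for an SPD matrix
x = np.array([5.0, -5.0])
values = [f(x)]
for _ in range(200):
    x = x - (A @ x - b) / lam_max    # one gradient descent step
    values.append(f(x))

# Away from the minimizer, f decreases strictly at every step
# (checked over the early iterations, before floating-point
# round-off makes consecutive values indistinguishable) ...
print(all(v1 > v2 for v1, v2 in zip(values[:20], values[1:21])))
# ... and the iterates settle at the unique minimizer x* = A^{-1} b.
print(np.allclose(x, np.linalg.solve(A, b)))
```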