Minimizing a quadratic function using gradient descent

7.3k Views Asked by At

I have the following function

$$f(x,y) = (x + y − 1)^2 + (x − 2y)^2 + (2x − 4y + 3)^2$$

Could you suggest the best way to choose the step-size? And is $f(x,y)$ strongly convex? Thank you very much.

2

There are 2 best solutions below

0
On

Suppose we are given a convex quadratic function

$$f (\mathrm x) := \frac 12 \mathrm x^\top \mathrm A \,\mathrm x - \mathrm b^\top \mathrm x + c$$

where matrix $\rm A$ is symmetric and positive semidefinite. The gradient of $f$ is

$$\nabla f (\mathrm x) = \mathrm A \mathrm x - \mathrm b$$

Using gradient descent with step $\mu$,

$$\begin{array}{rl}\mathrm x_{k+1} &= \mathrm x_k - \mu \nabla f (\mathrm x_k)\\ &= \mathrm x_k - \mu \left( \mathrm A \mathrm x_k - \mathrm b \right)\\ &= \left( \mathrm I - \mu \mathrm A \right) \mathrm x_k + \mu \mathrm b\end{array}$$

Lastly, choose step $\mu$ such that the (real) eigenvalues of $\mathrm I - \mu \mathrm A$ are in the open interval $(-1,1)$.

0
On

Your function can be written as, $$f(x, y) = \begin{bmatrix}x \\ y\end{bmatrix}^{\top}\begin{bmatrix}6 & -9 \\ -9 & 21\end{bmatrix}\begin{bmatrix}x \\ y\end{bmatrix} + \begin{bmatrix}10 \\ -26\end{bmatrix}^{\top}\begin{bmatrix}x \\ y\end{bmatrix} + 10$$

The Hessian of this function is, $$\nabla^2f = 2\begin{bmatrix}6 & -9 \\ -9 & 21\end{bmatrix}$$

This Hessian is positive definite (quite easy to prove using Sylvester's Criterion or just computing the eigenvalues). Therefore, yes, your function is strongly convex.

Instead of using gradient descent, you can directly solve by taking the gradient and setting it to zero, $$\nabla f = 2\begin{bmatrix}6 & -9 \\ -9 & 21\end{bmatrix}\begin{bmatrix}x \\ y\end{bmatrix} + \begin{bmatrix}10 \\ -26\end{bmatrix}$$

Setting this to zero would give us, $$\begin{bmatrix}x \\ y\end{bmatrix} = \frac{1}{2}\begin{bmatrix}6 & -9 \\ -9 & 21\end{bmatrix}^{-1}\begin{bmatrix}-10 \\ 26\end{bmatrix} = \frac{1}{15}\begin{bmatrix}4 \\ 11\end{bmatrix}$$