Boyd & Vandenberghe, Problem 9.6 — Gradient descent step for a quadratic function


The problem in Boyd & Vandenberghe's *Convex Optimization* looks very simple: compute the gradient descent iterates for a quadratic function. However, the given solution is confusing. How can we directly obtain the equation (the yellow part in the picture below) without any further information?

[link to the problem picture]

1 Answer

The question is why
$$ \begin{bmatrix}x_1^{(k)}\\x_2^{(k)}\end{bmatrix}= \left(\frac{\gamma-1}{\gamma+1}\right)^k\begin{bmatrix}\gamma\\(-1)^k\end{bmatrix}. $$

It holds because at every iteration the step size $t$ is chosen optimally by the exact line search. Consider the first step from the initial point $(\gamma,1)$. The gradient there is $\nabla f(\gamma,1)=(\gamma,\gamma)$, so the new point is
$$ \begin{bmatrix}x_1^{(1)}\\x_2^{(1)}\end{bmatrix}= \begin{bmatrix}(1-t)\gamma\\1-\gamma t\end{bmatrix}. $$
To pick the optimal $t$, we minimise
$$ f\bigl(x^{(1)}\bigr)=\frac12\left(x_1^2+\gamma x_2^2\right)=\frac12(1-t)^2\gamma^2+\frac12\gamma(1-\gamma t)^2. $$
Differentiating with respect to $t$ and setting the derivative to zero gives
$$ t_{\mathrm{opt}}=\frac{2}{\gamma+1}, \qquad 1-t_{\mathrm{opt}}=\frac{\gamma-1}{\gamma+1},\qquad 1-\gamma t_{\mathrm{opt}}=-\frac{\gamma-1}{\gamma+1}, $$
so $x^{(1)}=\frac{\gamma-1}{\gamma+1}\,(\gamma,-1)$.

The same computation repeats at every step: the $k$-th iterate is a scalar multiple of $(\gamma,(-1)^k)$, so its gradient is that same scalar times $(\gamma,(-1)^k\gamma)$, the same $t_{\mathrm{opt}}=\frac{2}{\gamma+1}$ is optimal, and each step multiplies the common factor by $\frac{\gamma-1}{\gamma+1}$ while flipping the sign of $x_2$. By induction you get the yellow formula.
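As a quick numerical sanity check (not part of the book's solution; the function names below are my own), one can run gradient descent with exact line search on $f(x)=\frac12(x_1^2+\gamma x_2^2)$ and compare the iterates against the closed form. For a quadratic $f(x)=\frac12 x^\top P x$, the exact line-search step is $t=\frac{g^\top g}{g^\top P g}$ with $g=\nabla f(x)$:

```python
import numpy as np

def grad_descent_exact(gamma, k_steps):
    """Run k_steps of gradient descent with exact line search
    on f(x) = 0.5*(x1^2 + gamma*x2^2), starting from (gamma, 1)."""
    x = np.array([gamma, 1.0])
    for _ in range(k_steps):
        g = np.array([x[0], gamma * x[1]])      # gradient: (x1, gamma*x2)
        # exact line search for a quadratic: t = g^T g / (g^T P g),
        # where P = diag(1, gamma)
        t = (g @ g) / (g[0] ** 2 + gamma * g[1] ** 2)
        x = x - t * g
    return x

def closed_form(gamma, k):
    """The claimed closed form for the k-th iterate."""
    r = (gamma - 1) / (gamma + 1)
    return np.array([gamma * r ** k, (-1) ** k * r ** k])

# the two agree for every k (here up to k = 5, gamma = 10)
gamma = 10.0
for k in range(6):
    assert np.allclose(grad_descent_exact(gamma, k), closed_form(gamma, k))
```

The first step illustrates the derivation above: for $\gamma=10$ the computed $t$ is $2/11=\frac{2}{\gamma+1}$, and the iterate lands on $\frac{9}{11}(\gamma,-1)$.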