Gradient Steepest Descent


In the book I am currently reading, steepest descent is described as follows. Consider the quadratic problem $$\min_{\mathbf{x}} \frac{1}{2}x'Qx - x'b,$$ where $Q$ must be positive semidefinite. The first-order condition is then $$Qx^* = b.$$ As usual for gradient search, iterate $$x \leftarrow x - a \cdot g$$ with gradient $$g = Qx - b.$$ Minimizing along the search direction gives the optimal step length: $$a = \underset{a}{\operatorname{argmin}} \; f(x - ag) = \frac{g'g}{g'Qg}.$$ Thus the steps are eigenvectors of $Q$, which are orthogonal to each other; this shows up in the typical "zig-zagging" of steepest descent.
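To make sure I have understood the iteration correctly, here is a minimal sketch of it in NumPy. The values of $Q$, $b$, and the starting point are just examples I picked, not from the book:

```python
import numpy as np

# Sketch of the steepest-descent iteration described above, applied to
# f(x) = 0.5 x'Qx - x'b with the exact line search a = g'g / (g'Qg).
Q = np.array([[2.0, 1.0], [1.0, 2.0]])  # example positive definite matrix
b = np.array([1.0, 1.0])

x = np.array([3.0, -2.0])  # arbitrary starting point
for _ in range(50):
    g = Q @ x - b                 # gradient of f at the current x
    if np.linalg.norm(g) < 1e-12:
        break
    a = (g @ g) / (g @ Q @ g)     # optimal step length along -g
    x = x - a * g

x_star = np.linalg.solve(Q, b)    # solution of the first-order condition Qx* = b
print(x, x_star)
```

The iterate does approach the solution of $Qx^* = b$, so my reading of the update rule at least seems consistent with the first-order condition.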

Related to this, I have three questions:

  1. What is meant by $Q$? Given the function $x_1^2 + x_1 x_2 + x_2^2$, is it the matrix in the restated problem $\begin{bmatrix} x_1 & x_2 \end{bmatrix} \cdot \begin{bmatrix} 1 & 0.5 \\ 0.5 & 1 \end{bmatrix} \cdot \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$? But for which functions does that work? Only for quadratics, or can I rewrite any (convex) function like that?
  2. I cannot see how the eigenvectors of Q are involved. Can anyone show how we get from the optimal a to the eigenvectors of Q?
  3. I was told that steepest descent implicitly contains the assumption that the Hessian matrix of the function is rather stable. How is that implied in this approach?
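To make question 1 concrete, here is a quick numerical check that the quadratic form with the symmetric matrix I wrote above really reproduces the polynomial (note that the book's objective carries a factor $\frac{1}{2}$, so its $Q$ would presumably be twice this matrix):

```python
import numpy as np

# Check whether x'Qx with my proposed symmetric matrix equals
# x1^2 + x1*x2 + x2^2 at a few random points.
Q = np.array([[1.0, 0.5], [0.5, 1.0]])

rng = np.random.default_rng(0)
for _ in range(5):
    x = rng.standard_normal(2)
    quad_form = x @ Q @ x                       # x'Qx
    poly = x[0]**2 + x[0]*x[1] + x[1]**2        # the original polynomial
    assert np.isclose(quad_form, poly)
print("quadratic form matches the polynomial")
```

So at least for this example the rewriting works; my question is whether this construction is limited to quadratic functions.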