Nocedal 3.25. Calculate $\frac{\partial \phi}{\partial \alpha}$ where $\phi(\alpha) = f(x - \alpha \nabla f)$ and $f(x) = \frac{1}{2} x^T Qx - b^T x$

29 Views Asked by At

I'm trying to understand the calculation of the following equation 3.25 in the textbook "Numerical Optimization by Nocedal and Wright":

Nocedal 3.25

Here is my work:

\begin{gather*} f: \mathbb{R}^n \to \mathbb{R}, \quad x,b \in \mathbb{R}^n, \quad Q \in \mathbb{R}^{n \times n} \\ f(x) = \frac{1}{2} x^T Qx - b^T x \\ \end{gather*}

where $Q$ is symmetric and positive definite. Then the gradient is:

\begin{gather*} \nabla f(x) = Qx - b \\ \end{gather*}

Let:

\begin{gather*} \phi(\alpha) = f(x - \alpha \nabla f) = \frac{1}{2} (x - \alpha \nabla f)^T Q(x - \alpha \nabla f) - b^T (x - \alpha \nabla f) \\ \end{gather*}

Calculate the partial derivative $\frac{\partial \phi}{\partial \alpha}$:

\begin{gather*} \frac{\partial \phi}{\partial \alpha} = - \frac{1}{2} \nabla f^T Q(x - \alpha \nabla f) - \frac{1}{2} (x - \alpha \nabla f)^T Q \nabla f + b^T \nabla f \\ = - \frac{1}{2} \nabla f^T (Qx) + \frac{1}{2} \alpha (\nabla f)^T (Q \nabla f) - \frac{1}{2} x^T (Q \nabla f) + \frac{1}{2} \alpha (\nabla f)^T (Q \nabla f) + b^T \nabla f \\ = - \frac{1}{2} \nabla f^T (Qx) + \alpha (\nabla f)^T (Q \nabla f) - \frac{1}{2} x^T (Q \nabla f) + b^T \nabla f \\ \end{gather*}

Set $\frac{\partial \phi}{\partial \alpha} = 0$ and solve for $\alpha$:

\begin{gather*} \alpha = \frac{\frac{1}{2} \nabla f^T (Qx) + \frac{1}{2} x^T (Q \nabla f) - b^T \nabla f}{(\nabla f)^T (Q \nabla f)} \\ \end{gather*}

The textbook gets the solution:

\begin{gather*} \alpha = \frac{\nabla f^T \nabla f}{(\nabla f)^T (Q \nabla f)} \\ \end{gather*}

where did I go wrong?

1

There are 1 best solutions below

0
On BEST ANSWER

Both of you are right.

Note that $Q$ is symmetrical, \begin{align} \alpha &= \frac{\frac{1}{2} \nabla f^T (Qx) + \frac{1}{2} x^T (Q \nabla f) - b^T \nabla f}{(\nabla f)^T (Q \nabla f)} \\ &=\frac{\frac{1}{2} x^T (Q\nabla f) + \frac{1}{2} x^T (Q \nabla f) - b^T \nabla f}{(\nabla f)^T (Q \nabla f)} \\ &=\frac{ x^T (Q \nabla f) - b^T \nabla f}{(\nabla f)^T (Q \nabla f)} \\ &= \frac{(Qx-b)^T\nabla f}{(\nabla f)^T (Q \nabla f)}\\ &= \frac{\nabla f^T\nabla f}{(\nabla f)^T (Q \nabla f)}\\ \end{align}