Derivation of Newtons Method from Taylor Series

346 Views Asked by At

I want to derive Newton’s method from a second order Taylor approximation of the objective function.

So starting from the Taylor expansion: $f ^ { k } ( x ) = f \left( x ^ { k } \right) + \nabla f \left( x ^ { k } \right) ^ { T } \left( x - x ^ { k } \right) + \frac { 1} { 2} \left( x - x ^ { k } \right) ^ { T } \nabla ^ { 2} f \left( x ^ { k } \right) \left( x - x ^ { k } \right)$

I am computing the derivative w.r.t. to x and set it to 0, which should give:

$ \nabla f \left( x ^ { k } \right) + \nabla ^ { 2} f \left( x ^ { k } \right) \left( x - x ^ { k } \right) = 0 $

The derivatives for the first to terms are clear to me, for the third term I know I can rewrite it in the following form:

$\frac { 1} { 2} \left( x - x ^ { k } \right) ^ { T } \nabla ^ { 2} f \left( x ^ { k } \right) \left( x - x ^ { k } \right) = \frac { 1} { 2} \nabla ^ { 2} f \left( x ^ { k } \right) \left( x - x ^ { k } \right)^2 $

Which also clearly gives me the right solution.

However starting from:

$\frac { 1} { 2} \left( x - x ^ { k } \right) ^ { T } \nabla ^ { 2} f \left( x ^ { k } \right) \left( x - x ^ { k } \right)$

Shouldn't I also be able to calculate the derivative using the following rule:

For the special case where A is a symmetric matrix and

$ \alpha = x^T A x $

where x is n×1, A is n×n, and A does not depend on x, then

$ \frac { \partial \alpha } { \partial x } = 2x ^ { T } A $

To my understanding this would give me:

$ \left( x - x ^ { k } \right)^T\nabla ^ { 2} f \left( x ^ { k } \right) $

Which clearly has dimension: $1 \times n$ instead of $n \times 1$ and therefore clearly seems not right to me.

So where is the error in my thinking?