I want to derive Newton’s method from a second order Taylor approximation of the objective function.
So starting from the Taylor expansion: $f ^ { k } ( x ) = f \left( x ^ { k } \right) + \nabla f \left( x ^ { k } \right) ^ { T } \left( x - x ^ { k } \right) + \frac { 1} { 2} \left( x - x ^ { k } \right) ^ { T } \nabla ^ { 2} f \left( x ^ { k } \right) \left( x - x ^ { k } \right)$
I am computing the derivative w.r.t. to x and set it to 0, which should give:
$ \nabla f \left( x ^ { k } \right) + \nabla ^ { 2} f \left( x ^ { k } \right) \left( x - x ^ { k } \right) = 0 $
The derivatives for the first to terms are clear to me, for the third term I know I can rewrite it in the following form:
$\frac { 1} { 2} \left( x - x ^ { k } \right) ^ { T } \nabla ^ { 2} f \left( x ^ { k } \right) \left( x - x ^ { k } \right) = \frac { 1} { 2} \nabla ^ { 2} f \left( x ^ { k } \right) \left( x - x ^ { k } \right)^2 $
Which also clearly gives me the right solution.
However starting from:
$\frac { 1} { 2} \left( x - x ^ { k } \right) ^ { T } \nabla ^ { 2} f \left( x ^ { k } \right) \left( x - x ^ { k } \right)$
Shouldn't I also be able to calculate the derivative using the following rule:
For the special case where A is a symmetric matrix and
$ \alpha = x^T A x $
where x is n×1, A is n×n, and A does not depend on x, then
$ \frac { \partial \alpha } { \partial x } = 2x ^ { T } A $
To my understanding this would give me:
$ \left( x - x ^ { k } \right)^T\nabla ^ { 2} f \left( x ^ { k } \right) $
Which clearly has dimension: $1 \times n$ instead of $n \times 1$ and therefore clearly seems not right to me.
So where is the error in my thinking?