I'm reading a book about Gradient methods right now, where the author is using a Taylor series to explain/derive an equation.
$$ \mathbf x_a = \mathbf x - \alpha \mathbf{ \nabla f } (\mathbf x ) $$
Now he says the first order expansion of the Taylor series around x would look like this:
$$ \mathbf{ f } (\mathbf x_a ) = \mathbf{ f } (\mathbf x ) + \mathbf{ \nabla f } (\mathbf x )'( \mathbf x_a - \mathbf x ) + \mathbf o ( || \mathbf x_a - \mathbf x || )$$
and the simplifies it to:
$$ \mathbf{ f } (\mathbf x_a ) = \mathbf{ f } (\mathbf x ) + \alpha || \mathbf{ \nabla f } (\mathbf x )||^2 + \mathbf o (\alpha)$$
Now I don't get this part $ \mathbf o ( || \mathbf x_a - \mathbf x || ) $. As far as I know it's not part of the Taylor Series. Furthermore since I don't know what $ \mathbf o () $ is I can't understand how he can simplify it to $ \mathbf o ( \alpha ) $.
$o()$ refers to the Landau notation.
$$ f ( x_\alpha ) = f ( x ) + { \nabla f } ( x )'( x_a - x ) + o ( || x_a - x || )$$
Plugin the definition of $x_a$
$$= f ( x ) + \nabla f ( x )'( x - \alpha \nabla f ( x ) - x ) + o ( || x - \alpha { \nabla f } ( x ) - x || ) $$
$$ = f ( x ) + \nabla f ( x )'( - \alpha \nabla f ( x ) ) + o ( ||- \alpha { \nabla f } ( x ) || ) \qquad \qquad \quad$$
And by definition of $o$ and $||\cdot||$
$$ = f ( x ) -\alpha \underbrace{ \nabla f ( x )' \nabla f ( x )}_{||\nabla f(x)||^2} + \underbrace{o ( \alpha || { \nabla f } ( x ) || )}_{o(\alpha)} \qquad \qquad \qquad \quad \: \: \:$$