Gradient descent using Taylor series


I'm reading a book about Gradient methods right now, where the author is using a Taylor series to explain/derive an equation.

$$ \mathbf x_a = \mathbf x - \alpha \mathbf{ \nabla f } (\mathbf x ) $$
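(For concreteness, here is a minimal numerical sketch of that update rule. The objective $f(\mathbf x) = \|\mathbf x\|^2$, with gradient $2\mathbf x$, is a hypothetical example of my own, not from the book.)

```python
import numpy as np

def grad_f(x):
    # Gradient of the hypothetical example objective f(x) = ||x||^2.
    return 2.0 * x

def gradient_step(x, alpha):
    # One gradient-descent update: x_a = x - alpha * grad f(x).
    return x - alpha * grad_f(x)

x = np.array([1.0, -2.0])
print(gradient_step(x, alpha=0.1))  # [ 0.8 -1.6], a step toward the minimum at 0
```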

Now he says the first-order Taylor expansion around $\mathbf x$ looks like this:

$$ \mathbf{ f } (\mathbf x_a ) = \mathbf{ f } (\mathbf x ) + \mathbf{ \nabla f } (\mathbf x )'( \mathbf x_a - \mathbf x ) + \mathbf o ( || \mathbf x_a - \mathbf x || )$$

and then simplifies it to:

$$ \mathbf{ f } (\mathbf x_a ) = \mathbf{ f } (\mathbf x ) - \alpha || \mathbf{ \nabla f } (\mathbf x )||^2 + o (\alpha)$$

Now I don't get the $ o ( || \mathbf x_a - \mathbf x || ) $ part. As far as I know it's not part of the Taylor series. Furthermore, since I don't know what $ o(\cdot) $ means, I can't understand how he can simplify it to $ o ( \alpha ) $.

There are 2 answers below.

BEST ANSWER

$o(\cdot)$ is Landau (little-o) notation. Start from the expansion:

$$ f ( x_a ) = f ( x ) + { \nabla f } ( x )'( x_a - x ) + o ( || x_a - x || )$$

Plugging in the definition of $x_a$:

$$= f ( x ) + \nabla f ( x )'( x - \alpha \nabla f ( x ) - x ) + o ( || x - \alpha { \nabla f } ( x ) - x || ) $$

$$ = f ( x ) + \nabla f ( x )'( - \alpha \nabla f ( x ) ) + o ( ||- \alpha { \nabla f } ( x ) || ) \qquad \qquad \quad$$

Then, by the definition of $o(\cdot)$ and the absolute homogeneity of $||\cdot||$:

$$ = f ( x ) -\alpha \underbrace{ \nabla f ( x )' \nabla f ( x )}_{||\nabla f(x)||^2} + \underbrace{o ( \alpha || { \nabla f } ( x ) || )}_{o(\alpha)} \qquad \qquad \qquad \quad \: \: \:$$
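A quick numerical sanity check of this identity (using a hypothetical objective $f(x) = \|x\|^2$ of my choosing, not from the book): the remainder $f(x_a) - f(x) + \alpha \|\nabla f(x)\|^2$ should vanish faster than $\alpha$, so dividing it by $\alpha$ gives a ratio that shrinks to $0$.

```python
import numpy as np

def f(x):
    return np.dot(x, x)    # hypothetical example objective f(x) = ||x||^2

def grad_f(x):
    return 2.0 * x         # its gradient, 2x

x = np.array([1.0, -2.0])
g_sq = np.dot(grad_f(x), grad_f(x))   # ||grad f(x)||^2

ratios = []
for alpha in [1e-1, 1e-2, 1e-3]:
    x_a = x - alpha * grad_f(x)
    remainder = f(x_a) - (f(x) - alpha * g_sq)   # the o(alpha) term
    ratios.append(remainder / alpha)
    print(alpha, ratios[-1])   # ratio -> 0 as alpha -> 0
```

For this quadratic the remainder is exactly $\alpha^2 \|\nabla f(x)\|^2$, so the printed ratio decreases linearly with $\alpha$, as $o(\alpha)$ requires.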

SECOND ANSWER

The equation \begin{equation*} f(x_a) = f(x) + \nabla f(x)^T (x_a - x) + o(\|x_a - x \|) \end{equation*} is a short way of saying that if $r(x_a)$ is defined by the equation \begin{equation*} f(x_a) = f(x) + \nabla f(x)^T (x_a - x) + r(x_a), \end{equation*} then $r(x_a)$ is small even when compared with $\| x_a - x\|$.

More precisely, \begin{equation*} \lim_{x_a \to x} \frac{|r(x_a)|}{\|x_a - x\|} = 0. \end{equation*}
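This limit can also be observed numerically. Below, $f(x) = \sin(x_1) + x_2^2$ is an arbitrary smooth function chosen for illustration; the ratio $|r(x_a)| / \|x_a - x\|$ shrinks as $x_a$ approaches $x$ along a fixed direction:

```python
import numpy as np

def f(x):
    # An arbitrary smooth example function (an assumption for illustration).
    return np.sin(x[0]) + x[1] ** 2

def grad_f(x):
    return np.array([np.cos(x[0]), 2.0 * x[1]])

x = np.array([0.5, 1.0])
d = np.array([1.0, 1.0]) / np.sqrt(2.0)   # a fixed unit direction

ratios = []
for t in [1e-1, 1e-2, 1e-3, 1e-4]:
    x_a = x + t * d
    r = f(x_a) - f(x) - np.dot(grad_f(x), x_a - x)   # Taylor remainder r(x_a)
    ratios.append(abs(r) / np.linalg.norm(x_a - x))
    print(t, ratios[-1])   # |r(x_a)| / ||x_a - x|| -> 0
```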