Deriving the Taylor expansion $f(x+p) = f(x) + \nabla f(x+tp)^Tp$


I'm trying to derive the Taylor formula:

$$f(x+p) = f(x) + \nabla f(x+tp)^Tp$$

For that, I think I just need to use the one-variable Taylor expansion and proceed as in https://math.stackexchange.com/a/222217/166180. That answer more or less explains the formula I need, but for an infinite expansion. I need a finite expansion that guarantees there is a $t$ for which the expansion is exact.

I couldn't find a specific Taylor theorem like this, so I'm trying to derive this formula, and I think it has to do with the mean value theorem:

$$f'(c) = \frac{f(x)-f(a)}{x-a}$$

for some $c\in(a,x)$

so $f'(c)x-f'(c)a = f(x)-f(a) \implies f(x) = f(a) + f'(c)x -f'(c)a$

It kinda looks like something I want.

UPDATE:

Take $a=0$ to get

$f(x) = f(0) + f'(c)x$

Rename $f$ to $\Phi$ and $x$ to $t$ to get:

$$\Phi(t) = \Phi(0) + \Phi'(c)t \tag{1}$$

for some $c\in(0,t)$

If we take $\Phi(t) = \phi(x + pt)$, then

$$\Phi'(c) = \lim_{a\to c}\frac{\phi(x+pc)-\phi(x+pa)}{c-a} = \ ?$$

I need $\Phi'(c)$ to finish $(1)$ by writing everything in terms of $\phi$ and hopefully achieve the formula with the gradient.

PS: how would I achieve the second order expansion

$$f(x+p) = f(x) + \nabla f(x)^Tp + \frac{1}{2}p^T\nabla^2f(x+tp)p$$

?

I can't think of a second order version of the mean value theorem.


There are 2 best solutions below

Best answer

As long as your function $f$ is a real-valued function of a vector variable, you can apply your favorite remainder form of Taylor's theorem from calculus 101 to the auxiliary function $$\phi(t):=f\bigl({\bf x}+t{\bf p}\bigr)\ .$$ E.g., if all the necessary partial derivatives of $f$ are continuous, you have $$f\bigl({\bf x}+{\bf p}\bigr)=\phi(1)=\sum_{j=0}^r {\phi^{(j)}(0)\over j!}+{\phi^{(r+1)}(\tau)\over(r+1)!}$$ for some $\tau\in\>]0,1[\>$. Now express the derivatives of $\phi$ in terms of the partial derivatives of $f$ by applying the chain rule repeatedly. In the case $r=0$ you obtain $$f\bigl({\bf x}+{\bf p}\bigr)=f({\bf x})+\nabla f({\bf x}+\tau{\bf p})\cdot{\bf p}\ ,$$ and when $r=1$ you have $$f\bigl({\bf x}+{\bf p}\bigr)=f({\bf x})+\nabla f({\bf x})\cdot{\bf p}+{1\over2}\sum_{i,k=1}^n f_{.ik}({\bf x}+\tau{\bf p})\> p_ip_k\ .$$ Here the second partials $f_{.ik}({\bf x}+\tau{\bf p}):={\partial^2 f\over\partial x_i\,\partial x_k}({\bf x}+\tau{\bf p})$ arise from the chain rule when you compute $\phi''(\tau)$.
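For reference, the chain-rule step can be written out as follows. This is only a sketch: the notation $f_{.i}$ for the first partial derivative $\partial f/\partial x_i$ is an assumption made here by analogy with $f_{.ik}$, and $f$ is assumed to be $C^2$ near the segment from ${\bf x}$ to ${\bf x}+{\bf p}$.

$$\phi'(t)=\sum_{i=1}^n f_{.i}({\bf x}+t{\bf p})\,p_i=\nabla f({\bf x}+t{\bf p})\cdot{\bf p}\ ,\qquad \phi''(t)=\sum_{i,k=1}^n f_{.ik}({\bf x}+t{\bf p})\,p_ip_k\ .$$

Setting $t=0$ in $\phi'$ gives the term $\nabla f({\bf x})\cdot{\bf p}$, and setting $t=\tau$ in $\phi'$ (case $r=0$) or in $\phi''$ (case $r=1$) gives the corresponding remainder term.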

Another answer

Let's write $X(t) = x + tp$, with $t\in [0,1]$, and $\Phi(t) = f(x+tp)$. Then your first question is answered by a simple application of the chain rule, $$\Phi'(t) = \frac d{dt}(f(X(t))) = \nabla f(X(t)) \cdot \frac d{dt}X = \nabla f(x+tp)\cdot p$$ so that the mean value theorem, $\Phi(1) = \Phi(0) + \Phi'(t)$ for some $t\in(0,1)$, can be rewritten as: $$f(x+p) = f(x) + \nabla f(x+tp)\cdot p.$$
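As a sanity check, the first-order formula can be verified numerically. The sketch below is only an illustration: the test function $f(x_0,x_1)=e^{x_0}+x_1^2$, the point `x`, the step `p`, and the helper names `dphi` and `find_t` are all made up for this example. Bisection is used to locate $t$, which works here because this $f$ is convex, so $\Phi'$ is increasing along the segment; for a general $f$ the mean value theorem still guarantees some $t$, but bisection on $[0,1]$ need not find it.

```python
import math

# Hypothetical test function (not from the question): f(x0, x1) = exp(x0) + x1^2
def f(x):
    return math.exp(x[0]) + x[1] ** 2

# Its gradient, computed by hand
def grad_f(x):
    return [math.exp(x[0]), 2.0 * x[1]]

# Phi'(t) = grad f(x + t p) . p, by the chain rule
def dphi(t, x, p):
    point = [xi + t * pi for xi, pi in zip(x, p)]
    return sum(g * pi for g, pi in zip(grad_f(point), p))

# Bisection for t with Phi'(t) = f(x + p) - f(x).
# Assumes Phi' is increasing on [0, 1] (true here because f is convex).
def find_t(x, p, tol=1e-12):
    target = f([xi + pi for xi, pi in zip(x, p)]) - f(x)
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dphi(mid, x, p) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x = [0.0, 1.0]
p = [1.0, -2.0]
t = find_t(x, p)
lhs = f([xi + pi for xi, pi in zip(x, p)])  # f(x + p)
rhs = f(x) + dphi(t, x, p)                  # f(x) + grad f(x + t p) . p
print(t, lhs, rhs)
```

The two printed values `lhs` and `rhs` should agree up to rounding, with `t` strictly between 0 and 1.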

For the second order expansion, we want to use the following "second order MVT" for real-valued functions: if $\Phi$ is $C^2$, then there exists a $t\in [0,1]$ such that $$\Phi''(y+th) = \frac{\Phi(y+h)-\Phi(y) - \Phi'(y)h}{h^2/2}$$ (Maybe you recognize this as the 1D first order Taylor expansion with remainder in Lagrange form.) With $y=0$ and $h=1$, this readily gives $$ f(x+p) = f(x) + \nabla f(x) \cdot p + \frac12 \nabla^2 f(x+tp)(p,p)$$ since $\Phi''(t) = (\frac d{dt} \nabla f(x+tp) )\cdot p = \nabla^2 f(x+tp)(p,p)$. Here, $\nabla^2 f(x+tp)(\cdot,\cdot)$ is the bilinear form which, written as a matrix product using the Hessian matrix $Hf(x+tp)$, would be $$\nabla^2 f(x+tp)(u,v) = u^T Hf(x+tp) v.$$
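The second-order formula can be checked numerically in the same spirit. Again, this is a sketch under assumptions: the test function $f(x_0,x_1)=e^{x_0}+x_1^2$, the values of `x` and `p`, and the helper names `ddphi` and `find_t` are hypothetical choices for illustration. Bisection works here because $\Phi''(t)=p^T Hf(x+tp)\,p$ happens to be increasing in $t$ for this $f$; that is not true in general.

```python
import math

# Hypothetical test function (not from the question): f(x0, x1) = exp(x0) + x1^2
def f(x):
    return math.exp(x[0]) + x[1] ** 2

def grad_f(x):
    return [math.exp(x[0]), 2.0 * x[1]]

# Hessian of f, computed by hand
def hess_f(x):
    return [[math.exp(x[0]), 0.0],
            [0.0, 2.0]]

# Phi''(t) = p^T H(x + t p) p
def ddphi(t, x, p):
    point = [xi + t * pi for xi, pi in zip(x, p)]
    H = hess_f(point)
    n = len(p)
    return sum(p[i] * H[i][k] * p[k] for i in range(n) for k in range(n))

# Bisection for t with (1/2) Phi''(t) = f(x + p) - f(x) - grad f(x) . p.
# Assumes Phi'' is increasing on [0, 1] (true for this particular f).
def find_t(x, p, tol=1e-12):
    first_order = f(x) + sum(g * pi for g, pi in zip(grad_f(x), p))
    target = 2.0 * (f([xi + pi for xi, pi in zip(x, p)]) - first_order)
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ddphi(mid, x, p) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x = [0.0, 1.0]
p = [1.0, -2.0]
t = find_t(x, p)
lhs = f([xi + pi for xi, pi in zip(x, p)])  # f(x + p)
rhs = (f(x)
       + sum(g * pi for g, pi in zip(grad_f(x), p))
       + 0.5 * ddphi(t, x, p))              # second-order formula at t
print(t, lhs, rhs)
```

For this example $\Phi''(t)=e^t+8$, so the value of `t` can also be found by hand from $e^t = 2e-4$, which gives a quick cross-check on the numerical result.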