Taylor Expansion with Remainder and Norm


In Freund's Nonlinear Programming lecture notes (p. 26), there is an expression for directional derivatives of the form

$$ f(\bar{x}+\lambda d) = f(\bar{x}) + \lambda \nabla f(\bar{x})^T d + \lambda ||d|| \alpha(\bar{x},\lambda d) $$

where $\alpha(\bar{x},\lambda d) \to 0$ as $\lambda \to 0$. This is from the definition for $f(\cdot)$ at $\bar{x}$ being differentiable if there exists a vector $\nabla f(\bar{x})$ such that the above holds.
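One way to see what $\alpha$ is: solve the expansion for it, $\alpha(\bar{x},\lambda d) = \big(f(\bar{x}+\lambda d) - f(\bar{x}) - \lambda \nabla f(\bar{x})^T d\big)/(\lambda \|d\|)$, and check numerically that it vanishes as $\lambda \to 0$. A minimal sketch, with a smooth $f$, point, and direction of my own choosing (not from the notes):

```python
import numpy as np

# Sample smooth function and its gradient (illustrative choices, not from the notes).
def f(x):
    return np.exp(x[0]) + np.sin(x[1])

def grad_f(x):
    return np.array([np.exp(x[0]), np.cos(x[1])])

x_bar = np.array([0.3, -0.7])
d = np.array([2.0, 1.0])   # a fixed direction, not necessarily unit length

def alpha(lam):
    # Solve the first-order expansion for the remainder factor alpha.
    resid = f(x_bar + lam * d) - f(x_bar) - lam * grad_f(x_bar) @ d
    return resid / (lam * np.linalg.norm(d))

for lam in [1e-1, 1e-2, 1e-3, 1e-4]:
    print(f"lambda = {lam:.0e}   alpha = {alpha(lam):+.6e}")
```

Since this $f$ is twice differentiable, $\alpha$ shrinks roughly linearly in $\lambda$; differentiability alone only guarantees $\alpha \to 0$.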

Another definition uses the Hessian $H$:

$$ f(\bar{x}+\lambda d) = f(\bar{x}) + \lambda \nabla f(\bar{x})^T d + \frac{1}{2} \lambda^2 d^T H(\bar{x}) d + \lambda^2 ||d||^2 \alpha(\bar{x},\lambda d) $$

where, again, $\alpha(\bar{x},\lambda d) \to 0$ as $\lambda \to 0$.
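The same numerical check works at second order: solve the expansion for $\alpha(\bar{x},\lambda d) = \big(f(\bar{x}+\lambda d) - f(\bar{x}) - \lambda \nabla f(\bar{x})^T d - \tfrac{1}{2}\lambda^2 d^T H(\bar{x}) d\big)/(\lambda^2 \|d\|^2)$ and watch it vanish. A sketch with an illustrative $f$ of my own choosing:

```python
import numpy as np

# Sample smooth function with gradient and Hessian (illustrative choices).
def f(x):
    return np.exp(x[0]) + np.sin(x[1])

def grad_f(x):
    return np.array([np.exp(x[0]), np.cos(x[1])])

def hess_f(x):
    # Hessian is diagonal because f separates in x[0] and x[1].
    return np.diag([np.exp(x[0]), -np.sin(x[1])])

x_bar = np.array([0.3, -0.7])
d = np.array([2.0, 1.0])

def alpha2(lam):
    # Solve the second-order expansion for the remainder factor alpha.
    resid = (f(x_bar + lam * d) - f(x_bar)
             - lam * grad_f(x_bar) @ d
             - 0.5 * lam**2 * d @ hess_f(x_bar) @ d)
    return resid / (lam**2 * np.linalg.norm(d)**2)

for lam in [1e-1, 1e-2, 1e-3]:
    print(f"lambda = {lam:.0e}   alpha = {alpha2(lam):+.6e}")
```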

I understand these are merely Taylor expansions, one using first-order terms and the other second-order terms. But the remainder terms look odd; I've never seen this usage. Why the norm of $d$? Where does $\alpha$ come from? How can one derive these expressions?

Thanks,

Best Answer

As pointed out by Calvin, the $\|d\|$ comes from the fact that the directional derivative is taken in $\mathbb{R}^{n}$. Let me say some words here.

If $U$ is an open subset of $\mathbb{R}^{n}$ and $f: U \to \mathbb{R}$ has all $n$ of its partial derivatives continuous on $U$, we say $f$ is of class $C^{1}$ (continuity of the partials is what the argument below uses). Now assume $f$ is $C^{1}$ and take $U \subset \mathbb{R}^{2}$ for simplicity. Fix $\bar{x} = (\bar{x}_{1},\bar{x}_{2}) \in U$ and take $v = (h,k)$ such that $\bar{x}+v \in B \subset U$, where $B$ is an open ball centered at $\bar{x}$. Define the remainder $r(v) = r(h,k)$ by
$$r(v) = f(\bar{x}_{1}+h,\bar{x}_{2}+k)-f(\bar{x}_{1},\bar{x}_{2})-\frac{\partial f}{\partial x}(\bar{x})\,h-\frac{\partial f}{\partial y}(\bar{x})\,k,$$
where the partial derivatives are evaluated at $\bar{x}$. Adding and subtracting $f(\bar{x}_{1},\bar{x}_{2}+k)$, we can write
$$ r(v) = f(\bar{x}_{1}+h,\bar{x}_{2}+k)-f(\bar{x}_{1},\bar{x}_{2}+k)+f(\bar{x}_{1},\bar{x}_{2}+k)-f(\bar{x}_{1},\bar{x}_{2}) - \frac{\partial f}{\partial x}(\bar{x})\,h - \frac{\partial f}{\partial y}(\bar{x})\,k. $$
Applying the Mean Value Theorem for real functions of one variable to each of the two differences, there exist $\theta_{1},\theta_{2} \in (0,1)$ such that
$$r(v) = \frac{\partial f}{\partial x}(\bar{x}_{1}+\theta_{1}h,\bar{x}_{2}+k)\,h +\frac{\partial f}{\partial y}(\bar{x}_{1},\bar{x}_{2}+\theta_{2}k)\,k -\frac{\partial f}{\partial x}(\bar{x})\,h - \frac{\partial f}{\partial y}(\bar{x})\,k. $$
Thus
$$ \frac{r(v)}{\|v\|} = \bigg{[} \frac{\partial f}{\partial x}(\bar{x}_{1}+\theta_{1}h,\bar{x}_{2}+k)-\frac{\partial f}{\partial x}(\bar{x}_{1},\bar{x}_{2})\bigg{]}\frac{h}{\sqrt{h^{2}+k^{2}}} + \bigg{[}\frac{\partial f}{\partial y}(\bar{x}_{1},\bar{x}_{2}+\theta_{2}k)-\frac{\partial f}{\partial y}(\bar{x}_{1},\bar{x}_{2})\bigg{]}\frac{k}{\sqrt{h^{2}+k^{2}}}.$$
As $v \to 0$, continuity of the partial derivatives makes both bracketed terms go to zero, and since $|h|/\sqrt{h^{2}+k^{2}} \le 1$ and $|k|/\sqrt{h^{2}+k^{2}} \le 1$, we readily see that $r(v)/\|v\| \to 0$.

This is exactly the expression in the question: setting $v = \lambda d$ gives $\alpha(\bar{x},\lambda d) = r(\lambda d)/(\lambda\|d\|)$, which tends to $0$ as $\lambda \to 0$. The second-order expression arises the same way from the second-order Taylor remainder.
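The conclusion $r(v)/\|v\| \to 0$ can be checked numerically. A minimal sketch with a two-variable $f$ of my own choosing (function and base point are illustrative, not from the answer):

```python
import numpy as np

# Illustrative two-variable example: f(x, y) = x*y + cos(x)
def f(x, y):
    return x * y + np.cos(x)

def fx(x, y):   # partial derivative with respect to x
    return y - np.sin(x)

def fy(x, y):   # partial derivative with respect to y
    return x

x1, x2 = 0.5, 1.2   # the fixed point (x1_bar, x2_bar)

def r_over_norm(h, k):
    # r(v)/|v| for v = (h, k), with r defined as in the answer.
    r = f(x1 + h, x2 + k) - f(x1, x2) - fx(x1, x2) * h - fy(x1, x2) * k
    return r / np.hypot(h, k)

for t in [1e-1, 1e-2, 1e-3]:
    print(f"|v| scale {t:.0e}   r(v)/|v| = {r_over_norm(t, -t):+.6e}")
```

Shrinking $v$ along any fixed direction drives $r(v)/\|v\|$ to zero, which is the differentiability statement the question's first-order expansion encodes.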