Notations in the proof that $\forall x \in \mathbb{R}^d f(x+h) = f(x) + \langle\nabla f(x), h\rangle + o(\|\textbf{h}\|)$

22 Views Asked by At

Check that the gradient does fulfill the following fundamental property

$$ \forall x \in \mathbb{R}^d f(x+h) = f(x) + \langle\nabla f(x), h\rangle + o(\|\textbf{h}\|)^{*1} (\textbf{h}\in \mathbb{R}^d) $$ Let $x\in \mathbb{R}^d, h\in \mathbb{R}^d, h\ne O_d^{*2}$ \begin{align*} f(x+h) &= \frac{1}{2}(x+h)^T A(x+h) + b^T(x+h) \\ &=\underbrace{\frac{1}{2}x^T Ax + b^Tx}_{f(x)} + \underbrace{\frac{x^TAh}{2}+\frac{h^TAx}{2}}_{\frac{(x^T Ah)^T+h^TAx}{2}}+\underbrace{b^Th}_{h^Tb} + \frac{1}{2}h^TAh\\ &\text{and we can write: }\frac{x^TAh}{2}+\frac{h^TAx}{2} + h^Tb = \underbrace{h^T\big[(A^T + A)x + b\big]}_{\langle h, \nabla f(x) \rangle} \end{align*}

We need to prove that: $\frac{1}{2}h^T Ah = o(\|h\|) \Leftrightarrow \exists \epsilon(h) \xrightarrow[h\rightarrow 0]{} 0^{*3}$, $\|h\|\epsilon (h) = \frac{h^TAh}{2}$ $$ \frac{1}{2}h^TAh = \frac{\|h\|^2}{2}\big(\frac{h}{\|h\|}\big)^T A\big(\frac{h}{\|h\|}\big) =\underbrace{ \|h\|\big[\frac{\|h\|}{2}\big(\frac{h}{\|h\|}\big)^T A\big(\frac{h}{\|h\|}\big)\big]}_{\epsilon(h)} $$

and $\big(\frac{h}{\|h\|}\big)^T A\big(\frac{h}{\|h\|}\big)$ is bounded for all $h$ by $|||A|||^{*4} = \max_{x\in \mathbb{R}^d, \|x\|=1} x^T A x$ $$ \epsilon (h) = \|h\|\cdot\big|\frac{1}{2}\big(\frac{h}{\|h\|}\big)^T A \big(\frac{h}{\|h\|}\big)\big| \leq \|h\| \cdot |||A||| $$ Therefore $\frac{1}{2}h^TAh = \|h\|\cdot \epsilon(h) \xrightarrow[h\rightarrow 0]{} 0= o(\|h\|)$

$\forall x \in \mathbb{R}^d f(x+h) = f(x) + \langle\nabla f(x), h\rangle + o(\|\textbf{h}\|)$

I do the following notations mean (noted by $^{*n}$):

  • $o(\|h\|)$
  • $O_d$
  • $\epsilon(h) \xrightarrow[h\rightarrow 0]{} 0$
  • $|||A|||$