Having trouble understanding a proof after it applies the Hahn-Banach theorem.

I have been reading a proof of the convergence of Newton's method that has been fairly easy to follow, except for a single step that has totally mystified me because it suddenly depends on a lot more functional analysis than I know (this is my first exposure to the Hahn-Banach theorem).

Below is an excerpt of the part which is confusing me.


Given $\lVert F^{\prime}(y) - F^{\prime}(x)\rVert \leq L\lVert x-y\rVert,$

where $L$ is a fixed positive constant and $F :X\rightarrow X$ is a continuously differentiable function on a Banach space $X$, I would like to show that

$$\lVert F(y)-[F(x)+F^{\prime}(x)(y-x)]\rVert\leq \frac{L}{2}\lVert y-x\rVert^{2}.$$

The proof proceeds as follows. Define $$y(\theta):= x + \theta(y-x), \qquad R(\theta) := F(y(\theta)) - [F(x)+F^{\prime}(x)(y(\theta)-x)].$$

Then by the Hahn-Banach theorem, there is a $\xi \in X^{*}$ such that $\lVert \xi \rVert =1$ and $\xi(R(1)) = \lVert R(1)\rVert$. Define a function $h(\theta):= \xi(R(\theta))$ so that

$$\frac{dh}{d\theta} = \xi\left(F^{\prime}(y(\theta))-F^{\prime}(x)\right). \quad (1)$$

Then using the assumption:

\begin{equation} \frac{dh}{d\theta}(\theta) \leq L \lVert y(\theta)-x\rVert. \quad(2) \end{equation}


My question then is, why are equations (1) and (2) true? First, it seems like we should instead have

$$\frac{dh}{d\theta} = \xi^{\prime}\left(F^{\prime}(y(\theta))-F^{\prime}(x)\right)y^{\prime}(\theta).$$

Despite reading the Hahn-Banach theorem repeatedly, I do not understand why we actually get $$\frac{dh}{d\theta} = \xi \left(\frac{d}{d\theta}R(\theta)\right)$$ or what happened to the factor $y^{\prime}(\theta)$ from the chain rule. It looks as if $y^{\prime}(\theta)$ was simply omitted. I am sure linearity plays a role here, so that we do not need to compute $\xi^{\prime}$, but I have no idea why or how.

Furthermore, even if this were all true, why is

$$\frac{dh}{d\theta}(\theta) = \xi\left(F^{\prime}(y(\theta))-F^{\prime}(x)\right) \leq L \lVert y(\theta)-x\rVert?$$

I don't understand how $\lVert \xi \rVert =1$ helps me make this conclusion by using the hypothesis $$\lVert F^{\prime}(y) - F^{\prime}(x)\rVert \leq L\lVert x-y\rVert.$$

To me, the $\xi$ and the lack of a norm on the left-hand side get in the way.

Best answer

Note that by linearity of $\xi$,
$$\begin{aligned} \frac{h(\theta+\delta)-h(\theta)}{\delta}&=\frac{1}{\delta}\left(\xi(R(\theta+\delta))- \xi(R(\theta))\right)\\ &=\frac{\xi(R(\theta+\delta)-R(\theta))}{\delta}\\ &=\xi\left(\frac{R(\theta+\delta)-R(\theta)}{\delta}\right). \end{aligned}$$
Taking the limit as $\delta\to0$ and using the continuity of $\xi$, we get $$h'(\theta)=\xi(R'(\theta)).\tag{$1'$}$$ No derivative of $\xi$ appears, because $\xi$ is a bounded linear functional.

Now, $y'(\theta)=y-x$, and by the chain rule $$R'(\theta)=F'(y(\theta))y'(\theta)-F'(x)y'(\theta) =(F'(y(\theta))-F'(x))(y-x).$$ So $(1')$ becomes $$h'(\theta)=\xi((F'(y(\theta))-F'(x))(y-x)),\tag{$2'$}$$ which is the correct alternative to the OP's relation $(1)$.
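As a sanity check on $(2')$, here is a minimal numerical sketch in Python. Everything concrete in it is an assumption made for illustration and is not part of the original proof: the space is $\mathbb{R}^2$ with the Euclidean norm, the map `F`, its Jacobian `dF`, and the points `x`, `y` are arbitrary choices. In a Euclidean space the norming functional is explicit, $\xi(v)=\langle R(1),v\rangle/\Vert R(1)\Vert$, so Hahn-Banach is not really needed there; the sketch only compares a central finite difference of $h$ with the right-hand side of $(2')$.

```python
import numpy as np

# Hypothetical smooth map F: R^2 -> R^2 and its Jacobian (illustrative choice).
def F(x):
    return np.array([x[0] ** 2 + x[1], np.sin(x[0]) + x[1] ** 2])

def dF(x):
    return np.array([[2.0 * x[0], 1.0],
                     [np.cos(x[0]), 2.0 * x[1]]])

# Arbitrary sample points (assumption).
x = np.array([0.3, -0.7])
y = np.array([1.1, 0.4])

def y_of(theta):
    # y(theta) = x + theta (y - x)
    return x + theta * (y - x)

def R(theta):
    # R(theta) = F(y(theta)) - [F(x) + F'(x)(y(theta) - x)]
    return F(y_of(theta)) - (F(x) + dF(x) @ (y_of(theta) - x))

# Norming functional for R(1) in Euclidean space: xi(v) = <R(1), v> / ||R(1)||.
r1 = R(1.0)
xi_vec = r1 / np.linalg.norm(r1)

def xi(v):
    return float(xi_vec @ v)

def h(theta):
    return xi(R(theta))

theta, delta = 0.6, 1e-6
# Central finite-difference approximation of h'(theta).
fd = (h(theta + delta) - h(theta - delta)) / (2.0 * delta)
# Right-hand side of (2'): xi((F'(y(theta)) - F'(x))(y - x)).
formula = xi((dF(y_of(theta)) - dF(x)) @ (y - x))

print("finite difference:", fd)
print("formula (2'):     ", formula)
```

The two printed values should agree up to finite-difference error, which illustrates that the factor $y-x$ from the chain rule stays inside the argument of $\xi$.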

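Now, $\Vert\xi\Vert=1$ implies that $|\xi(v)|\le \Vert v\Vert$ for every $v\in X$, so $$|h'(\theta)|\le \Vert (F'(y(\theta))-F'(x))(y-x)\Vert \le\Vert F'(y(\theta))-F'(x)\Vert\cdot\Vert y-x\Vert.$$ Since $y(\theta)-x=\theta(y-x)$, the Lipschitz hypothesis gives $$|h'(\theta)|\le L \Vert y(\theta)-x\Vert\cdot\Vert y-x\Vert=L\theta \Vert y-x\Vert^2.$$ Finally, $h(1)=\xi(R(1))=\Vert R(1)\Vert$ by the choice of $\xi$, and $h(0)=\xi(R(0))=0$ because $R(0)=0$. Integrating, we get $$\Vert R(1)\Vert=|h(1)-h(0)|\le\int_0^1|h'(\theta)|\,d\theta\le L\Vert y-x\Vert^2\int_0^1\theta\,d\theta= \frac{L}{2}\Vert y-x\Vert^2,$$ which is the desired inequality, since $R(1)=F(y)-[F(x)+F'(x)(y-x)]$.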
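The final bound can also be checked numerically. The following Python sketch is only an illustration under assumptions that are not in the original: the space is $\mathbb{R}^2$ with the Euclidean norm, `F` is an arbitrary smooth map, and `L = sqrt(5)` is a (not necessarily sharp) Lipschitz constant for its Jacobian, obtained by bounding the operator norm of $F'(y)-F'(x)$ by its Frobenius norm. It samples random pairs and records the largest ratio of $\Vert F(y)-[F(x)+F'(x)(y-x)]\Vert$ to $\frac{L}{2}\Vert y-x\Vert^2$.

```python
import numpy as np

# Hypothetical smooth map F: R^2 -> R^2 and its Jacobian (illustrative choice).
def F(x):
    return np.array([x[0] ** 2 + x[1], np.sin(x[0]) + x[1] ** 2])

def dF(x):
    return np.array([[2.0 * x[0], 1.0],
                     [np.cos(x[0]), 2.0 * x[1]]])

# Lipschitz constant for dF (assumption derived for this particular F):
# ||dF(y) - dF(x)||_op <= ||dF(y) - dF(x)||_Frobenius <= sqrt(5) ||y - x||.
L = np.sqrt(5.0)

def remainder(x, y):
    # F(y) - [F(x) + F'(x)(y - x)], i.e. R(1) in the notation of the proof.
    return F(y) - (F(x) + dF(x) @ (y - x))

rng = np.random.default_rng(0)
worst_ratio = 0.0
for _ in range(1000):
    x = rng.uniform(-3.0, 3.0, size=2)
    y = rng.uniform(-3.0, 3.0, size=2)
    lhs = np.linalg.norm(remainder(x, y))
    rhs = 0.5 * L * np.linalg.norm(y - x) ** 2
    worst_ratio = max(worst_ratio, lhs / rhs)

# The inequality predicts that this ratio never exceeds 1.
print("largest observed ratio:", worst_ratio)
```

A ratio above 1 would signal either a wrong Lipschitz constant or a bug, so this kind of check is a cheap way to validate the constants before using the estimate in a Newton convergence argument.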