Derivative of Newton iteration in Banach spaces

I'm studying Newton's method on Banach spaces and am having trouble understanding the derivative of the Newton iteration.

Given two Banach spaces $\mathbb{E}, \mathbb{F}$ and a smooth map $f:\mathbb{E} \to \mathbb{F}$, the Newton iteration of $f$ is the map $N_f:\mathbb{E}\backslash\mathcal{S} \to \mathbb{E}$ given by $$N_f(x) = x - (Df(x))^{-1}f(x),$$ where $\mathcal{S}$ is the set of points $x \in \mathbb{E}$ at which $Df(x)$ is not invertible.

I know that if $x \in \mathbb{E}\backslash\mathcal{S}$ is a zero of $f$, then $DN_f(x) = 0$. I don't have a proof of this fact: the authors of my book only show it for complex functions and say the Banach space case is the same. I tried to prove it myself, and it doesn't look like the same thing. Here are my thoughts:

I can define the map $D_f^{-1}:\mathbb{E}\backslash\mathcal{S} \to \mathcal{L}(\mathbb{F},\mathbb{E})$ such that $D_f^{-1}(x) = (Df(x))^{-1}$. Doing this I can informally write $$N_f = I - D_f^{-1}\cdot f,$$ where $I$ stands for the identity map. If I just follow what they say and treat $N_f$ as an ordinary complex function, then $$DN_f = I - (-1)D_f^{-2}\cdot D(D_f^{-1})\cdot f - D_f^{-1}\cdot D_f = $$ $$ = I + D_f^{-2}\cdot D(D_f^{-1})\cdot f - I = D_f^{-2}\cdot D(D_f^{-1})\cdot f.$$ Note that I used the chain rule and the product rule in the usual way, which supposedly I'm allowed to do. Now we substitute to get $DN_f(x) = D_f^{-2}(x)\cdot D(D_f^{-1})(x)f(x) = D_f^{-2}(x)\cdot D(D_f^{-1})(x)\cdot 0 = 0$, as claimed.

I have two questions:

1) Are my calculations correct (do they make sense)?

2) What are $D_f^{-2}$ and $D(D_f^{-1})$? Even if my calculations are OK, I don't know what these maps are or how to interpret them.

Thanks.

Best answer

First, as you rightly suspect, the derivative of the inverse is not what you wrote (see the second item below). To solve your problem you need the following two facts together:

  1. Given a mapping $A:\mathbb{B} \to L(\mathbb{E},\mathbb{F})$ and a mapping $f:\mathbb{B} \to \mathbb{E}$, the derivative of the mapping $\mathrm{ev}_{A,f}: \mathbb{B} \to \mathbb{F}$ given by $\mathrm{ev}_{A,f}(v)=A(v)\cdot f(v)$
  2. The derivative of the inverse mapping on $\mathrm{Aut}(\mathbb{E})$.

The second item is here, but let me state the main result for quick reference: the derivative of $X \mapsto X^{-1}$ at $X$ is not $-X^{-2}(\cdot)$, but $-X^{-1} (\cdot) X^{-1}$ instead. The two coincide only when $X$ commutes with the increment, as it always does for complex numbers.
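As a sanity check in finite dimensions, the following sketch compares a central finite difference of matrix inversion with both candidate formulas. The matrices `X` and `H`, the dimension, and the step size `t` are arbitrary choices made for illustration, not part of the original argument.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
# An invertible matrix close to the identity and an arbitrary direction H
# (both are illustrative assumptions).
X = np.eye(n) + 0.1 * rng.standard_normal((n, n))
H = rng.standard_normal((n, n))
X_inv = np.linalg.inv(X)

t = 1e-6
# Central finite difference of X -> X^{-1} in the direction H.
fd = (np.linalg.inv(X + t * H) - np.linalg.inv(X - t * H)) / (2 * t)

claimed = -X_inv @ H @ X_inv   # -X^{-1} H X^{-1}
naive = -X_inv @ X_inv @ H     # -X^{-2} H, the formula valid for scalars

err_claimed = np.linalg.norm(fd - claimed)
err_naive = np.linalg.norm(fd - naive)
print(err_claimed, err_naive)
```

The first error should be at the level of finite-difference noise, while the second stays visibly away from zero because `H` and `X_inv` do not commute.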

For the first item, we have $$\mathrm{ev}_{A,f}(v+h)=A(v+h)\cdot f(v+h)=(A(v)+A'_v \cdot h+\epsilon_1(h))\cdot(f(v)+f'_v \cdot h+\epsilon_2(h)),$$ where $\epsilon_1(h)$ and $\epsilon_2(h)$ are $o(\|h\|)$. After discarding every term that is $o(\|h\|)$, we get $(\mathrm{ev}_{A,f})_v'(h)=(A'_v \cdot h)\cdot f(v)+A(v)\cdot (f'_v \cdot h)$.
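To spell out the bookkeeping in this product rule, here is the full expansion, assuming both remainders $\epsilon_1(h)$ and $\epsilon_2(h)$ are $o(\|h\|)$; the name $R(h)$ for the collected remainder is introduced here only for illustration: $$A(v+h)\cdot f(v+h) = A(v)\cdot f(v) + (A'_v \cdot h)\cdot f(v) + A(v)\cdot(f'_v \cdot h) + R(h),$$ $$R(h) = (A'_v \cdot h)\cdot(f'_v \cdot h) + \epsilon_1(h)\cdot f(v+h) + \big(A(v)+A'_v \cdot h\big)\cdot\epsilon_2(h).$$ Each piece of $R(h)$ is $o(\|h\|)$: the first is bounded by $\|A'_v\|\,\|f'_v\|\,\|h\|^2$, and the other two are a bounded factor times $\epsilon_1(h)$ or $\epsilon_2(h)$. The part that is linear in $h$ is therefore the derivative.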

Putting these together with $A = D_f^{-1}$, the derivative of $D_f^{-1} \cdot f$ at $x$ is $$h \mapsto \Big(-D_f^{-1}(x)\,\big(D^{(2)}_f(x)\cdot h\big)\,D_f^{-1}(x)\Big) \cdot f(x)+D_f^{-1}(x)\cdot\big(Df(x) \cdot h\big),$$ where $D^{(2)}$ is the second derivative, not $D\circ D$. Since $f(x)=0$, the first term vanishes, and the second term is $D_f^{-1}(x)\cdot Df(x)\cdot h = h$, the identity mapping. This cancels the identity coming from the $x$ term of $N_f$, so $DN_f(x) = 0$.

EDIT: Actually, you don't even need the derivative of the inverse for your purposes: the formula for $(\mathrm{ev}_{A,f})'_v$ already shows that, when $f(v)=0$, the term containing the derivative of the inverse vanishes, whatever that derivative is. However, the explicit computation may still be useful as an example.
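For a concrete finite-dimensional illustration, the sketch below estimates the Jacobian of the Newton map by central finite differences, once at a zero of $f$ and once at a point that is not a zero. The map `f`, the two test points, and the helper `jacobian_fd` are hypothetical choices made for this example, not taken from the question.

```python
import numpy as np

def f(x):
    # Hypothetical smooth map R^2 -> R^2 with a zero at (1, 1).
    return np.array([x[0] ** 2 + x[1] ** 2 - 2.0, x[0] - x[1]])

def Df(x):
    # Jacobian of f at x.
    return np.array([[2.0 * x[0], 2.0 * x[1]], [1.0, -1.0]])

def newton_map(x):
    # N_f(x) = x - Df(x)^{-1} f(x), computed with a linear solve.
    return x - np.linalg.solve(Df(x), f(x))

def jacobian_fd(g, x, t=1e-6):
    # Central finite-difference Jacobian of g at x, built column by column.
    n = x.size
    J = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = t
        J[:, j] = (g(x + e) - g(x - e)) / (2 * t)
    return J

x_star = np.array([1.0, 1.0])    # f(x_star) = 0 and Df(x_star) is invertible
x_other = np.array([2.0, 0.5])   # Df is invertible here, but f is not zero

J_star = jacobian_fd(newton_map, x_star)
J_other = jacobian_fd(newton_map, x_other)
print(np.linalg.norm(J_star), np.linalg.norm(J_other))
```

At `x_star` the estimated Jacobian should vanish up to finite-difference noise, while at `x_other` the term containing $f(x)$ survives and the Jacobian is clearly nonzero.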