Special case of chain rule


Suppose $H$ is a Hilbert space, $I:H \to \mathbb{R}$ is a functional and $\eta:\mathbb{R} \to H$, $t \mapsto \eta_t$, is a curve in $H$.

I want to understand why

\begin{equation*} \frac{d}{d t} I\left(\eta_t\right)=\left(I^\prime\left[\eta_t\right],\frac{d}{d t} (\eta_t)\right), \end{equation*}

where $(\cdot,\cdot)$ is the inner product of $H$. I know that the formula is true when $H$ is a finite-dimensional space (for example, $\mathbb{R}^n$), for then you can write coordinates $x_1,\dots, x_n$ and with the chain rule the inner product naturally appears. My question is how to show that this formula is true when $H$ is an arbitrary Hilbert space.

On BEST ANSWER

To sum up my comments above: the answer to the question "Frechet derivative chain rule" should provide sufficient explanation of Fréchet derivatives and the chain rule for our context, so I'll just extract the relevant information.
Given two normed spaces $X$ and $Y$ (usually Banach spaces, but they don't have to be), the Fréchet derivative of a function $f : X \to Y$ at a point $x$, which I'll denote by $df(x)$, is defined (when it exists) as the unique bounded linear operator $df(x) : X \to Y$ such that: $$f(x+h)=f(x)+df(x)(h)+\underset{\|h\| \to 0}{o}(\|h\|)$$ One can then prove the following theorem, called the chain rule:

Let $X$, $Y$ and $Z$ be three normed spaces, $a \in X$, $U$ a neighbourhood of $a$ in $X$ and $V$ a neighbourhood of $f(a)$ in $Y$ with $f(U) \subset V$, and finally let $f: U \subset X \to Y$ and $g : V \to Z$ be two functions which are Fréchet-differentiable in neighbourhoods of $a$ and $f(a)$ respectively.
Then $g \circ f$ is Fréchet-differentiable in a neighbourhood of $a$ in $X$, and: $$d(g \circ f)(a) = dg(f(a)) \circ df(a)$$

If we let $U = X = Z := \mathbb{R}$ and $V = Y := H$, and $f := \eta$ and $g := I$, then we are in this exact situation, meaning that we have: $$\forall t \in \mathbb{R},\quad d(I \circ \eta)(t) = dI(\eta(t)) \circ d\eta(t)$$ Now, $\eta$ is a function defined on $\mathbb{R}$, and so $d\eta(t) : \mathbb{R} \to H$ will always be of the form $h \mapsto h w$ for some unique vector $w \in H$, thus it is customary to define $\eta'(t) := w$.
Similarly, $d(I \circ \eta)(t) : \mathbb{R} \to \mathbb{R}$ is of the form $h \mapsto (I \circ \eta)'(t) \cdot h$, where this time $(I \circ \eta)'(t)$ is the standard derivative of a real function and $\cdot$ is the usual multiplication of real numbers.
Moreover, $dI(x) : H \to \mathbb{R}$ is a bounded linear functional, thus by the Riesz representation theorem there exists a unique vector $v \in H$ such that: $$\forall u \in H,\quad dI(x)(u) = (v, u)$$ I'm assuming that $v$ is thus the vector you called $I'(x)$.

Putting these notations in the chain rule and evaluating at $h \in \mathbb{R}$, we obtain: $$\begin{split} \forall t \in \mathbb{R}, \forall h \in \mathbb{R},\quad (I \circ \eta)'(t) \cdot h &= d(I \circ \eta)(t)(h)\\ &= dI(\eta(t))(d\eta(t)(h))\\ &= dI(\eta(t))(\eta'(t) h)\\ &= \Big(I'(\eta(t)), \eta'(t) h\Big)\\ &= \Big(I'(\eta(t)), \eta'(t) \Big) \cdot h \end{split}$$ Hence, taking $h = 1$: $$(I \circ \eta)'(t) = \Big(I'(\eta(t)), \eta'(t) \Big)$$ as desired.