Let $(H,\langle \cdot, \cdot \rangle)$ be a real pre-Hilbert space (a real vector space equipped with a scalar product). Show that the function
$$ f:\textbf{x}\in H \mapsto f(\textbf{x}) = \|\textbf{x}\| \in [0,+\infty) $$
is differentiable on $H \setminus \{0\}$ and that its differential is then given by
$$ Df(x,h) = \Big\langle \frac{x}{\|x\|},h \Big\rangle $$

My answer attempt:
\begin{align} Df(x,h) &= D\|(\textbf{x},\textbf{h})\|\\ &= D\|\sum x_i h_i\|\\ &= D\sqrt{(\sum x_i h_i)^2} \end{align}
where
- $f_1(x) = \sqrt{x} \rightarrow f_1'(x) = \frac{1}{2}(x)^{-\frac{1}{2}}$ (1)
- $f_2(x) = x^2 \rightarrow f_2'(x) =2x$ (2)
- $f_3(x) = \sum x_i h_i \rightarrow \nabla f_3(\textbf{x}) = \textbf{h}$ (3)
Then, to compute
$$ (f_1(f_2(f_3)))' $$
we can say
\begin{align*} f_{23} &= f_2(f_3) \rightarrow f_{23}' = f_2'(f_3)\cdot \nabla f_3\\ f_{23} &= \langle \textbf{x}, \textbf{h}\rangle^2 \rightarrow 2 \langle \textbf{x}, \textbf{h}\rangle \textbf{h} \end{align*}
hence
$$ (f_1(f_2(f_3)))' = f_1(f_{23})' = f_1'(f_{23})\cdot f_{23}' $$
and hence
\begin{align*} D\sqrt{(\sum x_i h_i)^2} &= \frac{1}{2}(\langle \textbf{x}, \textbf{h} \rangle^2)^{-\frac{1}{2}} \cdot 2 \langle \textbf{x}, \textbf{h} \rangle \textbf{h}\\ &= \frac{\langle \textbf{x}, \textbf{h} \rangle}{\sqrt{\langle \textbf{x}, \textbf{h} \rangle^2}} \textbf{h}\\ &= \frac{\langle \textbf{x}, \textbf{h} \rangle}{\|\langle \textbf{x}, \textbf{h} \rangle\|} \textbf{h} \quad \textbf{(4)}\\ &= \langle \frac{\textbf{x}}{\|\textbf{x}\|} \rangle \textbf{h} \quad \textbf{(5)} \end{align*}
Here are my questions:
- I treated the functions (1) and (2) as mappings from $\mathbb{R} \to \mathbb{R}$, unlike (3), which maps from $\mathbb{R}^n \to \mathbb{R}$. Is this correct?
- What is the missing step that allows one to go from (4) to (5)?
- Please also feel free to point out any imprecision in the notation.
There are several issues with what you have written. I think the biggest issue is that you haven't understood your own notation when it comes to $Df(x,h)$, because your very first equality reads $Df(x,h) = D\|(\textbf{x},\textbf{h})\|$.
I think the notation $Df(x,h)$ is pretty misleading. I prefer to use $Df_x(h)$ or $df_x(h)$. Here's how to "read" this notation: $f$ is a mapping from $H$ into $\Bbb{R}$. Now, if you fix a point $x \in H \setminus \{0\}$, then the differential of $f$ at the point $x$ is denoted by the symbol $df_x$. Note that by definition, $df_x$ itself is a bounded linear transformation from $H$ into $\Bbb{R}$, i.e. $df_x \in \mathcal{L}(H, \Bbb{R}) =: H^*$. Since $df_x: H \to \Bbb{R}$ is a linear transformation, it can "eat a vector in $H$", so if $h \in H$ then $df_x(h)$ means the linear transformation $df_x$ evaluated on the vector $h$. Finally, $df_x(h) \in \Bbb{R}$. There are a lot of things going on here, and you need to know what each object means and which space it lives in.
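To make the "which space does each object live in" point concrete, here is a minimal numerical sketch. It assumes $H = \Bbb{R}^2$ with the usual dot product (only a special case of the general pre-Hilbert setting), and the names `norm`, `df` and `finite_difference` are hypothetical helpers introduced just for this illustration. It treats $df_x$ as a function that is returned once $x$ is fixed and only afterwards evaluated on $h$, and it compares the claimed formula $\langle x/\|x\|, h\rangle$ with a difference quotient. This is a sanity check, not a proof.

```python
import math

def norm(x):
    """Euclidean norm on R^n, standing in for the norm of H (an assumption)."""
    return math.sqrt(sum(xi * xi for xi in x))

def df(x):
    """Differential of the norm at x != 0, returned as the linear map
    h -> <x/||x||, h>. This is the formula the exercise asks to prove."""
    n = norm(x)
    if n == 0:
        raise ValueError("the norm is not differentiable at 0")
    unit = [xi / n for xi in x]
    return lambda h: sum(ui * hi for ui, hi in zip(unit, h))

def finite_difference(x, h, t=1e-6):
    """Central difference quotient (f(x + t h) - f(x - t h)) / (2 t)."""
    plus = [xi + t * hi for xi, hi in zip(x, h)]
    minus = [xi - t * hi for xi, hi in zip(x, h)]
    return (norm(plus) - norm(minus)) / (2 * t)

x = [3.0, 4.0]
h = [1.0, 2.0]

df_x = df(x)     # lives in H^*: a linear map from H to R
value = df_x(h)  # lives in R: that linear map evaluated on h

print(value)                    # formula <x/||x||, h>
print(finite_difference(x, h))  # numerical approximation of the same number
```

Note the two-step structure: `df(x)` is the object $df_x \in H^*$, and `df(x)(h)` is the real number $df_x(h)$. Both printed values should be close to $11/5$, since $\langle x, h\rangle = 11$ and $\|x\| = 5$ here.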
What you've computed is something completely different from what was asked of you, hence your answer is wrong (see my first bullet point; your first equal sign was wrong, hence everything after it is wrong). The correct solution, however, is obtained by writing $f$ as a composition of the square root function and the map $x \mapsto \langle x, x \rangle$ built from the inner product, and then applying the chain rule (you also tried this, but you did it incorrectly). To see exactly how, define the following maps temporarily: