Check the differentiability of the given function.


Let $M_n(\mathbb{R})$ denote the space of all $n\times n$ real matrices, identified with the Euclidean space $\mathbb{R}^{n^2}$. Fix a column vector $x \neq 0$ in $\mathbb{R}^n$. Define $f : M_n(\mathbb{R}) \rightarrow \mathbb{R}$ by $f(A) = \langle A^2x,x \rangle$. Is $f$ differentiable?

When I took $ A= \left[ {\begin{array}{cc} x_1 & x_2 \\ x_3 & x_4\\ \end{array} } \right] $ and $x=\left[ {\begin{array}{c} a \\ b\\ \end{array} } \right]$, I got $f(A)$ as a polynomial in four variables. I know that a polynomial function is always differentiable. How do I prove it in the $n\times n$ case? And how do I prove that the given function is differentiable without expanding the inner product?
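For reference, the $2\times 2$ observation carries over to any $n$. Writing $A=(A_{ij})$ and $x=(x_1,\dots,x_n)$ (index notation introduced here for illustration), expanding the inner product entrywise gives

$$ f(A)=\langle A^2x,x \rangle=\sum_{i=1}^n\sum_{k=1}^n (A^2)_{ik}\,x_k\,x_i=\sum_{i=1}^n\sum_{j=1}^n\sum_{k=1}^n x_i\,A_{ij}\,A_{jk}\,x_k $$

so, with $x$ fixed, $f$ is a homogeneous polynomial of degree 2 in the $n^2$ entries $A_{ij}$.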


There are 2 best solutions below

7
On BEST ANSWER

The first step is to have the definition of differentiability in mind. From Wikipedia:

A function of several real variables $f: \mathbb{R}^m \to \mathbb{R}^n$ is said to be differentiable at a point $x_0$ if there exists a linear map $J: \mathbb{R}^m \to \mathbb{R}^n$ such that

$$ \lim_{h\to 0}\frac{\|f(x_0+h)-f(x_0)-Jh\|}{\|h\|}=0 $$


Now your application. Take any matrix $H$ and do the following computation:

\begin{align*} f(A+H)=&\langle (A+H)^2x,x \rangle \\ =&\langle (A^2+AH+HA+H^2)x,x \rangle \\ =&\langle A^2x,x \rangle+\langle (AH+HA)x,x \rangle+\langle H^2x,x \rangle \end{align*}

Now consider: \begin{align*} \ell=\lim_{H\to 0}\frac{|f(A+H)-f(A)-\langle (AH+HA)x,x \rangle|}{\|H\|} \end{align*}

From the first computation this is also equal to: \begin{align*} \ell=\lim_{H\to 0}\frac{|\langle H^2x,x \rangle|}{\|H\|} \end{align*}

However, since $$ \frac{|\langle H^2x,x \rangle|}{\|H\|}\le\frac{\|H^2x\|\|x\|}{\|H\|}\le\frac{\|H\|^2\|x\|^2}{\|H\|}=\|H\|\|x\|^2 $$ where we have used Cauchy-Schwarz and $\|H^2x\|\le\|H^2\|\|x\|\le\|H\|^2\|x\|$ (we assume a submultiplicative matrix norm consistent with the vector norm, such as the Frobenius norm or the operator norm; all norms on $M_n(\mathbb{R})$ are equivalent, so this choice does not affect differentiability),

it is clear that the limit $\ell$ is zero: \begin{align*} 0\le \ell=\lim_{H\to 0}\frac{|\langle H^2x,x \rangle|}{\|H\|}\le\lim_{H\to 0}\|H\|\|x\|^2=0 \end{align*}

This means that the linear map $$ H\mapsto df(A)\cdot H =\langle (AH+HA)x,x \rangle $$ is the differential of $f$ at the point $A$. Since $A$ was arbitrary, $f$ is differentiable everywhere.
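The computation above can be sanity-checked numerically. The following is a minimal sketch in plain Python, assuming the Frobenius norm on matrices and random test data; the helper names (`matmul`, `matvec`, `frobenius`, and so on) are introduced here for illustration only. It evaluates the remainder $f(A+H)-f(A)-\langle (AH+HA)x,x \rangle$ for smaller and smaller $H$ and divides it by $\|H\|$.

```python
import random

def matmul(P, Q):
    """Product of two square matrices given as lists of rows."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matvec(P, v):
    """Matrix-vector product."""
    return [sum(p * c for p, c in zip(row, v)) for row in P]

def dot(u, v):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))

def add(P, Q):
    """Entrywise sum of two matrices."""
    return [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def scale(c, P):
    """Scalar multiple of a matrix."""
    return [[c * p for p in row] for row in P]

def frobenius(P):
    """Frobenius norm (an assumed choice of matrix norm)."""
    return sum(p * p for row in P for p in row) ** 0.5

def f(A, x):
    """f(A) = <A^2 x, x>."""
    return dot(matvec(matmul(A, A), x), x)

def df(A, H, x):
    """Candidate differential: <(AH + HA) x, x>."""
    return dot(matvec(add(matmul(A, H), matmul(H, A)), x), x)

random.seed(0)
n = 3
A = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
H0 = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
x = [random.uniform(-1, 1) for _ in range(n)]

results = []  # (norm of H, |remainder| / norm of H)
for k in range(1, 6):
    H = scale(10.0 ** (-k), H0)
    remainder = f(add(A, H), x) - f(A, x) - df(A, H, x)
    ratio = abs(remainder) / frobenius(H)
    results.append((frobenius(H), ratio))
    print(f"||H|| = {frobenius(H):.1e}   |remainder| / ||H|| = {ratio:.3e}")
```

The printed ratio should shrink roughly in proportion to $\|H\|$, which is what the estimate via Cauchy-Schwarz predicts. Such a check is only evidence, not a proof; the proof is the inequality itself.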


Answer to comments. To find the differential, I proceeded by direct identification after algebraic manipulation of $f(A+H)$: $$ f(A+H)=\langle (A+H)^2x,x \rangle=\underbrace{\langle A^2x,x \rangle}_{f(A)}+\underbrace{\langle (AH+HA)x,x \rangle}_{df(A)\cdot H}+\underbrace{\langle H^2x,x \rangle}_{\text{remainder, } o(\|H\|) \text{ as } H\to 0} $$

This direct approach was possible because your example involved only matrix/vector products, scalar products etc.

Sometimes expressions are more complex. You must first find a candidate for the differential (you can use partial derivatives, the chain rule, etc.), then, in case of doubt, you must prove that the limit (the first equation of my post, from Wikipedia) is zero.

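A useful result is that continuity of the partial derivatives in a neighborhood of $A$ is sufficient for differentiability at $A$. It is not necessary: for example, $t\mapsto t^2\sin(1/t)$, extended by $0$ at $t=0$, is differentiable at $0$ although its derivative is not continuous there.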

Example: $f:\mathbb{R}^n\ni x\to\|x\|_2=\sqrt{\sum_i x_i^2}$.

A candidate for the differential at a point $a\in\mathbb{R}^n\setminus\{0\}$ is: $$ df(a)\cdot h=\sum_{i=1}^n \frac{\partial f}{\partial x_i}(a)h_i=\frac{1}{\|a\|_2}\sum_{i=1}^n a_ih_i $$ The partial derivatives $$ \frac{\partial f}{\partial x_i}(a)=\frac{a_i}{\|a\|_2} $$ are continuous on $\mathbb{R}^n\setminus\{0\}$, so the function $x\to\|x\|_2$ is differentiable at any point $a\in\mathbb{R}^n\setminus\{0\}$. However the point $a=0_{\mathbb{R}^n}$ is suspicious, because the formula above is not defined there.

Indeed, these functions have no continuous extension to $a=0_{\mathbb{R}^n}$. To see that, we approach $0_{\mathbb{R}^n}$ along a curve on which the function takes two different limiting values.

Define a curve $\gamma_i:t\in\mathbb{R}\to \gamma_i(t):=t \mathbf{e}_i$, and observe that $\gamma_i(0)=0_{\mathbb{R}^n}$.

Now observe that for $t\neq 0$: $$ \frac{\partial f}{\partial x_i}(\gamma_i(t))=\frac{t}{\sqrt{t^2}}=\left\{\begin{array}{rl}+1, & t>0 \\ -1, & t<0\end{array}\right. $$ so the partial derivative has no limit at $0$. By itself this does not settle the question, since a function can be differentiable at a point where its partial derivatives are discontinuous. But the same curve gives $f(\gamma_i(t))=|t|$, which is not differentiable at $t=0$, so $\frac{\partial f}{\partial x_i}(0)$ does not exist. A function that is differentiable at a point has all its partial derivatives there, hence $x\to\|x\|_2$ is not differentiable at $a=0$ (but it is differentiable at any other point of $\mathbb{R}^n$).
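The behaviour at the origin can also be observed numerically. The sketch below, in plain Python with hypothetical helper names `norm2` and `quotient`, computes the difference quotients $\big(f(t\mathbf{e}_i)-f(0)\big)/t$ of the Euclidean norm for small positive and negative $t$. If the partial derivative at $0$ existed, both sequences would approach the same number.

```python
import math

def norm2(v):
    """Euclidean norm of a vector given as a list."""
    return math.sqrt(sum(c * c for c in v))

def quotient(i, t, n=3):
    """Difference quotient (f(t e_i) - f(0)) / t for f = Euclidean norm."""
    point = [0.0] * n
    point[i] = t
    origin = [0.0] * n
    return (norm2(point) - norm2(origin)) / t

# Step sizes shrinking toward 0 from the right and from the left
steps = [10.0 ** (-k) for k in range(1, 8)]
right = [quotient(0, t) for t in steps]
left = [quotient(0, -t) for t in steps]

print("from the right:", right)
print("from the left: ", left)
```

The two one-sided sequences stay near $+1$ and $-1$ respectively, so they have no common limit, in line with $f(t\mathbf{e}_i)=|t|$.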

Concerning a good reference: I personally really like Henri Cartan's book Differential Calculus On Normed Spaces, a wonderful book for learning differential calculus. However, everything is done in Banach spaces, so I am not sure it is the right choice for a first book on the subject. At a lower level I have no suggestion for the moment, sorry.


One last thing: I also answered this question, which is quite similar to yours. The emphasis there is on the difference between the differential $df$ and the gradient $\nabla f$.
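For the function of this question the distinction can be made concrete. Assuming the Frobenius inner product $\langle H,G \rangle_F=\operatorname{tr}(G^TH)$ on $M_n(\mathbb{R})$ (a choice made here for illustration, it is not stated in the question), and using $\langle Hx,y \rangle=\langle H, y x^T \rangle_F$, the differential can be rewritten as

$$ df(A)\cdot H=\langle Hx,A^Tx \rangle+\langle H(Ax),x \rangle=\langle H,\; A^Txx^T+xx^TA^T \rangle_F $$

so with respect to this inner product the gradient is the matrix $\nabla f(A)=A^Txx^T+xx^TA^T$, while the differential $df(A)$ is the linear map $H\mapsto\langle (AH+HA)x,x \rangle$. For $n=1$ this reduces to the familiar $\frac{d}{dA}(A^2x^2)=2Ax^2$.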

0
On

Differentiability of a matrix function can be reduced to differentiability in the individual variables (column vectors) using the Kronecker product. See [here]. Since the given inner product corresponds to a certain vector-matrix multiplication, it is differentiable, because products of differentiable functions are differentiable. Another way to see differentiability is to observe that, for fixed $x$, the inner product $\langle A^2x,x \rangle$ is a homogeneous polynomial of degree 2 in the $n^2$ entries of $A$, and polynomials are differentiable. Yet another way is to prove it by induction, using the linearity and symmetry of the inner product, as in the case of the determinant or the trace. See here and here for additional links.