Trouble proving a matrix identity


Let $\;f:\mathbb R^n \rightarrow \mathbb R^m\;$ and $\;W:\mathbb R^m \rightarrow \mathbb R_{+}\;$ be two functions.

We'll denote $\;f_{x_i}=(f^1_{x_i},\dots,f^m_{x_i})\;$ for all $\;1\le i\le n\;$, where $\;f^j_{x_i}=\frac{\partial f^j}{\partial x_i}\;$ for all $\;1\le i\le n,\;1\le j\le m\;$.

Furthermore, define the $\;n \times n\;$ matrix $\;A=(a_{ij})_{1\le i,j\le n}\;$ by $\;a_{ij}=f_{x_i}\cdot f_{x_j} -{\delta}_{ij}\left(\frac {1}{2} {\vert \nabla f \vert}^2+W(f)\right)\;$, where $\;\cdot \;$ denotes the Euclidean inner product and $\;\vert \cdot \vert \;$ the Euclidean (Frobenius) norm of the matrix $\nabla f$.

Prove that $\;A+\left(\frac {1}{2} {\vert \nabla f \vert}^2+W(f)\right)I=(\nabla f)^T(\nabla f)$.

I'm pretty sure this "exercise" is quite easy, but I'm missing something. Clearly it suffices to show that the entries of $\;(\nabla f)^T(\nabla f)\;$ are exactly the dot products $\;f_{x_i}\cdot f_{x_j}\;$, but I have no clue how to proceed.

With the above notation, all I can see is $\;(\nabla f)^T(\nabla f)=\begin{pmatrix} f_{x_1}\\ \vdots\\ f_{x_n} \end{pmatrix} ({f_{x_1}}^T, \dots, {f_{x_n}}^T)\;$

Does this somehow connect to $\;f_{x_i}\cdot f_{x_j}\;$?

Maybe what I'm asking is trivial, but I don't have much experience with matrices, so I apologize in advance!

Any help would be valuable. Hints are also welcome!

Thanks

On BEST ANSWER

The distinction between whether the gradient is a row or a column vector, as well as the shape of the Jacobian, is sometimes confusing.

However, it's usually immaterial. Note also that $ u^Tv=u\cdot v $ for column vectors $u,v$.

Let $W:\mathbb R^m \rightarrow \mathbb R_{+}$, $ f:\mathbb{R}^n→\mathbb{R}^m $, $A\in\mathbb{R}^{n\times n}$. I'll write $ f_{x_i}=\partial_i f=(\partial_i f_1 ,\ldots, \partial_i f_m) $.

I would write the "gradient" here as a row vector of column vectors: $$ \nabla f = [\partial_1 f,\ldots, \partial_n f] = \begin{bmatrix} \partial_1 f_1 & \cdots & \partial_n f_1\\ \vdots & \ddots & \vdots \\ \partial_1 f_m & \cdots & \partial_n f _m \end{bmatrix} \in\mathbb{R}^{m\times n} $$
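For a concrete illustration (a toy example of my own, not from the problem statement), take $m=n=2$ and $f(x_1,x_2)=(x_1x_2,\; x_1+x_2)$. Then
$$ \nabla f = [\partial_1 f, \partial_2 f] = \begin{bmatrix} x_2 & x_1 \\ 1 & 1 \end{bmatrix}, \qquad (\nabla f)^T(\nabla f) = \begin{bmatrix} x_2^2+1 & x_1x_2+1 \\ x_1x_2+1 & x_1^2+1 \end{bmatrix}, $$
and each entry is indeed the dot product $\partial_i f \cdot \partial_j f$.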

We then get $ (\nabla f)^T \in \mathbb{R}^{n\times m} $, so $ (\nabla f)^T(\nabla f) \in \mathbb{R}^{n\times n} $. We can then see the following: \begin{align} (\nabla f)^T(\nabla f) &= \begin{bmatrix}(\partial_1 f)^T \\ \vdots \\ (\partial_n f)^T\end{bmatrix}[\partial_1 f,\ldots, \partial_n f]\\[2mm] &= \begin{bmatrix} (\partial_1 f)^T\partial_1 f & \cdots & (\partial_1 f)^T\partial_n f\\ \vdots & \ddots & \vdots \\ (\partial_n f)^T\partial_1 f & \cdots & (\partial_n f)^T\partial_n f \end{bmatrix} \\[2mm] &= \begin{bmatrix} \partial_1 f\cdot\partial_1 f & \cdots & \partial_1 f\cdot \partial_n f\\ \vdots & \ddots & \vdots \\ \partial_n f\cdot\partial_1 f & \cdots & \partial_n f\cdot\partial_n f \end{bmatrix} \end{align} so the matrix has components $[(\nabla f)^T(\nabla f)]_{ij}=\partial_i f\cdot \partial_j f $. Let $c=|\nabla f|^2/2 + W(f)$.

So then we can write: $$ A + cI = \begin{bmatrix} \partial_1 f\cdot\partial_1 f - c +c & \cdots & \partial_1 f\cdot \partial_n f\\ \vdots & \ddots & \vdots \\ \partial_n f\cdot\partial_1 f & \cdots & \partial_n f\cdot\partial_n f -c+c \end{bmatrix}=(\nabla f)^T(\nabla f) $$
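If you want a quick numerical sanity check, here is a short NumPy sketch (purely illustrative; the choices of $f$, $W$, and the evaluation point below are arbitrary). It builds $A$ entrywise from the dot products $f_{x_i}\cdot f_{x_j}$ and compares $A+cI$ against the matrix product $(\nabla f)^T(\nabla f)$:

```python
import numpy as np

# Toy example (my own choices): f : R^2 -> R^3 and W : R^3 -> R_+.
def f(x):
    return np.array([x[0]**2, x[0] * x[1], np.sin(x[1])])

def W(y):
    return float(y @ y)  # nonnegative, so W maps into R_+

def jacobian(f, x, h=1e-6):
    """Central-difference Jacobian; column i approximates f_{x_i}."""
    n = len(x)
    cols = []
    for i in range(n):
        e = np.zeros(n)
        e[i] = h
        cols.append((f(x + e) - f(x - e)) / (2 * h))
    return np.stack(cols, axis=1)  # shape (m, n), matching grad f above

x = np.array([0.7, -1.3])
J = jacobian(f, x)                # grad f as an m x n matrix
c = 0.5 * np.sum(J**2) + W(f(x))  # c = |grad f|^2 / 2 + W(f)

# Build A entrywise: a_ij = f_{x_i} . f_{x_j} - delta_ij * c
n = 2
A = np.array([[J[:, i] @ J[:, j] - (i == j) * c for j in range(n)]
              for i in range(n)])

print(np.allclose(A + c * np.eye(n), J.T @ J))  # True
```

The check succeeds for any smooth $f$ and any $W$, since the identity holds by the entrywise computation above.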