Given the function $$g(u) = (Au)^T(Au)$$ where $u \in \mathbb{R}^{n}$ and $A$ is an $n \times n$ matrix, find the gradient $\nabla g(u)$.
I tried to expand everything and got that
$$\frac{\partial g}{\partial u_k} = 2\sum_{i=1}^{n}(A_{i1}u_1+A_{i2}u_2+\cdots +A_{in}u_n)A_{ik}$$
but I am not sure how to convert this expression into a matrix representation of the gradient of $g$. Any hints will be much appreciated.
Let $n$ be a positive integer and $\mathbf{A}$ be an $n \times n$ real matrix. Also, let $g \, : \, \mathbb{R}^n \, \rightarrow \, \mathbb{R}$ be the function defined by:
$$ \forall \mathbf{u} \in \mathbb{R}^n, \; g(\mathbf{u}) = \mathbf{u}^{\top} \mathbf{A}^{\top} \mathbf{A} \mathbf{u}. $$
We consider $\mathbb{R}^n$ as a Euclidean space equipped with its canonical structure: the inner product of two vectors $\mathbf{u}, \mathbf{v} \in \mathbb{R}^n$ is given by $\left\langle \mathbf{u}, \mathbf{v} \right\rangle = \mathbf{u}^{\top} \mathbf{v}$. Then, by definition, the gradient $\nabla g(\mathbf{u})$ is the unique vector in $\mathbb{R}^n$ such that $\forall \mathbf{h} \in \mathbb{R}^n, \; Dg(\mathbf{u})(\mathbf{h}) = \left\langle \mathbf{h}, \nabla g(\mathbf{u}) \right\rangle$, where $Dg(\mathbf{u}) \, : \, \mathbb{R}^n \, \rightarrow \, \mathbb{R}$ is the differential of $g$ at $\mathbf{u}$, a linear form on $\mathbb{R}^n$.
You may compute the differential $Dg(\mathbf{u})$ by expanding $g(\mathbf{u} + \mathbf{h})$. Here, we have:
$$ \forall \mathbf{h} \in \mathbb{R}^n, \; Dg(\mathbf{u})(\mathbf{h}) = \mathbf{h}^{\top} \mathbf{A}^{\top} \mathbf{A} \mathbf{u} + \mathbf{u}^{\top} \mathbf{A}^{\top} \mathbf{A} \mathbf{h}. $$
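To spell out where this comes from, here is the expansion written in full (a sketch using only the definitions above):
$$ g(\mathbf{u} + \mathbf{h}) = (\mathbf{u} + \mathbf{h})^{\top} \mathbf{A}^{\top} \mathbf{A} (\mathbf{u} + \mathbf{h}) = g(\mathbf{u}) + \mathbf{h}^{\top} \mathbf{A}^{\top} \mathbf{A} \mathbf{u} + \mathbf{u}^{\top} \mathbf{A}^{\top} \mathbf{A} \mathbf{h} + \mathbf{h}^{\top} \mathbf{A}^{\top} \mathbf{A} \mathbf{h}. $$
The last term equals $\left\| \mathbf{A} \mathbf{h} \right\|^2$, which is bounded by $\left\| \mathbf{A} \right\|^2 \left\| \mathbf{h} \right\|^2$ and is therefore negligible compared to $\left\| \mathbf{h} \right\|$ as $\mathbf{h} \rightarrow \mathbf{0}$. The two middle terms are linear in $\mathbf{h}$, so they make up the differential.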
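Since $\mathbf{u}^{\top} \mathbf{A}^{\top} \mathbf{A} \mathbf{h}$ is a scalar, it equals its own transpose $\mathbf{h}^{\top} \mathbf{A}^{\top} \mathbf{A} \mathbf{u}$, so the two terms coincide. It follows that: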
$$ \forall \mathbf{h} \in \mathbb{R}^n, \; Dg(\mathbf{u})(\mathbf{h}) = \left\langle \mathbf{h}, 2 \mathbf{A}^{\top} \mathbf{A} \mathbf{u} \right\rangle. $$
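If you want to convince yourself numerically, here is a minimal NumPy sketch. It compares the vector $2 \mathbf{A}^{\top} \mathbf{A} \mathbf{u}$ read off from the identity above against the component-wise formula from the question and against a central finite-difference approximation. The function names, the random test data and the step size `eps` are illustrative choices, not part of the problem.

```python
import numpy as np


def g(A, u):
    """Evaluate g(u) = (Au)^T (Au)."""
    Au = A @ u
    return float(Au @ Au)


def grad_matrix_form(A, u):
    """Candidate gradient in matrix form: 2 A^T A u."""
    return 2.0 * A.T @ (A @ u)


def grad_componentwise(A, u):
    """Formula from the question: dg/du_k = 2 * sum_i (sum_j A_ij u_j) * A_ik."""
    n = u.shape[0]
    grad = np.zeros(n)
    for k in range(n):
        for i in range(n):
            grad[k] += 2.0 * (A[i, :] @ u) * A[i, k]
    return grad


def grad_finite_difference(A, u, eps=1e-6):
    """Central finite differences along each coordinate axis."""
    n = u.shape[0]
    grad = np.zeros(n)
    for k in range(n):
        e = np.zeros(n)
        e[k] = eps
        grad[k] = (g(A, u + e) - g(A, u - e)) / (2.0 * eps)
    return grad


# Arbitrary test data (seeded so the run is reproducible).
rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
u = rng.standard_normal(n)

# Both comparisons should report agreement up to floating-point tolerance.
print(np.allclose(grad_matrix_form(A, u), grad_componentwise(A, u)))
print(np.allclose(grad_matrix_form(A, u), grad_finite_difference(A, u), atol=1e-5))
```

The component-wise check also answers the original question directly: the inner sum $\sum_j A_{ij} u_j$ is the $i$-th entry of $\mathbf{A}\mathbf{u}$, and summing it against $A_{ik}$ over $i$ gives the $k$-th entry of $\mathbf{A}^{\top} \mathbf{A} \mathbf{u}$.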
In conclusion: