Vector matrix vector multiplication derivative

I came across the following derivation:

$$ \frac{\partial}{\partial\vec{u}}\left(\vec{u}^T\mathbf{A}^T\mathbf{A}\vec{u}\right) = 2\mathbf{A}^T\mathbf{A}\vec{u}, $$ where $\mathbf{A}$ is an $n \times 3$ matrix and $\vec{u}$ is a $3 \times 1$ column vector.

I am aware that $\frac{\partial}{\partial\vec{u}}\left(\vec{u}^T\vec{u}\right) = 2\vec{u}$, but I cannot figure out which trick is used to get the matrix out of the way, since matrix multiplication does not commute. Thanks

There is 1 answer below.

For any $n \times n$ matrix $M$ we have $$ \vec{u}^T M \vec{u} = \sum_{i=1}^n \sum_{j=1}^n M_{i,j} u_i u_j. $$ Only the terms with $i = k$ or $j = k$ depend on $u_k$, so for any fixed index $k \in \{1, \dots, n\}$ we find that

\begin{align} \frac{\partial}{\partial u_k} \left(\vec{u}^T M \vec{u}\right) &= \sum_{\substack{j=1 \\ j \neq k}}^n M_{k,j} u_j + \sum_{\substack{i=1 \\ i \neq k}}^n M_{i,k} u_i + 2 M_{k,k} u_k \\ &= \sum_{j=1}^n M_{k,j} u_j + \sum_{i=1}^n M_{i,k} u_i. \end{align}

If we denote the $k$th row of $M$ as $\vec{r}_k^T$ and the $k$th column of $M$ as $\vec{c}_k$, then we have just shown that $$ \frac{\partial}{\partial u_k} \left(\vec{u}^T M \vec{u}\right) = \vec{r}_k^T \vec{u} + \vec{c}_k^T \vec{u}. $$ Going through all $k = 1, \dots, n$ we find that $$ \frac{\partial}{\partial \vec{u}} \left(\vec{u}^T M \vec{u}\right) = \begin{pmatrix} \vec{r}_1^T \vec{u} \\ \vdots \\ \vec{r}_n^T \vec{u} \end{pmatrix} + \begin{pmatrix} \vec{c}_1^T \vec{u} \\ \vdots \\ \vec{c}_n^T \vec{u} \end{pmatrix} = \begin{pmatrix} \vec{r}_1^T \\ \vdots \\ \vec{r}_n^T \end{pmatrix} \vec{u} + \begin{pmatrix} \vec{c}_1^T \\ \vdots \\ \vec{c}_n^T \end{pmatrix} \vec{u} = M \vec{u} + M^T \vec{u}. $$
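As a sanity check, the formula can be verified by hand for $n = 2$. The entries $a, b, c, d$ below are placeholder symbols introduced only for this illustration; they do not appear in the question. With $$ M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \qquad \vec{u} = \begin{pmatrix} u_1 \\ u_2 \end{pmatrix}, $$ we get

\begin{align} \vec{u}^T M \vec{u} &= a u_1^2 + (b + c) u_1 u_2 + d u_2^2, \\ \frac{\partial}{\partial \vec{u}} \left(\vec{u}^T M \vec{u}\right) &= \begin{pmatrix} 2 a u_1 + (b + c) u_2 \\ (b + c) u_1 + 2 d u_2 \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \vec{u} + \begin{pmatrix} a & c \\ b & d \end{pmatrix} \vec{u} = M \vec{u} + M^T \vec{u}. \end{align}

Note that the mixed term $(b + c) u_1 u_2$ is where the two off-diagonal entries meet, which is why both $M$ and $M^T$ show up in the gradient.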

In your case, $M = A^T A$ is symmetric, since $(A^T A)^T = A^T (A^T)^T = A^T A$. Hence $M^T \vec{u} = M \vec{u}$, and we find that $$ \frac{\partial}{\partial \vec{u}} \left(\vec{u}^T A^T A \vec{u}\right) = 2 A^T A \vec{u}. $$
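If you want to convince yourself numerically, here is a minimal sketch that compares both formulas against a central finite-difference gradient. It assumes NumPy is available; the function names (`numeric_grad`, `quad_form`), the matrix sizes, the random seed and the step size `h` are arbitrary choices made for this illustration, not part of the derivation.

```python
import numpy as np

def quad_form(M, u):
    """Return the scalar u^T M u."""
    return float(u @ M @ u)

def numeric_grad(f, u, h=1e-6):
    """Central finite-difference approximation of the gradient of f at u."""
    grad = np.zeros_like(u)
    for k in range(u.size):
        e = np.zeros_like(u)
        e[k] = h  # perturb only the k-th component
        grad[k] = (f(u + e) - f(u - e)) / (2.0 * h)
    return grad

rng = np.random.default_rng(0)    # fixed seed, chosen arbitrarily
u = rng.standard_normal(3)

# General case: M need not be symmetric, gradient is M u + M^T u.
M = rng.standard_normal((3, 3))
g_general = M @ u + M.T @ u
g_general_fd = numeric_grad(lambda v: quad_form(M, v), u)

# Special case: M = A^T A is symmetric, gradient is 2 A^T A u.
A = rng.standard_normal((5, 3))   # illustrative size; A must have 3 columns here
g_sym = 2.0 * A.T @ (A @ u)
g_sym_fd = numeric_grad(lambda v: quad_form(A.T @ A, v), u)

print(np.allclose(g_general, g_general_fd, atol=1e-6))
print(np.allclose(g_sym, g_sym_fd, atol=1e-6))
```

Because the function is quadratic in $\vec{u}$, the central difference has no truncation error, so the two gradients should agree up to floating-point rounding.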