Derivative Calculation of vector and matrix with transpose


How do I calculate the derivative of $f(x) = v^T(Ax)$? I have seen it written as $f'(x) = A^Tv$, but I do not understand why the transpose moves from $v$ to $A$. Here $v \in \mathbb{R}^{p\times 1}$ is a vector and $A \in \mathbb{R}^{p\times n}$ is a matrix.


There are 2 best solutions below


$\newcommand{\R}{\Bbb{R}}$ Use the following fact:

If $f(x) = cx$ where $c\in \R^{1\times n}$ is a constant row vector and $x \in\R^n$, then $f'(x) = c^T \in \R^n$.

In your case, $c = v^TA$, so the derivative is $c^T = (v^TA)^T = A^Tv$.


To see why that fact is true, note that $f'(x)$ refers to the column vector whose $i$-th component is $\frac{\partial f}{\partial x_i}$ for all $i=1,\ldots,n$ (where $x_i$ is the $i$-th component of $x$). Write $c = \begin{bmatrix}c_1 & c_2 & \cdots & c_n\end{bmatrix}$ and $x = \begin{bmatrix}x_1 \\ x_2 \\ \vdots \\ x_n\end{bmatrix}$. We have $f(x) = cx = c_1x_1 + c_2 x_2 + \cdots + c_n x_n$, so $\frac{\partial f}{\partial x_i} = c_i$. Hence for each $i=1,\ldots,n$, the $i$-th component of $f'(x)$ is $c_i$. This means that $f'(x)$ (written as a column vector) is just $\begin{bmatrix}c_1 \\ c_2 \\ \vdots \\ c_n\end{bmatrix} = c^T$.
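If it helps to see the result numerically, here is a minimal sketch in plain Python that compares $A^Tv$ with a finite-difference estimate of the partial derivatives of $f(x) = v^T(Ax)$. The function names, the step size `h`, and the sample values of `A`, `v`, and `x` are all assumptions made up for this illustration.

```python
def f(A, v, x):
    """Evaluate f(x) = v^T (A x); A is a list of rows, v and x are lists."""
    Ax = [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]
    return sum(v_i * y_i for v_i, y_i in zip(v, Ax))


def analytic_gradient(A, v):
    """Return A^T v: entry j is the sum over i of A[i][j] * v[i]."""
    p, n = len(A), len(A[0])
    return [sum(A[i][j] * v[i] for i in range(p)) for j in range(n)]


def numeric_gradient(A, v, x, h=1e-6):
    """Estimate each partial derivative df/dx_j by a central difference."""
    grad = []
    for j in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[j] += h
        x_minus[j] -= h
        grad.append((f(A, v, x_plus) - f(A, v, x_minus)) / (2 * h))
    return grad


# Hypothetical example data with p = 2 and n = 3.
A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
v = [1.0, -1.0]
x = [0.5, -2.0, 1.5]

print(analytic_gradient(A, v))    # [-3.0, -3.0, -3.0]
print(numeric_gradient(A, v, x))  # should agree up to rounding error
```

Since $f$ is linear in $x$, the estimate should not depend on the point `x` chosen, which matches the fact that $f'(x) = A^Tv$ contains no $x$.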


The transposition is a matter of convention. Usually people write the gradient as a column vector. $v^TA$ will have only one row, so it is a row vector. $A^Tv$ will have the same elements, but as a column vector. The convention is important only because we must be consistent. This is discussed (at a somewhat advanced level) in the Wikipedia page on matrix calculus:

https://en.wikipedia.org/wiki/Matrix_calculus#Layout_conventions
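As a small worked illustration of the two layouts (the sizes $p=2$, $n=3$ and the entry names $a_{ij}$ are assumptions chosen only for this example), the same three partial derivatives $\frac{\partial f}{\partial x_j} = v_1a_{1j} + v_2a_{2j}$ can be arranged as a row or as a column:

$$
v^TA = \begin{bmatrix} v_1a_{11}+v_2a_{21} & v_1a_{12}+v_2a_{22} & v_1a_{13}+v_2a_{23} \end{bmatrix} \in \mathbb{R}^{1\times 3},
\qquad
A^Tv = \begin{bmatrix} v_1a_{11}+v_2a_{21} \\ v_1a_{12}+v_2a_{22} \\ v_1a_{13}+v_2a_{23} \end{bmatrix} \in \mathbb{R}^{3\times 1}.
$$

Both contain exactly the same entries; only the shape differs, and $A^Tv$ is the one you get when the gradient is written as a column vector.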