Derivatives of Matrices and Vectors


I am currently studying deep learning, and much of the calculus involved in differentiating products or sums of ill-defined operations on matrices and vectors is very confusing.

For instance, take this example:

Let $X$ be an $N \times D$ matrix, and let $b$ be a $D$-dimensional row vector. A sum such as $X + b$ isn't defined in the usual linear-algebra sense, but by convention we add $b$ entrywise to each row of $X$.

So, let $A = X + b$ as defined above. Let $dX$ also be an $N \times D$ sized matrix.
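This row-wise convention is exactly NumPy's broadcasting rule. A minimal sketch (the shapes $N=3$, $D=2$ and the values are arbitrary illustrative choices):

```python
import numpy as np

# Illustrative shapes; any N and D work the same way.
N, D = 3, 2
X = np.arange(N * D, dtype=float).reshape(N, D)  # N x D matrix
b = np.array([10.0, 20.0])                       # D-dimensional row vector

# Broadcasting adds b to every row of X.
A = X + b

# Equivalent explicit form using a vector of ones: A = X + 1 b^T
ones = np.ones((N, 1))
A_explicit = X + ones @ b.reshape(1, D)
print(np.allclose(A, A_explicit))  # True
```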

I want to find the product $$\frac{dA}{db} \cdot dX,$$ which I expect to have size $D \times 1$ (or the transpose, if my ordering is wrong).

So, is there a reference or book I can read to make rigorous sense of such sums and products, rather than relying mostly on intuition?



Let $1 \in {\mathbb R}^{N}$ denote the vector of ones of length $N$.

Then you can express $A$ and its differential as $$\eqalign{ A &= X + 1\,b^T \cr dA &= dX + 1\,db^T \cr }$$ Since $\,\frac{\partial b^T}{\partial b}=I,\,$ (the identity matrix) the gradient of $A$ wrt $b$ is the third-order tensor $$\eqalign{ \frac{\partial A}{\partial b} &= 1\,I \cr }$$ and the product of this tensor with $X$ is another third-order tensor $$\eqalign{ \frac{\partial A}{\partial b}\,dX &= 1\,dX \cr }$$