Derivative of matrix vector product


Suppose I have a vector $\mathbf{e} = \mathbf{Lx}$, with $\mathbf{e} \in \mathbb{R}^m$, $\mathbf{L} \in \mathbb{R}^{m \times m}$, $\mathbf{x} \in \mathbb{R}^m$, and I want to take its derivative with respect to a third vector $\boldsymbol{\theta} \in \mathbb{R}^p$.

Both $\mathbf{L} = \mathbf{L}(\boldsymbol{\theta})$ and $\mathbf{x} = \mathbf{x}(\boldsymbol{\theta})$ depend on $\boldsymbol{\theta}$, so by the product rule the derivative is:

$$ \frac{d\mathbf{e}}{d\boldsymbol{\theta}} = \frac{d\mathbf{L}}{d\boldsymbol{\theta}} \mathbf{x} + \mathbf{L} \frac{d\mathbf{x}}{d\boldsymbol{\theta}}. $$

The Jacobian $\frac{d\mathbf{x}}{d\boldsymbol{\theta}} \in \mathbb{R}^{m \times p}$, left-multiplied by $\mathbf{L}$, correctly gives an $m \times p$ matrix for the final Jacobian.

My question now is: what does $\frac{d\mathbf{L}}{d\boldsymbol{\theta}}$ look like? I found some posts and articles that use the vectorization function $$ \frac{\mathrm{d}\operatorname{vech}\left(\mathbf{L}\right)}{\mathrm{d}\operatorname{vech}\left(\mathbf{A}\right)} $$

($\mathbf{A} = g(\boldsymbol{\theta})$ is an intermediary result that I use)

but I don't know what kind of tensor would have a form that can actually produce the correctly shaped $m \times p$ Jacobian of the final result. As far as I can see, right-multiplying by the vector $\mathbf{x} \in \mathbb{R}^{m \times 1}$ always produces a column vector.

Best answer

If you need to deal with tensors, I recommend using some sort of index notation:

$$ \begin{align} \left[\frac{\partial\mathbf e}{\partial\boldsymbol\theta}\right]_{ij} &= \frac{\partial e_i}{\partial\theta_j} \\ &= \frac{\partial}{\partial\theta_j}\left[\mathbf{Lx}\right]_i \\ &= \frac{\partial}{\partial\theta_j}\left(L_{ik}x_k\right) \\ &= \frac{\partial L_{ik}}{\partial\theta_j}x_k + L_{ik}\frac{\partial x_k}{\partial\theta_j}. \end{align} $$

(Note the convention that repeated indices imply summation.)
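Read this way, $\frac{\partial L_{ik}}{\partial\theta_j}$ is a third-order tensor of shape $m \times m \times p$, and contracting it with $\mathbf{x}$ over the index $k$ leaves an $m \times p$ matrix, which matches the shape of the second term. A minimal NumPy sketch of that contraction follows. The maps `L_of` and `x_of`, and the arrays `A` and `B`, are hypothetical choices made only for illustration (they are not the asker's functions, and this `A` is unrelated to the intermediate $\mathbf{A}$ in the question); the result is checked against central finite differences.

```python
import numpy as np

m, p = 3, 2
rng = np.random.default_rng(0)

# Hypothetical smooth maps, chosen only for illustration:
#   L(theta)[i, k] = sin(sum_j A[i, k, j] * theta[j]),   x(theta) = B @ theta
A = rng.normal(size=(m, m, p))
B = rng.normal(size=(m, p))

def L_of(theta):
    return np.sin(A @ theta)              # shape (m, m)

def x_of(theta):
    return B @ theta                      # shape (m,)

def e_of(theta):
    return L_of(theta) @ x_of(theta)      # shape (m,)

def jacobian_e(theta):
    # dL[i, k, j] = dL_ik / dtheta_j: third-order tensor, shape (m, m, p)
    dL = np.cos(A @ theta)[:, :, None] * A
    # dx[k, j] = dx_k / dtheta_j: ordinary Jacobian, shape (m, p)
    dx = B
    # First term: contract dL with x over the repeated index k -> (m, p)
    term1 = np.einsum("ikj,k->ij", dL, x_of(theta))
    # Second term: plain matrix product L @ dx -> (m, p)
    term2 = L_of(theta) @ dx
    return term1 + term2

theta = rng.normal(size=p)
J = jacobian_e(theta)

# Central finite differences, one column of the Jacobian per theta_j
eps = 1e-6
J_fd = np.stack(
    [(e_of(theta + eps * np.eye(p)[j]) - e_of(theta - eps * np.eye(p)[j])) / (2 * eps)
     for j in range(p)],
    axis=1,
)

print(J.shape)
print(np.allclose(J, J_fd, atol=1e-6))
```

The `einsum` subscript string `"ikj,k->ij"` is the index expression $\frac{\partial L_{ik}}{\partial\theta_j}x_k$ written out literally: the repeated index `k` is summed, and the free indices `i` and `j` give the $m \times p$ result.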