Flattening matrix derivatives


How do you turn a matrix derivative into a corresponding vector derivative? More concretely, I am working on a problem where I need the derivative of $f:\mathbb R^{p\times p} \to \mathbb R^p$, $f(X) = X^{-1}b$. Write $h(X) = X^{-1}$ and $g(X) = Xb$, so that $f = g \circ h$. By the chain rule, $Df(X) = Dg(h(X)) \circ Dh(X)$. Since $g$ is linear and $Dh(X): U \mapsto -X^{-1}UX^{-1}$, this gives $$Df(X): U \mapsto -X^{-1}UX^{-1} b$$

However, for a downstream application, I would like to put this derivative in matrix form by identifying it with a linear transformation $\mathbb R^{p^2} \to \mathbb R^p$, or in other words, with a matrix in $\mathbb R^{p \times p^2}$. Is there a standard way to deal with problems of this sort (or, if there is no standard way, is this particular problem structured enough to have a solution)? What resources should I consult?

Edit: As pointed out in one of the answers, a naive approach would be to apply the transformation to each of the "basis" matrices $E_{ij}$, which have a 1 in the $(i,j)$-th entry and zeros elsewhere. However, that approach requires $p^2$ separate evaluations of $Df(X)$, each involving matrix products, so I am really trying to figure out whether this can be reduced to a nicer closed-form formula.


There is 1 best solution below


I would proceed as follows:

  • Consider a basis of the space of matrices $\mathbb R^{p \times p}$. Take for example $(E_{ij})_{\substack{1 \le i \le p \\ 1 \le j \le p}}$, where $E_{ij}$ is the matrix whose entries all vanish except the one in the $i$-th row and $j$-th column, which is equal to one.
  • For each pair $(i,j)$, $Df(X)(E_{ij})$ is a vector $v_{ij} \in \mathbb R^p$.
  • Define the entry of your expected matrix $M \in \mathbb R^{p \times p^2}$ in row $l$ and in the column indexed by the pair $(i,j)$ to be the $l$-th coordinate of $v_{ij}$. In other words, $v_{ij}$ is the column of $M$ indexed by $(i,j)$, for whatever ordering of the pairs you fix.
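The steps above can be sketched numerically. The following is a minimal sketch in NumPy, not a definitive implementation. It assumes the derivative of the inverse carries a minus sign, so that $Df(X)(U) = -X^{-1}UX^{-1}b$, and it assumes the pairs $(i,j)$ are ordered column-major (the column index of $M$ is $jp + i$ with zero-based $i,j$). Under that ordering, the standard identity $\operatorname{vec}(AUB) = (B^\top \otimes A)\operatorname{vec}(U)$ suggests the closed form $M = -\big((X^{-1}b)^\top \otimes X^{-1}\big)$, which the code compares against the basis-by-basis construction. The function names `df`, `jacobian_by_basis` and `jacobian_by_kron` are illustrative only.

```python
import numpy as np

def df(X, b, U):
    """Directional derivative of f(X) = X^{-1} b along U: -X^{-1} U X^{-1} b."""
    y = np.linalg.solve(X, b)           # y = X^{-1} b
    return -np.linalg.solve(X, U @ y)   # -X^{-1} (U y)

def jacobian_by_basis(X, b):
    """Build M column by column by applying Df(X) to each basis matrix E_ij."""
    p = X.shape[0]
    M = np.zeros((p, p * p))
    for j in range(p):
        for i in range(p):
            E = np.zeros((p, p))
            E[i, j] = 1.0
            # Assumed ordering: column-major, so the pair (i, j) maps to j*p + i.
            M[:, j * p + i] = df(X, b, E)
    return M

def jacobian_by_kron(X, b):
    """Closed form from vec(A U B) = (B^T kron A) vec(U) with column-major vec."""
    X_inv = np.linalg.inv(X)
    y = X_inv @ b                       # y = X^{-1} b
    return -np.kron(y[None, :], X_inv)  # shape (p, p^2)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p = 4
    # A well-conditioned test matrix (a hypothetical example input).
    X = 5.0 * np.eye(p) + 0.5 * rng.standard_normal((p, p))
    b = rng.standard_normal(p)
    M_basis = jacobian_by_basis(X, b)
    M_kron = jacobian_by_kron(X, b)
    print("max abs difference:", np.abs(M_basis - M_kron).max())
```

With this ordering, `M @ U.flatten(order="F")` reproduces `df(X, b, U)` for any `U`. If a row-major flattening is preferred instead, the Kronecker factors swap roles, so the ordering convention should be fixed before using either formula downstream.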