Derivative of a Matrix w.r.t. its Matrix Square, $\frac{\partial \text{vec}X}{\partial\text{vec}(XX')}$


Let $X$ be a nonsingular square matrix.

What is $$ \frac{\partial \text{vec}X}{\partial\text{vec}(XX')}, $$ where the vec operator stacks all columns of a matrix in a single column vector?

It is easy to derive that $$ \frac{\partial\text{vec}(XX')}{\partial \text{vec}X} = (I + K)(X \otimes I), $$ where $K$ is the commutation matrix that is defined by $$ \text{vec}(X) = K\text{vec}(X'). $$
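As a quick numerical sanity check of this formula (not part of the original post; the size $n=3$ and the explicit construction of $K$ are illustrative assumptions), the following NumPy sketch compares $(I+K)(X \otimes I)$ against a finite-difference Jacobian of $\text{vec}(XX')$:

```python
import numpy as np

n = 3
rng = np.random.default_rng(0)
X = rng.standard_normal((n, n))

def vec(A):
    # stack the columns of A into one long vector
    return A.reshape(-1, order="F")

# commutation matrix: K @ vec(A) == vec(A.T)
K = np.zeros((n * n, n * n))
for i in range(n):
    for j in range(n):
        K[j + i * n, i + j * n] = 1.0

# claimed Jacobian d vec(X X') / d vec(X)
J = (np.eye(n * n) + K) @ np.kron(X, np.eye(n))

# central finite differences of f(X) = vec(X X'), exact up to
# rounding since f is quadratic in X
f = lambda M: vec(M @ M.T)
eps = 1e-6
J_fd = np.zeros((n * n, n * n))
for k in range(n * n):
    E = np.zeros((n, n))
    E[k % n, k // n] = eps  # perturb the k-th entry of vec(X)
    J_fd[:, k] = (f(X + E) - f(X - E)) / (2 * eps)

print(np.allclose(J, J_fd, atol=1e-8))  # True
```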

Now $(I + K)(X \otimes I)$ is a singular matrix, so that the intuitive solution $$ \frac{\partial \text{vec}X}{\partial\text{vec}(XX')} = \left( \frac{\partial\text{vec}(XX')}{\partial \text{vec}X} \right)^{-1} $$ does not work.

Is the solution simply the Moore-Penrose inverse of $(I + K)(X \otimes I)$, or is it more complicated?
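The rank deficiency can be made concrete: $K$ is symmetric and involutory, so $I + K$ has eigenvalues $0$ and $2$, with the $2$-eigenspace (vecs of symmetric matrices) of dimension $n(n+1)/2$. Since $X \otimes I$ is nonsingular, $(I+K)(X \otimes I)$ has rank $n(n+1)/2 < n^2$. A short sketch (same illustrative setup as above) showing the rank and that the Moore-Penrose inverse is a one-sided pseudoinverse only, not a true inverse:

```python
import numpy as np

n = 3
rng = np.random.default_rng(0)
X = rng.standard_normal((n, n))

# commutation matrix: K @ vec(A) == vec(A.T)
K = np.zeros((n * n, n * n))
for i in range(n):
    for j in range(n):
        K[j + i * n, i + j * n] = 1.0

J = (np.eye(n * n) + K) @ np.kron(X, np.eye(n))
Jp = np.linalg.pinv(J)

print(np.linalg.matrix_rank(J))          # 6 = n(n+1)/2, not n^2 = 9
print(np.allclose(Jp @ J, np.eye(n * n)))  # False: no true inverse
print(np.allclose(J @ Jp @ J, J))          # True: Moore-Penrose property
```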

Best answer:

Consider the following QR, SVD, and Cholesky factorizations. $$\eqalign{ X^T &= QR \quad&\implies\quad A = R^T \\ X &= U\Sigma V^T \quad&\implies\quad B = U\Sigma \\ XX^T &= LL^T \\ }$$ Thus there are (at least) four ways to write the product
$$\eqalign{ P = XX^T = AA^T = BB^T = LL^T \\ }$$ In other words, given $P$ and the functional form $XX^T$, there is no way to uniquely determine $X$, which is why the inverse of your gradient does not exist.
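The non-uniqueness above is easy to verify numerically. This sketch (illustrative $3 \times 3$ example, not from the original answer) constructs $A$, $B$, and $L$ from the QR, SVD, and Cholesky factorizations and checks that each reproduces $P$ while generically differing from $X$:

```python
import numpy as np

n = 3
rng = np.random.default_rng(0)
X = rng.standard_normal((n, n))
P = X @ X.T

# QR of X^T: X^T = QR, so P = R^T R, i.e. A = R^T
Q, R = np.linalg.qr(X.T)
A = R.T

# SVD: X = U Sigma V^T, so P = (U Sigma)(U Sigma)^T, i.e. B = U Sigma
U, s, Vt = np.linalg.svd(X)
B = U * s  # scales column j of U by the singular value s[j]

# Cholesky of P directly: P = L L^T
L = np.linalg.cholesky(P)

for F in (A, B, L):
    # each factor reproduces P, yet none of them equals X
    print(np.allclose(F @ F.T, P), np.allclose(F, X))  # True False
```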

On the other hand, given $X$ one can calculate $XX^T$ unambiguously, which is why your first gradient exists.