Jacobian of the matrix transformation $Y = X^T X$

Take $Y=X^TX$, where $X$ is $d\times d$. The differential $\mathrm{d}Y$ can be computed as follows:

\begin{equation} \mathrm{d}Y = \mathrm{d}(X^T)X + X^T\mathrm{d}(X) \end{equation}

Using the property

\begin{equation} \mathrm{vec}(ABC) = (C^T\otimes A)\mathrm{vec}B \end{equation}

where vec() denotes the vectorization operation, this becomes

\begin{equation} \mathrm{dvec}(Y) = (X^T\otimes I_d)\mathrm{dvec}(X^T) + (I_d\otimes X^T)\mathrm{dvec}(X) \end{equation}

To handle $\mathrm{dvec}(X^T)$, we will need the commutation matrix. The $mn\times mn$ commutation matrix $K_{mn}$ is defined to be the unique permutation matrix such that, for an $m\times n$ matrix $A$,

\begin{equation} K_{mn}\mathrm{vec}(A) = \mathrm{vec}(A^T) \end{equation}

The commutation matrix also has the following properties (where $A$ is $m\times n$ and $B$ is $p\times q$):

\begin{equation} K_{nm}K_{mn} = I_{mn} ~~~~~~~~~~\mathrm{and}~~~~~~~~~~ (A\otimes B)K_{nq} = K_{mp}(B\otimes A) \end{equation}
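For anyone who wants to check these identities numerically, here is a minimal sketch in plain Python with exact integer arithmetic. The helper names (`vec`, `kron`, `commutation`, and so on) and the example matrices are my own choices for illustration, not taken from the references; `vec` stacks columns, which is the convention assumed throughout.

```python
def vec(A):
    """Stack the columns of A (given as a list of rows) into one flat list."""
    rows, cols = len(A), len(A[0])
    return [A[i][j] for j in range(cols) for i in range(rows)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def kron(A, B):
    """Kronecker product of A and B."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def commutation(m, n):
    """K_mn: the permutation matrix with K_mn vec(A) = vec(A^T) for m x n A."""
    K = [[0] * (m * n) for _ in range(m * n)]
    for i in range(m):
        for j in range(n):
            # A[i][j] sits at index j*m + i in vec(A) and at i*n + j in vec(A^T).
            K[i * n + j][j * m + i] = 1
    return K

# Arbitrary example matrices: A is m x n, B is p x q.
A = [[1, 2, 3], [4, 5, 6]]
B = [[1, 0], [2, -1], [0, 3], [5, 7]]
m, n, p, q = 2, 3, 4, 2

defining_ok = matvec(commutation(m, n), vec(A)) == vec(transpose(A))
inverse_ok = matmul(commutation(n, m), commutation(m, n)) == identity(m * n)
swap_ok = (matmul(kron(A, B), commutation(n, q))
           == matmul(commutation(m, p), kron(B, A)))
print(defining_ok, inverse_ok, swap_ok)
```

All three checks use integers only, so equality is exact rather than up to rounding.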

Using these results, our differential becomes

\begin{equation} \mathrm{dvec}(Y) = (K_{dd}+I_{d^2})(I_d\otimes X^T)\mathrm{dvec}(X) \end{equation}
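As a sanity check on this expression, the sketch below compares $(K_{dd}+I_{d^2})(I_d\otimes X^T)$ with the matrix obtained directly from $\mathrm{d}Y = \mathrm{d}(X^T)X + X^T\mathrm{d}(X)$ by perturbing one entry of $X$ at a time. The function names and the test matrix are assumptions made for illustration; everything is plain Python with integer entries, so the comparison is exact.

```python
def vec(A):
    """Stack the columns of A (given as a list of rows) into one flat list."""
    return [A[i][j] for j in range(len(A[0])) for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def kron(A, B):
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def commutation(d):
    """K_dd for square d x d matrices."""
    K = [[0] * (d * d) for _ in range(d * d)]
    for i in range(d):
        for j in range(d):
            K[i * d + j][j * d + i] = 1
    return K

def jacobian_formula(X):
    """(K_dd + I)(I_d kron X^T), the matrix multiplying dvec(X)."""
    d = len(X)
    K, I = commutation(d), identity(d * d)
    K_plus_I = [[k + e for k, e in zip(rk, re)] for rk, re in zip(K, I)]
    return matmul(K_plus_I, kron(identity(d), transpose(X)))

def jacobian_direct(X):
    """Columns are vec(E^T X + X^T E), one per unit perturbation E of X."""
    d = len(X)
    cols = []
    for j in range(d):              # column-major order, matching vec
        for i in range(d):
            E = [[int((r, c) == (i, j)) for c in range(d)] for r in range(d)]
            left = matmul(transpose(E), X)
            right = matmul(transpose(X), E)
            dY = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(left, right)]
            cols.append(vec(dY))
    return transpose(cols)

X = [[1, 2, 0], [3, 5, -1], [4, 0, 2]]   # arbitrary 3 x 3 example
formulas_agree = jacobian_formula(X) == jacobian_direct(X)
print(formulas_agree)
```

Because $Y$ is quadratic in $X$, the term $\mathrm{d}(X^T)X + X^T\mathrm{d}(X)$ is exactly the linear part of the change in $Y$, so no finite-difference step size is needed.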

which further implies that

\begin{equation} \mathrm{d}Y = |(K_{dd}+I_{d^2})(I_d\otimes X^T)|\mathrm{d}X = 2^{d^2}\bigg|\frac{1}{2}(K_{dd}+I_{d^2})\bigg||X|^d \mathrm{d}X \end{equation}

This is where my problem comes in. The matrix $M = \frac{1}{2}(K_{dd}+I_{d^2})$ is idempotent and not equal to the identity, so $|M|=0$. It doesn't seem correct that $\mathrm{d}Y$ would equal zero, so I must be misunderstanding something. Any help catching my error would be appreciated.

References:

  • Mathai, A. M., Jacobians of Matrix Transformation and Functions of Matrix Arguments
  • Abadir, K.M., Magnus, J.R., Matrix Algebra

Best answer

$ \def\LR#1{\left(#1\right)} \def\op#1{\operatorname{#1}} \def\vecc#1{\op{vec}\LR{#1}} \def\qiq{\quad\implies\quad} \def\qnq{\quad\implies\quad} \def\p{\partial} \def\grad#1#2{\frac{\p #1}{\p #2}} $There is nothing wrong with your analysis: the determinant really is zero. Here's another way to think about the problem.

Given a matrix $X$ with no special structure, construct the symmetric matrix $Y=X^TX$.

Now consider the Cholesky factorization (triangular factors) $$\eqalign{ Y &= LL^T = U^TU \qiq U=L^T \qquad\qquad \\ }$$ or the Singular Value factorization (orthogonal factors) $$\eqalign{ Y &= VDV^T = W^TW \qiq W = D^{1/2}V^T \\ }$$ or an Eigenvalue factorization...

There are many factorizations of the form $Y=M^TM\:$ but the factors are all very different. So while $Y$ is a function of $X$, it is not correct to think of $X$ as a function of $Y$.
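A small illustration of this non-uniqueness, as a sketch with made-up numbers: multiplying $X$ on the left by any orthogonal matrix $Q$ gives a different factor with the same $Y$. The matrices `X` and `Q` below are arbitrary choices of mine (here `Q` is a 90-degree rotation).

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

X = [[1, 2], [3, 5]]
Q = [[0, -1], [1, 0]]            # orthogonal: Q^T Q = I

X_rotated = matmul(Q, X)         # a different factor
Y_original = matmul(transpose(X), X)
Y_rotated = matmul(transpose(X_rotated), X_rotated)

print(X_rotated)                 # differs from X
print(Y_original == Y_rotated)   # same Y from both factors
```

Since $(QX)^T(QX) = X^TQ^TQX = X^TX$, no function of $Y$ alone can tell `X` and `X_rotated` apart.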

Looking at the vectorized differential in terms of the Jacobian matrix $J$ $$\eqalign{ y &= \vecc{Y},\quad x=\vecc{X} \qiq dy = J\:dx \\ }$$ It shouldn't be too surprising that $J^{-1}$ does not exist, since $x$ is not a function of $y$.
However, this does not mean that $J$ is equal to zero.
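To make this concrete, here is a rough sketch that builds $J$ entry by entry from $\partial Y_{rc}/\partial X_{ij} = \delta_{rj}X_{ic} + \delta_{cj}X_{ir}$ and computes its rank exactly. The helper names and the two invertible example matrices are assumptions of mine. For these examples the rank comes out as $d(d+1)/2$, the number of independent entries of a symmetric $d\times d$ matrix, which is less than $d^2$ whenever $d\ge 2$.

```python
from fractions import Fraction

def jacobian(X):
    """J with dvec(Y) = J dvec(X) for Y = X^T X, using column-major vec."""
    d = len(X)
    # Column (i, j) of J is vec of the matrix whose (r, c) entry is
    # [r == j] * X[i][c] + [c == j] * X[i][r].
    cols = [[(r == j) * X[i][c] + (c == j) * X[i][r]
             for c in range(d) for r in range(d)]
            for j in range(d) for i in range(d)]
    return [list(row) for row in zip(*cols)]

def rank(M):
    """Rank by Gaussian elimination in exact rational arithmetic."""
    rows = [[Fraction(v) for v in row] for row in M]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((k for k in range(r, len(rows)) if rows[k][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for k in range(r + 1, len(rows)):
            f = rows[k][c] / rows[r][c]
            rows[k] = [a - f * b for a, b in zip(rows[k], rows[r])]
        r += 1
    return r

X2 = [[1, 2], [3, 5]]                       # invertible 2 x 2 example
X3 = [[1, 2, 0], [3, 5, -1], [4, 0, 2]]     # invertible 3 x 3 example
J2, J3 = jacobian(X2), jacobian(X3)

rank2, rank3 = rank(J2), rank(J3)
has_nonzero_entry = any(v != 0 for row in J2 for v in row)
print(rank2, rank3, has_nonzero_entry)
```

So $J$ is a nonzero $d^2\times d^2$ matrix that is rank-deficient for $d\ge 2$: its determinant vanishes, but the map $\mathrm{d}x \mapsto \mathrm{d}y$ is far from trivial.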