What is the differential of $X'X$?


Let $X = (x_{ij})$ be an $n\times n$ matrix of real variables. Could you tell me $\text{d}(X'X)$ when $X$ has full column rank?


[Update]

Honestly speaking, what I have got is: $$ \text{d}(X'X) = (\text{d} X')X + X'\text{d}X. \tag{1} $$
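As a sanity check, (1) can be verified numerically against a small perturbation; a minimal NumPy sketch (sizes and seed are arbitrary), with agreement holding up to second-order terms:

```python
import numpy as np

# Numerical sanity check of (1): d(X'X) = (dX')X + X'(dX).
# Compare the actual change in X'X under a small perturbation dX
# against the first-order formula; the residual is O(||dX||^2).
rng = np.random.default_rng(0)
n = 4
X = rng.standard_normal((n, n))
dX = 1e-6 * rng.standard_normal((n, n))

lhs = (X + dX).T @ (X + dX) - X.T @ X   # actual change in X'X
rhs = dX.T @ X + X.T @ dX               # first-order differential (1)

assert np.allclose(lhs, rhs, atol=1e-10)
```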

But the textbook *Matrix Differential Calculus with Applications in Statistics and Econometrics* seems to give a different result on page 171, Exercise 3, as stated below:

Show that $\text{d} \log |X'X| = 2 \text{ tr}(X'X)^{-1}X'\text{d}X$ at every point where $X$ has full column rank.

As I know, $\log|X'X| = \text{tr}\log(X'X)$, so: $$ \begin{align} \text{d} \log |X'X| &= \text{d}\,\text{tr}\big(\log(X'X)\big) \\ & = \text{tr}\big(\text{d}\log(X'X)\big) \\ & = \text{tr}\big((X'X)^{-1}\text{d}(X'X)\big) \\ \end{align} $$

With the result in (1), I cannot obtain $\text{d} \log |X'X| = 2 \text{ tr}(X'X)^{-1}X'\text{d}X$ as stated above. Could you help me solve this?
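For what it's worth, the textbook's identity does hold numerically for a full-column-rank rectangular $X$; a small NumPy sketch (sizes and seed are arbitrary):

```python
import numpy as np

# Numerical check of d log|X'X| = 2 tr((X'X)^{-1} X' dX)
# for a tall full-column-rank X (illustrative sizes).
rng = np.random.default_rng(1)
m, n = 6, 3
X = rng.standard_normal((m, n))
dX = 1e-6 * rng.standard_normal((m, n))

def logdet_XtX(M):
    # log-determinant of M'M via slogdet for numerical stability
    return np.linalg.slogdet(M.T @ M)[1]

lhs = logdet_XtX(X + dX) - logdet_XtX(X)                 # actual change
rhs = 2 * np.trace(np.linalg.inv(X.T @ X) @ X.T @ dX)    # claimed differential

assert np.isclose(lhs, rhs, atol=1e-9)
```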


BEST ANSWER

After some exploration, I intend to answer my question.

Let's assume $X$ is an $m\times n$ matrix with $m \leq n$ (and full row rank, so that $XX'$ is invertible).

First we have $$ {\rm d} XX' = ({\rm d}X)X' + X({\rm d}X'). \tag{1} $$

Then

$$ \begin{align} {\rm d}\, |XX'| &= {\rm tr} \big( (XX')^\# \,{\rm d} (XX') \big) \\ & = {\rm tr} \Big( (XX')^\# \big(({\rm d}X)X' + X({\rm d}X')\big)\Big)\\ & = {\rm tr} \Big( (XX')^\# ({\rm d}X)X'\Big) + {\rm tr}\Big( (XX')^\#X\,{\rm d}X'\Big)\\ & = {\rm tr} \Big( X'(XX')^\# \,{\rm d}X\Big) + {\rm tr}\Big(X' (XX')^\#\,{\rm d}X \Big)\\ & = 2\, {\rm tr} \Big( X'(XX')^\# \,{\rm d}X\Big), \end{align} $$ where $(XX')^\#$ is the adjugate of $XX'$, and the second trace is rewritten using ${\rm tr}\,A' = {\rm tr}\,A$ together with the symmetry of $(XX')^\#$ (which follows from the symmetry of $XX'$).
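This step can be checked numerically, computing the adjugate of the invertible matrix $XX'$ as $(XX')^\# = |XX'|\,(XX')^{-1}$; sizes and seed below are illustrative:

```python
import numpy as np

# Numerical check of d|XX'| = 2 tr(X'(XX')^# dX), where the adjugate
# of an invertible A is computed as adj(A) = det(A) * inv(A).
rng = np.random.default_rng(4)
m, n = 3, 5
X = rng.standard_normal((m, n))
dX = 1e-7 * rng.standard_normal((m, n))

A = X @ X.T
adj = np.linalg.det(A) * np.linalg.inv(A)   # adjugate of XX'

lhs = np.linalg.det((X + dX) @ (X + dX).T) - np.linalg.det(A)  # actual change
rhs = 2 * np.trace(X.T @ adj @ dX)                             # claimed differential

assert np.isclose(lhs, rhs, atol=1e-10)
```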

Further,

$$ \begin{align} {\rm d} \log |XX'| &= \frac{1}{|XX'|} {\rm d} |XX'| \\ & = \frac{2}{|XX'|} {\rm tr} \bigg( X'(XX')^\# {\rm d}X\bigg) \\ & = 2\ {\rm tr} \bigg( X'\frac{(XX')^\#}{|XX'|} {\rm d}X\bigg) \\ & = 2\ {\rm tr} \bigg( X'(XX')^{-1} {\rm d}X\bigg) \\ \end{align} $$
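The final identity, including the last step $(XX')^\#/|XX'| = (XX')^{-1}$, can also be confirmed numerically (sizes and seed are arbitrary):

```python
import numpy as np

# Numerical check of d log|XX'| = 2 tr(X'(XX')^{-1} dX)
# for a wide full-row-rank X (m <= n).
rng = np.random.default_rng(2)
m, n = 3, 6
X = rng.standard_normal((m, n))
dX = 1e-6 * rng.standard_normal((m, n))

A = X @ X.T
lhs = (np.linalg.slogdet((X + dX) @ (X + dX).T)[1]
       - np.linalg.slogdet(A)[1])                    # actual change
rhs = 2 * np.trace(X.T @ np.linalg.inv(A) @ dX)      # claimed differential

assert np.isclose(lhs, rhs, atol=1e-9)
```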

ANSWER

It is often useful to take the derivative of a scalar-valued function or a vector-valued function with respect to a vector. I have not come across the situation where I need to take the derivative of a matrix. Hopefully, this explanation is helpful to you.

The first derivative of a scalar-valued function $f(\mathbf{x})$ with respect to a vector $\mathbf{x} = [x_1 \;x_2]^T$ is called the gradient of $f(\mathbf{x})$. We can write this as

$$\nabla f (\mathbf{x}) = \frac{d}{d\mathbf{x}} f (\mathbf{x}) = \begin{bmatrix} \frac{\partial f}{\partial x_1} \\ \frac{\partial f}{\partial x_2} \end{bmatrix}$$

Therefore, we have

$$\frac{\partial}{\partial\mathbf{x}} \mathbf{x}^T \mathbf{x} = \frac{\partial}{\partial\mathbf{x}} (x_1^2 + x_2^2) = 2 \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 2 \mathbf{x}$$
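A quick numerical confirmation of $\nabla(\mathbf{x}^T\mathbf{x}) = 2\mathbf{x}$ via central differences (the point $\mathbf{x}$ is an arbitrary choice):

```python
import numpy as np

# Check the gradient of f(x) = x'x against central differences.
# Central differences are exact for quadratics up to round-off.
x = np.array([1.5, -0.5])
f = lambda v: v @ v

eps = 1e-6
grad_fd = np.array([
    (f(x + eps * e) - f(x - eps * e)) / (2 * eps)
    for e in np.eye(len(x))
])

assert np.allclose(grad_fd, 2 * x, atol=1e-8)
```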

If we are taking the first derivative of a vector-valued function with respect to a vector, it is called the Jacobian. It is given by,

$$J (\mathbf{x}) = \frac{d}{d\mathbf{x}} f (\mathbf{x}) = \begin{bmatrix} \frac{\partial f_1}{\partial x_1} & \frac{\partial f_1}{\partial x_2}\\ \frac{\partial f_2}{\partial x_1} & \frac{\partial f_2}{\partial x_2} \end{bmatrix}$$
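A finite-difference check of the Jacobian for an illustrative $f:\mathbb{R}^2\to\mathbb{R}^2$ (the particular $f$ below is my own example, not from the question):

```python
import numpy as np

# Jacobian of a vector-valued f, checked column by column
# against central differences.
def f(v):
    x1, x2 = v
    return np.array([x1 * x2, x1**2 + x2])

def jacobian_fd(func, x, eps=1e-6):
    # Column j is the central difference along basis vector e_j.
    return np.column_stack([
        (func(x + eps * e) - func(x - eps * e)) / (2 * eps)
        for e in np.eye(len(x))
    ])

x = np.array([2.0, 3.0])
# Analytic Jacobian [[x2, x1], [2*x1, 1]] evaluated at x = (2, 3)
J_exact = np.array([[3.0, 2.0],
                    [4.0, 1.0]])

assert np.allclose(jacobian_fd(f, x), J_exact, atol=1e-6)
```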

Edit: I just realized you said $\mathbf{x}$ is a matrix. I have edited my answer anyway for clarification.

ANSWER

The general rule for differentials is very simple: $$ {\rm d}(A\star B) = {\rm d}A\star B + A\star {\rm d}B $$ where $\star$ can be the Hadamard, Kronecker, dyadic, Frobenius, or ordinary matrix product, and the matrices $(A,B)$ are such that their dimensions are compatible with the specified product.
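For instance, the rule for the Kronecker product, ${\rm d}(A\otimes B) = {\rm d}A\otimes B + A\otimes {\rm d}B$, can be checked numerically (sizes and seed are arbitrary):

```python
import numpy as np

# Numerical check of the product rule for the Kronecker product:
# the residual is the second-order term kron(dA, dB).
rng = np.random.default_rng(3)
A = rng.standard_normal((2, 2))
B = rng.standard_normal((2, 2))
dA = 1e-6 * rng.standard_normal((2, 2))
dB = 1e-6 * rng.standard_normal((2, 2))

lhs = np.kron(A + dA, B + dB) - np.kron(A, B)   # actual change
rhs = np.kron(dA, B) + np.kron(A, dB)           # first-order differential

assert np.allclose(lhs, rhs, atol=1e-10)
```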


In your particular case, the rule tells us that $$ {\rm d}\,(X^TX) = ({\rm d}X^T)X + X^T{\rm d}X $$