Find $\frac{dY}{dX}, Y=(X')^{2}B$ matrix derivative

I have the following problem:

Find the matrix derivative $\frac{dY}{dX}$, where $Y=(X')^2B$, matrix $X$ is $p \times q$ and $B$ is a given matrix.


I have gotten this far:

By matrix derivative definition we can write:

$$ \frac{dY}{dX} =\frac{d}{dvec'X} \otimes vec(Y)= \frac{d}{dvec'X} \otimes vec((X')^2B) $$

Using the vec property (v)

$$ vec(ABC) = (C'\otimes A)vec(B) $$
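Property (v) is easy to sanity-check numerically. The sketch below assumes NumPy and the column-stacking convention for vec; the helper `vec` and the matrix shapes are illustrative choices, not part of the original problem.

```python
import numpy as np

def vec(M):
    # Column-stacking vectorization: stack the columns of M into one vector.
    return M.reshape(-1, order="F")

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 5))
C = rng.standard_normal((5, 2))

# Left side: vec(ABC); right side: (C' kron A) vec(B).
lhs = vec(A @ B @ C)
rhs = np.kron(C.T, A) @ vec(B)

identity_holds = np.allclose(lhs, rhs)
print(identity_holds)
```

Note that the identity depends on vec stacking columns; with NumPy's default row-major `reshape` the Kronecker factors would have to be swapped.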

We can write the element $vec((X')^2B) = vec(X'X'B)\underbrace{=}_{(v)} (B'\otimes X')vec(X')$

so we get

$$ \frac{dY}{dX}=\frac{d}{dvec'X} \otimes \Big[(B'\otimes X')vec(X')\Big] $$

Using the Kronecker product property

$$ (A\otimes C)\cdot (B\otimes D) = (AB)\otimes(CD) $$

we can write

$$ 1\cdot \frac{d}{dvec'X} \otimes \Big[(B'\otimes X')vec(X')\Big] = \Big[1\otimes (B'\otimes X')\Big]\Big[\frac{d}{dvec'X} \otimes vec(X')\Big] = \Big[B'\otimes X'\Big]\Big[\frac{d}{dvec'X} \otimes vec(X')\Big] $$

So I feel like I am almost there, but I don't understand the concept of matrix differentiation, the notation, or the difference between $vec'X$ and $vec(X')$. I feel like we can somehow cancel out the last product...

Any tips appreciated!

There are 1 best solutions below

On BEST ANSWER

Don't apply vectorization too early in the process.

The first step is to calculate the differential of your function.

$$\eqalign{ Y &= X^TX^TB \\ dY &= \color{red}{dX^T}X^TB + X^T\color{red}{dX^T}B \\ }$$

The second step is vectorization.

$$\eqalign{ \operatorname{vec}(dY) &= \left((X^TB)^T\otimes I\right)\operatorname{vec}(dX^T) + \left(B^T\otimes X^T\right)\operatorname{vec}(dX^T) \\ &= \left(B^TX\otimes I + B^T\otimes X^T\right)K\operatorname{vec}(dX) \\ }$$

where $K$ is the commutation matrix associated with Kronecker products, i.e.

$$\eqalign{ \operatorname{vec}(A^T) &= K\operatorname{vec}(A) \\ }$$

Now it's a simple matter to identify the gradient as

$$\eqalign{ \frac{\partial\operatorname{vec}(Y)}{\partial\operatorname{vec}(X)} &= \left(B^TX\otimes I + B^T\otimes X^T\right)K \\ }$$
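As a hedged sanity check, the gradient formula can be compared against central finite differences. The sketch below assumes NumPy, column-stacking vec, and a square $X$ (needed for $X^TX^T$ to be defined); the helpers `vec`, `commutation_matrix`, and `Y_of`, and the sizes `n` and `r`, are hypothetical names chosen for illustration.

```python
import numpy as np

def vec(M):
    # Column-stacking vectorization.
    return M.reshape(-1, order="F")

def commutation_matrix(m, n):
    # Build K such that K @ vec(A) == vec(A.T) for any A of shape (m, n).
    K = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            # A[i, j] sits at index i + j*m in vec(A)
            # and at index j + i*n in vec(A.T).
            K[j + i * n, i + j * m] = 1.0
    return K

n, r = 3, 2                       # X is n x n (assumed square), B is n x r
rng = np.random.default_rng(1)
X = rng.standard_normal((n, n))
B = rng.standard_normal((n, r))

def Y_of(X):
    return X.T @ X.T @ B

# Analytic Jacobian d vec(Y) / d vec(X) from the formula above.
K = commutation_matrix(n, n)
J = (np.kron(B.T @ X, np.eye(n)) + np.kron(B.T, X.T)) @ K

# Numerical Jacobian by central differences, one entry of vec(X) at a time.
h = 1e-6
J_num = np.zeros_like(J)
for k in range(n * n):
    e = np.zeros(n * n)
    e[k] = 1.0
    E = e.reshape((n, n), order="F")
    J_num[:, k] = (vec(Y_of(X + h * E)) - vec(Y_of(X - h * E))) / (2 * h)

jacobians_match = np.allclose(J, J_num, atol=1e-6)
print(jacobians_match)
```

Because $Y$ is quadratic in $X$, central differences are exact up to rounding error, so agreement here is a reasonably strong check of the formula under the stated conventions.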