Help me understand this matrix derivative (for the LS estimation proof)

I'm trying to understand this proof of LS estimation, but I've never studied matrix calculus.

I've managed to find a couple of identities on the web and I see how to get the first part of the derivative. But how do I get the second part? Is there a recommended source with a simple explanation of matrix calculus?

Also, what is the meaning of the asterisk? Is it a leftover complex conjugate?

LS Proof

There is 1 answer below

The asterisk denotes the entrywise complex conjugate, which is related to the Hermitian conjugate and the transpose by $$X^H = (X^*)^T$$
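As a quick numerical illustration of this identity, here is a minimal NumPy sketch. The matrices `X` and `A` are arbitrary made-up values, not taken from the question. It also checks the equivalent form $X^TA^* = (X^HA)^*$, which is handy when moving a conjugate outside a product.

```python
import numpy as np

# Arbitrary complex matrices chosen purely for illustration.
X = np.array([[1 + 2j, 3 - 1j],
              [0 + 1j, 2 + 0j]])
A = np.array([[2 - 1j, 0 + 4j],
              [1 + 1j, 5 - 2j]])

X_star = X.conj()      # entrywise complex conjugate, no transpose
X_H = X.conj().T       # Hermitian (conjugate) transpose

# X^H = (X^*)^T
print(np.array_equal(X_H, X_star.T))

# Equivalent form: X^T A^* = (X^H A)^*
print(np.allclose(X.T @ A.conj(), (X_H @ A).conj()))
```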

Now consider the squared Frobenius norm of the matrix $M$, expressed in terms of the Frobenius (:) inner product, and find its differential $$\eqalign{ J &= \|M\|_F^2 = M^*:M \cr dJ &= M^*:dM \cr }$$ Note that for purposes of differentiation, $M^*$ can be considered to be independent of $M$, which is why only the $dM$ term appears in $dJ$.
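For readers new to this notation, the conventions assumed here can be sketched as follows. This uses the standard definition of the Frobenius product, and $A$, $B$, $C$ are placeholder matrices of compatible sizes that do not appear in the question.

$$\eqalign{ A:B &= \sum_{i,j} A_{ij}B_{ij} = \operatorname{Tr}(A^TB) \cr A:BC &= B^TA:C \cr M^*:M &= \sum_{i,j} |M_{ij}|^2 = \|M\|_F^2 \cr }$$

The second rule follows from $\operatorname{Tr}(A^TBC) = \operatorname{Tr}\big((B^TA)^TC\big)$, and it is the rule that allows a left factor to be moved from one side of the colon to the other. The full differential of $J$ is $M^*:dM + M:dM^*$; treating $M^*$ as an independent variable (the Wirtinger convention) keeps only the first term.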

Now substitute $M = XH-Y$, so that $dM = X\,dH$ (with $X$ and $Y$ held constant) $$\eqalign{ dJ &= (XH-Y)^*:X\,dH \cr &= X^T(XH-Y)^*:dH \cr &= (X^HXH-X^HY)^*:dH \cr }$$ Since $dJ=\Big(\frac{\partial J}{\partial H}:dH\Big),\,$ the gradient must be $$\eqalign{ \frac{\partial J}{\partial H} &= (X^HXH-X^HY)^* \cr }$$ Setting the gradient to zero and taking the complex conjugate leads to a system of linear equations, which can be solved for $H$ (assuming $X^HX$ is invertible)
$$\eqalign{ X^HXH &= X^HY \cr H &= (X^HX)^{-1}X^HY \cr }$$
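The result can be sanity-checked numerically. The following is a minimal NumPy sketch, not part of the original proof. The helper names `ls_estimate` and `cost`, the matrix sizes, and the random data are all assumptions made for illustration. It solves the normal equations, confirms that the gradient expression vanishes at the solution, and compares against NumPy's built-in least-squares solver.

```python
import numpy as np

def ls_estimate(X, Y):
    """Solve the normal equations X^H X H = X^H Y for H (hypothetical helper)."""
    X_H = X.conj().T
    # Solving the linear system avoids forming the explicit inverse.
    return np.linalg.solve(X_H @ X, X_H @ Y)

def cost(X, H, Y):
    """J = ||XH - Y||_F^2 (hypothetical helper)."""
    return np.linalg.norm(X @ H - Y, "fro") ** 2

# Random complex test data; the sizes are arbitrary, with more rows than
# columns so that X^H X is invertible for generic X.
rng = np.random.default_rng(0)
X = rng.standard_normal((8, 3)) + 1j * rng.standard_normal((8, 3))
Y = rng.standard_normal((8, 2)) + 1j * rng.standard_normal((8, 2))

H = ls_estimate(X, Y)

# Gradient from the derivation: (X^H X H - X^H Y)^*
gradient = (X.conj().T @ X @ H - X.conj().T @ Y).conj()
print("gradient vanishes:", np.allclose(gradient, 0))

# Cross-check against NumPy's least-squares solver.
H_ref = np.linalg.lstsq(X, Y, rcond=None)[0]
print("matches lstsq:", np.allclose(H, H_ref))

# Any small perturbation of H should not lower the cost.
D = rng.standard_normal(H.shape) + 1j * rng.standard_normal(H.shape)
print("perturbed cost is larger:", cost(X, H + 1e-3 * D, Y) > cost(X, H, Y))
```

In practice a dedicated solver such as `np.linalg.lstsq` is usually preferred over forming $X^HX$, since the normal equations square the condition number of $X$.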