How to take the derivative of matrices

I was browsing the derivation of the least squares estimates and stumbled upon this problem. It said that:

$$E = (Y + XB)^2$$

$$\frac{dE}{dB} = -X^TY + X^TXB$$

It is my understanding that the expansion of $E$ would be $Y^2 + X^2B^2 - 2XYB$

and therefore $\frac{dE}{dB}$ would be $-2X^TY + X^TXB$

Am I approaching this wrong?

There is 1 best solution below

Your expansion of $E$ is incorrect: it neglects the fact that matrix multiplication does not commute. The actual expression is

$$E = (Y+XB)(Y+XB) = Y^2 + YXB + XBY + XBXB.$$

You have to keep the factors of each product in their original left-to-right order.
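
As a minimal illustration with two hypothetical $2 \times 2$ matrices $P$ and $Q$ (not taken from the question), the two orderings of a product can differ:

$$P = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad Q = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \quad PQ = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad QP = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}.$$

So $YXB$ and $XBY$ cannot in general be merged into a single $2XYB$ term.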


In the event that $Y$ and $B$ are meant to be vectors and $X$ a matrix, this really should be written as

$$E = (Y+XB)^T(Y+XB).$$

So then

$$E = (Y^T + B^T X^T)(Y + XB) = Y^T Y + Y^T XB + B^T X^TY + B^T X^T XB.$$
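
To finish the derivative from here, the following is a sketch under two assumptions the question does not state: $Y$ and $B$ are column vectors, and $\frac{dE}{dB}$ denotes the gradient written as a column vector. Since $Y^T XB$ is a scalar, it equals its own transpose $B^T X^T Y$, so the two cross terms combine:

$$E = Y^T Y + 2 B^T X^T Y + B^T X^T XB.$$

Using $\frac{d}{dB}(B^T a) = a$ and $\frac{d}{dB}(B^T A B) = 2AB$ for symmetric $A$ (here $a = X^T Y$ and $A = X^T X$),

$$\frac{dE}{dB} = 2X^T Y + 2X^T XB = 2X^T(Y + XB).$$

If the residual is meant to be $Y - XB$, as is usual in least squares, the same steps give $\frac{dE}{dB} = -2X^T Y + 2X^T XB$, which matches the formula quoted in the question up to an overall factor of 2. That factor disappears if $E$ is defined with a leading $\frac{1}{2}$, and it does not matter once the derivative is set to zero.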