Matrix derivative


What is the derivative of $(Y-BX)(Y-BX)^{T}$ with respect to $X$, where $X$, $Y$, and $B$ are all matrices?

By my calculations, the answer should be something like $(XA)^{T}(Y-BX)$, but I don't know what the matrix $A$ is. Can you please tell me the general rule, or point me to a table I can look at?

Best answer

One general rule, if you don't know how to differentiate a matrix-valued function of a matrix argument, is to start with a directional derivative, which reduces the problem to a one-dimensional one. In your example, if we abbreviate your expression as $F(X)$, this means looking at the following (I guess my $V$ is your $A$):

$$ DF(X)(V) = \left.\frac{d}{dt}\right|_{t=0} \bigl(Y - B(X+tV)\bigr)\bigl(Y - B(X+tV)\bigr)^T $$

Here you can simply apply the product rule and linearity to arrive at

$$(-BV)(Y-BX)^T + (Y-BX)(-BV)^T.$$
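To spell out that step, one way to see it is to expand in powers of $t$. The symbol $E = Y - BX$ is shorthand introduced here for readability; it is not notation from the question. Since $Y - B(X+tV) = E - tBV$,

$$(E - tBV)(E - tBV)^T = EE^T - t\,\bigl(BV\,E^T + E\,V^TB^T\bigr) + t^2\,BV\,V^TB^T,$$

and differentiating at $t=0$ leaves only the coefficient of the linear term:

$$DF(X)(V) = -\bigl(BV\,E^T + E\,V^TB^T\bigr).$$

The two summands are transposes of each other, so the result is a symmetric matrix, as one would expect because $F(X)$ itself is symmetric.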

I did not check whether this can be simplified further; that may depend on the choice of matrices and their dimensions.

Of course, if $X$ is an $m\times n$ matrix, then $V$ also has to be chosen as an (arbitrary) $m\times n$ matrix.
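If you want to convince yourself numerically, here is a minimal sketch in Python with NumPy that compares the formula against a central finite difference in a random direction $V$. The function names `F` and `DF`, the sizes, and the extra dimension `p` (rows of $B$ and $Y$) are assumptions made for illustration only; they are not part of the question.

```python
import numpy as np

def F(X, Y, B):
    """F(X) = (Y - BX)(Y - BX)^T."""
    R = Y - B @ X
    return R @ R.T

def DF(X, V, Y, B):
    """Directional derivative of F at X in direction V:
    (-BV)(Y - BX)^T + (Y - BX)(-BV)^T."""
    R = Y - B @ X
    BV = B @ V
    return -(BV @ R.T) - R @ BV.T

rng = np.random.default_rng(0)
p, m, n = 4, 3, 5                  # assumed sizes: B is p x m, X and V are m x n, Y is p x n
B = rng.standard_normal((p, m))
X = rng.standard_normal((m, n))
Y = rng.standard_normal((p, n))
V = rng.standard_normal((m, n))    # arbitrary direction, same shape as X

t = 1e-6
# Central difference approximation of d/dt F(X + tV) at t = 0
numeric = (F(X + t * V, Y, B) - F(X - t * V, Y, B)) / (2 * t)
analytic = DF(X, V, Y, B)

max_err = np.max(np.abs(numeric - analytic))
print("max abs difference:", max_err)
```

Because $F$ is quadratic in $X$, the central difference cancels the $t^2$ term, so any remaining discrepancy should be floating-point rounding only.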