I have a question about differentiating the trace of a matrix product with respect to a matrix:
$\frac{d T[KHP]}{dK}$,
where $T$ denotes the trace, $K$, $H$ and $P$ are matrices, and $P^{\top} = P$.
In the Kalman filter derivation, the result for this is
$(HP)^{\top}$
(refer to equations 11.24 and 11.25 here).
I do not quite understand this: why is the result not $HP$? Thank you.
I've heard some people scorn it, but The Matrix Cookbook is full of useful information. For example, in its section on derivatives of traces, one can see that
$$ \frac{\partial}{\partial {\bf X}}\text{Tr}({\bf XA}) = {\bf A}^\text{T} $$
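If you would like to convince yourself numerically, here is a minimal sketch using NumPy and central finite differences. The matrix sizes, the random seed, and the names `f`, `eps`, and `grad` are my own choices for illustration and do not come from the Cookbook. It assumes the derivative is stored with the same shape as ${\bf X}$, i.e. entry $(i,j)$ holds the derivative with respect to $X_{ij}$.

```python
import numpy as np

# Arbitrary illustrative sizes: X is n-by-m, A is m-by-n, so XA is square.
rng = np.random.default_rng(0)
n, m = 3, 4
X = rng.standard_normal((n, m))
A = rng.standard_normal((m, n))


def f(X):
    """Scalar function whose gradient we want: Tr(XA)."""
    return np.trace(X @ A)


# Central finite differences, one entry of X at a time.
eps = 1e-6
grad = np.zeros_like(X)
for i in range(n):
    for j in range(m):
        E = np.zeros_like(X)
        E[i, j] = eps
        grad[i, j] = (f(X + E) - f(X - E)) / (2 * eps)

# Compare against the claimed closed form A^T (same shape as X).
max_err = np.max(np.abs(grad - A.T))
print("max abs error vs A^T:", max_err)
```

Note that `A` itself has shape $m \times n$ here, so it could not even be the gradient with respect to an $n \times m$ matrix ${\bf X}$ under this convention; only `A.T` has the right shape.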
Note that the trace is a linear operator, so it commutes with the derivative. To prove the above relation, you could differentiate ${\bf XA}$ and then take the trace to arrive at the same result.
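To see where the transpose comes from, it may help to write the trace out in components. This is a sketch under the convention that entry $(i,j)$ of the derivative is the derivative with respect to $X_{ij}$, so the result has the same shape as ${\bf X}$:

$$ \text{Tr}({\bf XA}) = \sum_{i}\sum_{j} X_{ij} A_{ji} \quad\Longrightarrow\quad \frac{\partial\,\text{Tr}({\bf XA})}{\partial X_{ij}} = A_{ji} = \left({\bf A}^\text{T}\right)_{ij} $$

Applied to the question with ${\bf X} = K$ and ${\bf A} = HP$, this gives

$$ \frac{d\,T[KHP]}{dK} = (HP)^{\top} = P^{\top} H^{\top} = P H^{\top}, $$

where the last step uses $P^{\top} = P$. The answer would be $HP$ only under the opposite layout convention, in which the derivative with respect to $K_{ij}$ is stored in position $(j,i)$.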