Derivation of $\frac{\partial{J}}{\partial{K}}$ with $J=\text{Tr}((I-KC)P(I-KC)^T)$


This is an equation that occurs in the derivation of the Kalman filter:

$$J = \text{Tr}((I-KC)P(I-KC)^T),$$

where $\text{Tr}(\cdot)$ is the trace of the argument, and $K \in \mathbb{R}^{n \times q}, \ C \in \mathbb{R}^{q \times n}, \ P \in \mathbb{R}^{n \times n} \text{ symmetric}, \ I \in \mathbb{R}^{n \times n}$ is the identity matrix.

Does someone know how to derive $\frac{\partial{J}}{\partial{K}}$ in closed form?

Update: Here's my attempt to solve the problem. From the matrix cookbook we know (equation 111):

$$\frac{\partial \text{Tr}(XBX^T)}{\partial X} = 2XB$$

if $B$ is symmetric, that is, $B=B^T$. Applying the chain rule (I suspect I'm not using it correctly) gives

$$\frac{\partial{J}}{\partial{K}} = \frac{\partial (I-KC)}{\partial K} \frac{\partial \text{Tr}((I-KC)P(I-KC)^T)}{\partial (I-KC)} = -2C(I-KC)P$$

But that's not the right answer. The correct result is slightly different, namely

$$\frac{\partial{J}}{\partial{K}} = -2(I-KC)PC^T$$


There are 2 best solutions below

On BEST ANSWER

Here's my solution:

$$\begin{aligned} J &= \text{Tr}((I-KC)P(I-KC)^T) = \text{Tr}(P-PC^TK^T-KCP+KCPC^TK^T) \\ &= \text{Tr}(P) - \text{Tr}(PC^TK^T) - \text{Tr}(KCP) + \text{Tr}(KCPC^TK^T) \end{aligned}$$

From the Matrix Cookbook (equation (111)) we know that

$\frac{\partial}{\partial X} \text{Tr}(XBX^T) = XB^T+XB \overset{\text{if B sym.}}{=} 2XB$.

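Using this for the quadratic term, together with $\frac{\partial}{\partial K}\text{Tr}(AK^T) = A$ and $\frac{\partial}{\partial K}\text{Tr}(KA) = A^T$ for the two linear terms, we get: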

$$\begin{aligned} \frac{\partial J}{\partial K} &= \frac{\partial \text{Tr}(P)}{\partial K} - \frac{\partial \text{Tr}(PC^TK^T)}{\partial K} - \frac{\partial \text{Tr}(KCP)}{\partial K} + \frac{\partial \text{Tr}(KCPC^TK^T)}{\partial K} \\ &= 0-PC^T-P^TC^T+K(CPC^T)^T+KCPC^T \\ &\overset{P=P^T}{=} -2(I-KC)PC^T \end{aligned}$$

I validated the result via numerical differentiation.
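A numerical check of this kind can be sketched as follows. This is not the original validation script, only a minimal illustration that assumes NumPy, small random test matrices, and central finite differences; the helper names `cost`, `analytic_grad`, and `numeric_grad` are made up for this example.

```python
import numpy as np

def cost(K, C, P):
    """J = Tr((I - K C) P (I - K C)^T)."""
    A = np.eye(P.shape[0]) - K @ C
    return np.trace(A @ P @ A.T)

def analytic_grad(K, C, P):
    """Closed-form gradient -2 (I - K C) P C^T (assumes P is symmetric)."""
    A = np.eye(P.shape[0]) - K @ C
    return -2.0 * A @ P @ C.T

def numeric_grad(K, C, P, h=1e-6):
    """Central finite differences, perturbing one entry of K at a time."""
    G = np.zeros_like(K)
    for i in range(K.shape[0]):
        for j in range(K.shape[1]):
            E = np.zeros_like(K)
            E[i, j] = h
            G[i, j] = (cost(K + E, C, P) - cost(K - E, C, P)) / (2 * h)
    return G

# Arbitrary test data (assumption: any sizes n, q and any symmetric P work).
rng = np.random.default_rng(0)
n, q = 4, 2
K = rng.standard_normal((n, q))
C = rng.standard_normal((q, n))
M = rng.standard_normal((n, n))
P = M + M.T  # symmetrize so that P = P^T

# Largest entrywise disagreement between the two gradients.
max_err = np.max(np.abs(analytic_grad(K, C, P) - numeric_grad(K, C, P)))
print("max abs difference:", max_err)
```

The two gradients should agree up to floating-point rounding. Note also that the gradient has the same shape as $K$ ($n \times q$), whereas the expression $-2C(I-KC)P$ from the question is $q \times n$, so it cannot be the gradient with respect to $K$.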

On

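To use the cookbook result, first define the matrices

$$\eqalign{ B &= P \cr X &= KC-I \cr }$$

so that $J = \text{Tr}(XBX^T)$; the sign of $X$ does not matter because $X$ appears twice. Then put the result in differential form and change the independent variable from $X$ to $K$, using $dX = dK\,C$:

$$\eqalign{ dJ &= 2XB:dX \cr &= 2XB:(dK\,C) \cr &= 2XBC^T:dK \cr }$$

Finally put the result back into gradient form

$$\eqalign{ \frac{\partial J}{\partial K} &= 2XBC^T \cr &= 2(KC-I)PC^T \cr }$$

which is the same as $-2(I-KC)PC^T$. In the above, a colon was used to represent the trace/Frobenius product, i.e.

$$A:B = {\rm tr}(A^TB)$$

and the last step of the differential used the rearrangement $A:(BC) = (AC^T):B$, which follows from the cyclic property of the trace.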
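The step that moves $C$ across the Frobenius product can also be checked numerically. The sketch below is illustrative only and assumes NumPy and random test matrices; `frob` and `J` are hypothetical helper names. It compares $2XB:(dK\,C)$ with $2XBC^T:dK$, and then compares the latter with a finite-difference estimate of the directional derivative of $J$ along $dK$.

```python
import numpy as np

def frob(A, B):
    """Frobenius product A:B = tr(A^T B)."""
    return np.trace(A.T @ B)

def J(K, C, P):
    """J = Tr(X P X^T) with X = K C - I."""
    X = K @ C - np.eye(P.shape[0])
    return np.trace(X @ P @ X.T)

# Arbitrary test data (assumption: sizes and values are chosen for illustration).
rng = np.random.default_rng(1)
n, q = 3, 2
K = rng.standard_normal((n, q))
C = rng.standard_normal((q, n))
M = rng.standard_normal((n, n))
P = M @ M.T                       # symmetric by construction
dK = rng.standard_normal((n, q))  # an arbitrary direction in K-space

X = K @ C - np.eye(n)

# 2XB:(dK C) versus 2XBC^T:dK, with B = P
lhs = frob(2 * X @ P, dK @ C)
rhs = frob(2 * X @ P @ C.T, dK)
identity_gap = abs(lhs - rhs)

# Directional derivative of J along dK by central differences
t = 1e-6
dJ_numeric = (J(K + t * dK, C, P) - J(K - t * dK, C, P)) / (2 * t)
diff_gap = abs(dJ_numeric - rhs)

print("identity gap:", identity_gap)
print("differential gap:", diff_gap)
```

Both gaps should be at the level of rounding error, which is consistent with reading off $2XBC^T$ as the gradient with respect to $K$.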