I'm looking to fuse the two sets of covariance matrices using the covariance union method outlined in the appendices of this paper.
However, when I reproduce the worked example (page 4) using the method outlined in the appendix, I get very different results. I was hoping someone could help me with the following queries:
- Could you provide a high-level intuition for the approach, so I can diagnose where my implementation goes wrong?
- How is the mean (u) calculated? The paper mentions minimising some measure of U, but I was hoping for more intuition behind this.
- What is the benefit of this approach over other methods of covariance fusion, e.g. covariance intersection?
import numpy as np

# Inputs from the paper's example (page 4): the two means/covariances,
# the candidate fused mean u, and the expected fused covariance U.
m1 = np.array([[0], [0]])
M1 = np.array([[10, -10], [-10, 20]])
m2 = np.array([[0], [0]])
M2 = np.array([[20, 10], [10, 80]])
u = np.array([[0], [0]])
U = np.array([[20.97, -1.08], [-1.08, 207.25]])
def cov_union(u, m1, M1, m2, M2):
    # Inflate each input covariance about the common mean u.
    U1 = M1 + (u - m1) @ (u - m1).T
    U2 = M2 + (u - m2) @ (u - m2).T
    # Whiten by the lower Cholesky factor L of U2 (U2 = L @ L.T),
    # so that U2 maps to the identity.
    L = np.linalg.cholesky(U2)
    inv_L = np.linalg.inv(L)
    B = inv_L @ U1 @ inv_L.T   # U1 in whitened coordinates (symmetric)
    D, V = np.linalg.eigh(B)   # eigh, not eig: B is symmetric
    # Clip the eigenvalues from below at 1 so the result dominates
    # both the identity (i.e. U2) and B (i.e. U1).
    D_clip = np.diag(np.maximum(D, 1.0))
    # Map back to the original coordinates.
    return L @ V @ D_clip @ V.T @ L.T
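For reference, here is how I sanity-check a candidate result: covariance union requires the fused matrix to dominate both (inflated) inputs, i.e. U - U1 and U - U2 must be positive semidefinite. Since u = m1 = m2 = 0 in this example, U1 and U2 reduce to M1 and M2, and the paper's target U passes this check:

```python
import numpy as np

M1 = np.array([[10.0, -10.0], [-10.0, 20.0]])
M2 = np.array([[20.0, 10.0], [10.0, 80.0]])
U = np.array([[20.97, -1.08], [-1.08, 207.25]])  # target from the paper

# Consistency check: the smallest eigenvalue of U - Mi must be >= 0
# (up to rounding, since U is only quoted to two decimals).
for M in (M1, M2):
    lam_min = np.linalg.eigvalsh(U - M).min()
    print(lam_min >= -1e-9)  # True for both inputs
```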
The fusion of two normal distributions can be obtained by multiplying their probability density functions (PDFs) together and re-normalizing the result. The PDF of a multivariate normal distribution can be expressed as
$$ P_i(x) = \det(2\pi\,\Sigma_i)^{-\frac{1}{2}}\,\exp\!\left(-{\frac{1}{2}}(x - \mu_i)^\top \Sigma_i^{-1}(x - \mu_i)\right), \tag{1} $$
with $\mu$ the mean of the distribution and $\Sigma$ the covariance matrix.
Multiplying two such PDFs together and ignoring the gains $\det(2\pi\,\Sigma_i)^{-\frac{1}{2}}$ (since those don't matter anyway due to the re-normalization step) yields
\begin{align} P_1(x)\,P_2(x) &\propto \exp\!\left(-{\frac{1}{2}}(x - \mu_1)^\top \Sigma_1^{-1}(x - \mu_1)\right) \exp\!\left(-{\frac{1}{2}}(x - \mu_2)^\top \Sigma_2^{-1}(x - \mu_2)\right), \\ &\propto \exp\!\left(-{\frac{1}{2}}\left[(x - \mu_1)^\top \Sigma_1^{-1}(x - \mu_1) + (x - \mu_2)^\top \Sigma_2^{-1}(x - \mu_2)\right]\right). \end{align}
The expression inside the square brackets in the exponential can be simplified to some constant term independent of $x$ plus a quadratic expression of the form
$$ (x - \mu_f)^\top \Sigma_f^{-1}(x - \mu_f). \tag{2} $$
The constant term can be ignored as well, since it just becomes another gain when taken out of the exponential and thus disappears in the re-normalization step. It can be shown that the resulting fused mean $\mu_f$ and covariance matrix $\Sigma_f$ can be written as
\begin{align} \Sigma_f &= (\Sigma_1^{-1} + \Sigma_2^{-1})^{-1}, \tag{3} \\ \mu_f &= \Sigma_f\,(\Sigma_1^{-1} \mu_1 + \Sigma_2^{-1} \mu_2) = (\Sigma_1^{-1} + \Sigma_2^{-1})^{-1} (\Sigma_1^{-1} \mu_1 + \Sigma_2^{-1} \mu_2). \tag{4} \end{align}
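As a concrete sketch, equations (3) and (4) translate directly to NumPy. Here I reuse the two covariances from the question as $\Sigma_1, \Sigma_2$ purely for illustration (variable names are my own):

```python
import numpy as np

mu1 = np.zeros((2, 1))
mu2 = np.zeros((2, 1))
S1 = np.array([[10.0, -10.0], [-10.0, 20.0]])  # Sigma_1
S2 = np.array([[20.0, 10.0], [10.0, 80.0]])    # Sigma_2

# Eq. (3): fused covariance from the sum of the information matrices.
Sf = np.linalg.inv(np.linalg.inv(S1) + np.linalg.inv(S2))
# Eq. (4): information-weighted combination of the means.
muf = Sf @ (np.linalg.inv(S1) @ mu1 + np.linalg.inv(S2) @ mu2)
```

Note that $\Sigma_f \preceq \Sigma_1$ and $\Sigma_f \preceq \Sigma_2$ here: multiplicative fusion adds information and shrinks the covariance, which is the opposite behaviour to covariance union, where the result must dominate both inputs.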
If one prefers to take fewer matrix inverses, it can be noted that $\Sigma_f$ and $\mu_f$ can also be calculated using
\begin{align} \Sigma_f =&\,\Sigma_1\,(\Sigma_1 + \Sigma_2)^{-1} \Sigma_2, \tag{5a} \\ =&\,\Sigma_2\,(\Sigma_1 + \Sigma_2)^{-1} \Sigma_1, \tag{5b} \\ \mu_f =&\,\Sigma_2\,(\Sigma_1 + \Sigma_2)^{-1} \mu_1 + \Sigma_1\,(\Sigma_1 + \Sigma_2)^{-1} \mu_2. \tag{6} \end{align}
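A quick numerical check (again borrowing the question's two covariances, with some arbitrary nonzero means for illustration) confirms that (5a), (5b) and (6) agree with (3) and (4):

```python
import numpy as np

S1 = np.array([[10.0, -10.0], [-10.0, 20.0]])
S2 = np.array([[20.0, 10.0], [10.0, 80.0]])
mu1 = np.array([[1.0], [2.0]])    # arbitrary illustrative means
mu2 = np.array([[-1.0], [0.5]])

inv = np.linalg.inv
Sf_3 = inv(inv(S1) + inv(S2))                     # eq. (3)
Sf_5a = S1 @ inv(S1 + S2) @ S2                    # eq. (5a)
Sf_5b = S2 @ inv(S1 + S2) @ S1                    # eq. (5b)
muf_4 = Sf_3 @ (inv(S1) @ mu1 + inv(S2) @ mu2)    # eq. (4)
muf_6 = S2 @ inv(S1 + S2) @ mu1 + S1 @ inv(S1 + S2) @ mu2  # eq. (6)

print(np.allclose(Sf_3, Sf_5a) and np.allclose(Sf_3, Sf_5b))  # True
print(np.allclose(muf_4, muf_6))                              # True
```

The single-inverse forms follow from the identity $(\Sigma_1^{-1} + \Sigma_2^{-1})^{-1} = \Sigma_1\,(\Sigma_1 + \Sigma_2)^{-1}\Sigma_2$ and are also better conditioned when one of the covariances is nearly singular.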