How to fuse covariance matrices


I'm looking to fuse two covariance matrices using the covariance union method outlined in the appendices of this paper.

However, when I try to reproduce the example (page 4) using the method outlined in the appendix, I get very different results. I was hoping someone could help me with the following queries:

  • Could you provide a high-level intuition of the approach (so I can diagnose the problem), as I can't see where I'm going wrong in the implementation?
  • How would the mean (u) be calculated? The paper mentions minimising some measure of U, but I was hoping for more intuition behind this.
  • What is the benefit of this approach versus other methods of covariance fusion, e.g. covariance intersection?
import numpy as np

m1 = np.array([[0], [0]])
M1 = np.array([[10, -10], [-10, 20]])

m2 = np.array([[0], [0]])
M2 = np.array([[20, 10], [10, 80]])

# Result quoted from the paper (page 4) for comparison.
u = np.array([[0], [0]])
U = np.array([[20.97, -1.08], [-1.08, 207.25]])

def cov_union(u, m1, M1, m2, M2):
    # Inflate each covariance by the offset of its mean from the fused mean u.
    U1 = M1 + (u - m1) @ (u - m1).T
    U2 = M2 + (u - m2) @ (u - m2).T

    # NumPy's Cholesky factor is lower-triangular, so U2 = S @ S.T.
    S = np.linalg.cholesky(U2)
    inv_S = np.linalg.inv(S)

    # Whiten U1 in the coordinates where U2 becomes the identity
    # (eigh, not eig, since the whitened matrix is symmetric).
    D, V = np.linalg.eigh(inv_S @ U1 @ inv_S.T)

    # Clip the eigenvalues at 1 via a diagonal matrix so that the
    # result dominates both U1 and U2.
    U = S @ V @ np.diag(np.maximum(D, 1)) @ V.T @ S.T

    return U
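Independently of any implementation, the paper's quoted result can be sanity-checked: a valid covariance union must upper-bound both (inflated) input covariances, i.e. U − U1 and U − U2 must be positive semidefinite. A diagnostic sketch using the example values above (this check is mine, not part of the paper's algorithm):

```python
import numpy as np

M1 = np.array([[10., -10.], [-10., 20.]])
M2 = np.array([[20., 10.], [10., 80.]])
U = np.array([[20.97, -1.08], [-1.08, 207.25]])  # value quoted from the paper

# With u = m1 = m2 = 0 the inflated covariances equal M1 and M2, so a valid
# union U must satisfy U - Mi >= 0 (positive semidefinite) for both inputs.
for name, M in (("M1", M1), ("M2", M2)):
    eigvals = np.linalg.eigvalsh(U - M)
    print(name, "dominated" if eigvals.min() >= -1e-9 else "NOT dominated")
```

Both differences come out positive semidefinite, so the paper's value is at least a consistent upper bound on the two inputs.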
BEST ANSWER

The fusion of two normal distributions can be obtained by multiplying their probability density functions (PDFs) together and re-normalizing the result. The PDF of a multivariate normal distribution can be expressed as

$$ P_i(x) = \det(2\pi\,\Sigma_i)^{-\frac{1}{2}}\,\exp\!\left(-{\frac{1}{2}}(x - \mu_i)^\top \Sigma_i^{-1}(x - \mu_i)\right), \tag{1} $$

with $\mu$ the mean of the distribution and $\Sigma$ the covariance matrix.
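Formula (1) can be checked numerically against a library implementation. A quick sketch, using the question's $M_1$ as the covariance and an arbitrary evaluation point:

```python
import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([0.0, 0.0])
Sigma = np.array([[10., -10.], [-10., 20.]])  # M1 from the question
x = np.array([1.0, 2.0])

# Equation (1): det(2*pi*Sigma)^(-1/2) * exp(-1/2 (x-mu)^T Sigma^-1 (x-mu))
d = x - mu
pdf = np.linalg.det(2 * np.pi * Sigma) ** -0.5 * np.exp(
    -0.5 * d @ np.linalg.solve(Sigma, d)
)
print(np.isclose(pdf, multivariate_normal(mean=mu, cov=Sigma).pdf(x)))  # True
```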

Multiplying two such PDFs together and ignoring the gains $\det(2\pi\,\Sigma_i)^{-\frac{1}{2}}$ (since those don't matter anyway due to the re-normalization step) yields

\begin{align} P_1(x)\,P_2(x) &\propto \exp\!\left(-{\frac{1}{2}}(x - \mu_1)^\top \Sigma_1^{-1}(x - \mu_1)\right) \exp\!\left(-{\frac{1}{2}}(x - \mu_2)^\top \Sigma_2^{-1}(x - \mu_2)\right), \\ &\propto \exp\!\left(-{\frac{1}{2}}\left[(x - \mu_1)^\top \Sigma_1^{-1}(x - \mu_1) + (x - \mu_2)^\top \Sigma_2^{-1}(x - \mu_2)\right]\right). \end{align}

The expression inside the square brackets in the exponential can be simplified to a constant term independent of $x$ plus a quadratic expression of the form

$$ (x - \mu_f)^\top \Sigma_f^{-1}(x - \mu_f). \tag{2} $$

The constant term can be ignored: taking it out of the exponential just produces another gain, which is absorbed by the re-normalization step. It can be shown that the resulting fused mean $\mu_f$ and covariance matrix $\Sigma_f$ can be written as

\begin{align} \Sigma_f &= (\Sigma_1^{-1} + \Sigma_2^{-1})^{-1}, \tag{3} \\ \mu_f &= \Sigma_f\,(\Sigma_1^{-1} \mu_1 + \Sigma_2^{-1} \mu_2) = (\Sigma_1^{-1} + \Sigma_2^{-1})^{-1} (\Sigma_1^{-1} \mu_1 + \Sigma_2^{-1} \mu_2). \tag{4} \end{align}
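Equations (3) and (4) translate directly into numpy. A sketch using the question's covariances; the means here are made distinct (an illustrative choice of mine, not from the paper) so that $\mu_f$ is non-trivial:

```python
import numpy as np

# Question's covariances; illustrative non-zero second mean.
mu1, Sigma1 = np.array([0., 0.]), np.array([[10., -10.], [-10., 20.]])
mu2, Sigma2 = np.array([1., 2.]), np.array([[20., 10.], [10., 80.]])

inv1, inv2 = np.linalg.inv(Sigma1), np.linalg.inv(Sigma2)

# Equations (3) and (4): information (inverse-covariance) form of the fusion.
Sigma_f = np.linalg.inv(inv1 + inv2)
mu_f = Sigma_f @ (inv1 @ mu1 + inv2 @ mu2)
print(Sigma_f)
print(mu_f)
```

Note that this product-of-Gaussians fusion shrinks the covariance ($\Sigma_f$ is dominated by both $\Sigma_1$ and $\Sigma_2$ in the positive-semidefinite order), whereas covariance union inflates it — which speaks to the third query: union is the conservative choice when the two estimates may be inconsistent.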

If one prefers to take fewer matrix inverses, it can be noted that $\Sigma_f$ and $\mu_f$ can also be calculated using

\begin{align} \Sigma_f =&\,\Sigma_1\,(\Sigma_1 + \Sigma_2)^{-1} \Sigma_2, \tag{5a} \\ =&\,\Sigma_2\,(\Sigma_1 + \Sigma_2)^{-1} \Sigma_1, \tag{5b} \\ \mu_f =&\,\Sigma_2\,(\Sigma_1 + \Sigma_2)^{-1} \mu_1 + \Sigma_1\,(\Sigma_1 + \Sigma_2)^{-1} \mu_2. \tag{6} \end{align}
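That (5a), (5b) and (6) agree with (3) and (4) — and need only the single inverse $(\Sigma_1 + \Sigma_2)^{-1}$ — can be verified numerically (same illustrative means as above):

```python
import numpy as np

mu1, Sigma1 = np.array([0., 0.]), np.array([[10., -10.], [-10., 20.]])
mu2, Sigma2 = np.array([1., 2.]), np.array([[20., 10.], [10., 80.]])

inv1, inv2 = np.linalg.inv(Sigma1), np.linalg.inv(Sigma2)
Sigma_f = np.linalg.inv(inv1 + inv2)                  # eq. (3)
mu_f = Sigma_f @ (inv1 @ mu1 + inv2 @ mu2)            # eq. (4)

K = np.linalg.inv(Sigma1 + Sigma2)                    # the only inverse needed
print(np.allclose(Sigma_f, Sigma1 @ K @ Sigma2))      # eq. (5a): True
print(np.allclose(Sigma_f, Sigma2 @ K @ Sigma1))      # eq. (5b): True
print(np.allclose(mu_f, Sigma2 @ K @ mu1 + Sigma1 @ K @ mu2))  # eq. (6): True
```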