Could anyone explain how is this Matlab code for log likelihood of Gaussian distribution implemented?

1.3k Views Asked by At

I was implementing Expectation-maximization algorithm for gaussian mixture module. However I realized my implementation has very bad accuracy. Then I found some code on internet.

It seems this function calculates the log likelihood of multi-variant Gaussian distribution, which should be defined as:

$$\ln(L) = -\frac{1}{2}\ln (|\boldsymbol\Sigma|\,) -\frac{1}{2}(\mathbf{z}-\boldsymbol\mu)^\dagger\boldsymbol\Sigma^{-1}(\mathbf{z}-\boldsymbol\mu) - \frac{k}{2}\ln(2\pi)$$

where $\boldsymbol\Sigma$ is covariance matrix and $\mathbf{z}$ is data points, $\boldsymbol\mu$ is mean and k is the dimension of the data.

But the author $\textit{Michael Chen}$ implemented the code as follows.

function y = loggausspdf(X, mu, Sigma)
    d = size(X,1);
    X = bsxfun(@minus,X,mu);
    [U,p]= chol(Sigma);
    if p ~= 0
        error('ERROR: Sigma is not PD.');
    end
    Q = U'\X;
    q = dot(Q,Q,1);  % quadratic term (M distance)
    c = d*log(2*pi)+2*sum(log(diag(U)));   % normalization constant
    y = -(c+q)/2;
end

Can anyone explain how is this function implemented.

thx !

2

There are 2 best solutions below

2
On BEST ANSWER

The matrix $\Sigma$ is positive definite, so it can be written in terms of a lower triangular matrix and its (conjugate) transpose using Cholesky factorization:

$$\Sigma = L L^{*},$$

or alternatively an upper triangular matrix:

$$\Sigma = U^{*} U.$$

This can be inverted: $\Sigma^{-1} = (U^*U)^{-1} = U^{-1}U^{-*}$.

Now, our inner term looks like

$$x^*\Sigma^{-1}x = x^*U^{-1}U^{-*}x.$$

Note that $U^{-*}x$ is the solution to the equation $U^*v = x$. In MATLAB, we can solve such a linear equation using the backslash command as U'\X.

Finally, noting that $x^*U^{-1}U^{-*}x$ is symmetric, and that they are vectors, we essentially have

$$\underbrace{x^*U^{-1}}_{v^*}\underbrace{U^{-*}x}_{v} = v^* v = v\cdot v.$$

3
On

Hint: For any symmetric positive-definite matrix $\Sigma$ the Cholesky decomposition is given by $\Sigma = U U^t$ where $U$ is triangular; this can be efficiently computed. Now, $x^t \Sigma^{-1} x = x^t {(U^{-1})}^{t} {U^{-1}} x = y^t y $ where $y=U^{-1} x$. This is what the code is computing.