I'm looking at scikit-learn's way of calculating the score of a sample in a Gaussian mixture model, and there is something I don't get.
I'm taking as reference this definition: $$ \begin{align} p(\vec{x}) &= \sum_{i=1}^K\phi_i \mathcal{N}(\vec{x} \;|\; \vec{\mu}_i, \Sigma_i)\\ \mathcal{N}(\vec{x} \;|\; \vec{\mu}_i, \Sigma_i) &= \frac{1}{\sqrt{(2\pi)^D|\Sigma_i|}} \exp\left(-\frac{1}{2}(\vec{x}-\vec{\mu}_i)^\mathrm{T}{\Sigma_i}^{-1}(\vec{x}-\vec{\mu}_i)\right)\\ \sum_{i=1}^K\phi_i &= 1 \end{align} $$ where $K$ is the number of components, $D$ is the dimension of $\vec{x}$, $\vec{\mu}_k$ and $\Sigma_k$ are the mean and the covariance matrix of the $k^{\text{th}}$ component, and $\phi_k$ is the weight of the $k^{\text{th}}$ component in the mixture.
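To make the definition concrete, here is a minimal numpy/scipy sketch (all numbers are made up for illustration) that evaluates $p(\vec{x})$ directly for a toy two-component mixture:

    import numpy as np
    from scipy.stats import multivariate_normal

    # Toy mixture: K = 2 components in D = 2 dimensions (illustrative values only).
    phi = np.array([0.3, 0.7])                              # weights, sum to 1
    means = [np.zeros(2), np.array([3.0, 1.0])]             # mu_1, mu_2
    covs = [np.eye(2), np.array([[2.0, 0.5], [0.5, 1.0]])]  # Sigma_1, Sigma_2

    x = np.array([1.0, 0.5])
    p_x = sum(w * multivariate_normal(mean=m, cov=S).pdf(x)
              for w, m, S in zip(phi, means, covs))
    print(np.log(p_x))  # log p(x); fine for a toy case, but exp/pdf can underflow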
scikit-learn computes $\log(p(\vec{x}))$ in the function `score_samples`, defined across base.py and gaussian_mixture.py:
    class BaseMixture[...]:
        [...]
        def score_samples(self, X):
            """Compute the weighted log probabilities for each sample."""
            [...]
            return logsumexp(self._estimate_weighted_log_prob(X), axis=1)

        def _estimate_weighted_log_prob(self, X):
            """Estimate the weighted log-probabilities, log P(X | Z) + log weights."""
            [...]
            return self._estimate_log_prob(X) + self._estimate_log_weights()
Ok, so here they just simplify the later calculations by moving everything into log space:
$$
\log(p(\vec{x})) = \log\left(\sum_{i=1}^K\exp\bigl(\log(\phi_i) + \log(\mathcal{N}_i)\bigr)\right)
$$
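The point of this log-sum-exp form is numerical stability: the per-component terms $\log(\phi_i) + \log(\mathcal{N}_i)$ can be very negative, and exponentiating them directly underflows. A quick sketch with made-up values:

    import numpy as np
    from scipy.special import logsumexp

    # Per-component terms log(phi_i) + log(N_i) for a point far out in the tails.
    log_terms = np.array([-1000.0, -1001.0, -1002.0])

    print(np.log(np.sum(np.exp(log_terms))))  # -inf: exp() underflows to 0
    print(logsumexp(log_terms))               # about -999.59, computed stably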
`_estimate_log_weights` just returns what its name says:
    class GaussianMixture(BaseMixture):
        def _estimate_log_weights(self):
            return np.log(self.weights_)
My issue is with `_estimate_log_prob`:
    def _estimate_log_prob(self, X):
        return _estimate_log_gaussian_prob(X, self.means_,
                                           self.precisions_cholesky_,
                                           self.covariance_type)
where `self.means_` is $\mu$ and `self.precisions_cholesky_` is "the Cholesky decomposition of the precision matrices of each mixture component. The precision matrix is the inverse of the covariance matrix" — that is, the inverse of $\Sigma_i$ from the original definition.
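For `covariance_type='full'`, my understanding is that the stored factor $P_k$ (= `precisions_cholesky_[k]`) satisfies $\Lambda_k = P_k P_k^\mathrm{T}$. A sketch of how such a factor can be built from a covariance matrix (I believe this mirrors what scikit-learn does internally for the 'full' case):

    import numpy as np
    from scipy.linalg import solve_triangular

    rng = np.random.default_rng(0)
    A = rng.standard_normal((3, 3))
    cov = A @ A.T + 3 * np.eye(3)  # a made-up valid (SPD) covariance matrix

    # Cholesky of the covariance, then invert the triangular factor:
    cov_chol = np.linalg.cholesky(cov)  # cov = cov_chol @ cov_chol.T
    prec_chol = solve_triangular(cov_chol, np.eye(3), lower=True).T

    # The stored factor reproduces the precision matrix:
    assert np.allclose(prec_chol @ prec_chol.T, np.linalg.inv(cov))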
If we look at the definition of `_estimate_log_gaussian_prob`, we have:
    def _estimate_log_gaussian_prob(X, means, precisions_chol, covariance_type):
        """Estimate the log Gaussian probability."""
        n_samples, n_features = X.shape
        n_components, _ = means.shape
        # det(precision_chol) is half of det(precision)
        log_det = _compute_log_det_cholesky(
            precisions_chol, covariance_type, n_features)
        [...]
        log_prob = np.empty((n_samples, n_components))
        for k, (mu, prec_chol) in enumerate(zip(means, precisions_chol)):
            y = np.dot(X, prec_chol) - np.dot(mu, prec_chol)
            log_prob[:, k] = np.sum(np.square(y), axis=1)
        [...]
        return -.5 * (n_features * np.log(2 * np.pi) + log_prob) + log_det
This is supposed to compute: $$ \begin{align} \log(\mathcal{N}(\vec{x} \;|\; \vec{\mu}_i, \Sigma_i)) &= \log\left(\frac{1}{\sqrt{(2\pi)^D|\Sigma_i|}}\right) - \frac{1}{2}(\vec{x}-\vec{\mu}_i)^\mathrm{T}{\Sigma_i}^{-1}(\vec{x}-\vec{\mu}_i)\\ &= -\frac{D}{2}\log(2\pi) - \frac{1}{2}\log(|\Sigma_i|) - \frac{1}{2}(\vec{x}-\vec{\mu}_i)^\mathrm{T}{\Sigma_i}^{-1}(\vec{x}-\vec{\mu}_i) \end{align} $$ with $D$ being `n_features`.
At this point, I have two questions:
- Aren't `log_prob` and `log_det` inverted? (Why isn't it `return -.5 * (n_features * np.log(2 * np.pi) + log_det) + log_prob`?)
- Is `log_prob` really calculating $-\frac{1}{2}(\vec{x}-\vec{\mu}_i)^\mathrm{T}{\Sigma_i}^{-1}(\vec{x}-\vec{\mu}_i)$? How?
As you have stated in the question, the log probability of the multivariate Gaussian is as follows; I'll take a single component to simplify notation. $$ \log{\mathcal{N}(x \;|\; \mu, \Sigma)} = -\frac{D}{2}\log{2\pi} - \frac{1}{2}\log{|\Sigma|} - \frac{1}{2} (x - \mu)^T\Sigma^{-1}(x-\mu) \tag{1} $$ Here $D$ is the dimension of $x$ (`n_features` in the code).
The scikit-learn implementation works with the Cholesky decomposition of the precision matrices. The precision is the inverse of the covariance, i.e., $$ \Lambda = \Sigma^{-1} $$ Let $L$ denote the Cholesky factor of a particular precision matrix, so that $$ \Lambda = LL^T $$
Computing the `log_prob`
Starting from the Mahalanobis term in Equation (1), $$ \begin{align} ( x - \mu )^T \Sigma^{-1} (x - \mu ) =& ( x - \mu )^T \Lambda (x - \mu ) \\ =& ( x - \mu )^T LL^T (x - \mu ) \\ =& \|L^T (x - \mu )\|^2 \end{align} $$
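This identity is easy to check numerically; a sketch with made-up matrices:

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((3, 3))
    Lam = A @ A.T + 3 * np.eye(3)  # an SPD precision matrix Lambda
    L = np.linalg.cholesky(Lam)    # Lambda = L @ L.T

    x, mu = rng.standard_normal(3), rng.standard_normal(3)
    d = x - mu
    assert np.allclose(d @ Lam @ d, np.sum(np.square(L.T @ d)))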
The algorithm first computes $y = L^T (x - \mu)$ using the two dot products (row-wise over the samples, as shown in the sketch below), and then takes the squared norm, i.e. $log\_prob = \|y\|^2$. It still needs the factor of $-\frac{1}{2}$, which is applied in the return statement.
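A sketch of that step, using a random triangular factor as a stand-in for `prec_chol`: the two dot products reduce to $(X - \mu)L$ by distributivity, and each row of $y$ is $(L^T(x_i - \mu))^T$.

    import numpy as np

    rng = np.random.default_rng(2)
    X = rng.standard_normal((5, 3))                  # 5 samples, 3 features
    mu = rng.standard_normal(3)
    A = rng.standard_normal((3, 3))
    L = np.linalg.cholesky(A @ A.T + 3 * np.eye(3))  # stand-in for prec_chol

    y = np.dot(X, L) - np.dot(mu, L)          # the expression from the source
    assert np.allclose(y, (X - mu) @ L)       # same thing, row by row
    log_prob = np.sum(np.square(y), axis=1)   # squared Mahalanobis distances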
Computing the `log_det`
Starting from the log determinant term in equation (1), $$ \begin{align} -\frac{1}{2}\log{|\Sigma|} =& -\frac{1}{2} \log{|\Lambda^{-1}|} \\ =& \frac{1}{2}\log{|\Lambda|} \\ =& \frac{1}{2}\log{|LL^T|} \\ =& \log{|L|} \end{align} $$
In the return statement, $log\_det = \log{|L|}$ (for a triangular $L$, just the sum of the logs of its diagonal entries), and so there is no need to multiply it by $-\frac{1}{2}$. This also answers the first question: `log_prob` and `log_det` are not inverted; only `log_prob` still needs the $-\frac{1}{2}$ factor.
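This too is easy to verify numerically; a sketch (with a made-up covariance) checking that the summed log-diagonal of the precision's Cholesky factor equals $-\frac{1}{2}\log{|\Sigma|}$:

    import numpy as np

    rng = np.random.default_rng(3)
    A = rng.standard_normal((3, 3))
    Sigma = A @ A.T + 3 * np.eye(3)               # a made-up covariance
    L = np.linalg.cholesky(np.linalg.inv(Sigma))  # Cholesky of the precision

    log_det_L = np.sum(np.log(np.diag(L)))  # log|L| for a triangular matrix
    assert np.allclose(log_det_L, -0.5 * np.log(np.linalg.det(Sigma)))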
The whole computation
In summary, the final computation in scikit-learn is, $$ \log{\mathcal{N}(x \;|\; \mu, \Sigma)} = -\frac{1}{2} \left( D\log{2\pi} + \|L^T(x - \mu)\|^2 \right) + \log{|L|} $$ which matches the return statement of `_estimate_log_gaussian_prob` term by term.
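As a final sanity check, the whole formula can be reproduced by hand and compared against `score_samples` on a fitted model (a sketch assuming `covariance_type='full'` and toy data):

    import numpy as np
    from scipy.special import logsumexp
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(4)
    X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(5, 2, (100, 2))])
    gmm = GaussianMixture(n_components=2, covariance_type='full',
                          random_state=0).fit(X)

    D = X.shape[1]
    log_prob = np.empty((X.shape[0], gmm.n_components))
    for k in range(gmm.n_components):
        P = gmm.precisions_cholesky_[k]       # Lambda_k = P @ P.T
        y = (X - gmm.means_[k]) @ P           # rows are L^T (x - mu_k)
        log_det = np.sum(np.log(np.diag(P)))  # log|P| = -0.5 * log|Sigma_k|
        log_prob[:, k] = -0.5 * (D * np.log(2 * np.pi)
                                 + np.sum(np.square(y), axis=1)) + log_det

    by_hand = logsumexp(log_prob + np.log(gmm.weights_), axis=1)
    assert np.allclose(by_hand, gmm.score_samples(X))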