Direct solution to maximum likelihood computation problem using the derivative of multivariate Gaussian w.r.t. covariance matrix

Question

194 Views Asked by Bumbble Comm At 26 Mar 2026 - 6:25

For an application, I need to compute the maximum log-likelihood of data coming from a $d$-dimensional multivariate Gaussian random variable: $$ \textbf{x} \sim \mathcal{N}(\boldsymbol{\mu}, \Sigma) $$ where the covariance matrix $\Sigma$ is a function of two scalars $\sigma, \gamma$ such that $\Sigma = \sigma V + \gamma I$, and $V$ is symmetric and independent of both $\gamma$ and $\sigma$.

I have already computed the maximum likelihood numerically using R's `optim` function. However, I was wondering whether I could compute the optimal $\sigma$ and $\gamma$ directly from a closed-form expression.

For this, I tried to follow the steps given in the answer provided by @greg for a similar question (Derivation of derivative of multivariate Gaussian w.r.t. covariance matrix). However, I am stuck at the step that involves expanding $\Sigma^{-1} = (\sigma V + \gamma I)^{-1}$.

My derivation is as follows. Using the same notation as that answer, with $\Sigma=S$, $Z = X-\mu1$, $A : B = \text{tr}(A^TB)$,

\begin{align} dL &= (S^{-1} - S^{-1} Z Z^T S^{-1}) : dS \\ &= (S^{-1} - S^{-1} Z Z^T S^{-1}) : (d\sigma V + \sigma dV + d\gamma I) \end{align}

Setting $dV=0$ and $d\gamma=0$ to get partial derivative w.r.t $\sigma$, \begin{align} dL &= (S^{-1} - S^{-1} Z Z^T S^{-1}) : (d\sigma V) \\ & = d\sigma\, \text{tr}((S^{-1} - S^{-1}Z Z^T S^{-1})^TV) \end{align}

That implies \begin{align} \frac{dL}{d\sigma} &= \text{tr}\big((S^{-1} - S^{-1}Z Z^T S^{-1})^TV\big) \\ & = \text{tr}\big(((\sigma V + \gamma I)^{-1} - (\sigma V + \gamma I)^{-1} Z Z^T (\sigma V + \gamma I)^{-1})^T V\big) \end{align}

Similarly, by setting $dV=0$ and $d\sigma=0$, we get the partial derivative w.r.t. $\gamma$, \begin{align} \frac{dL}{d\gamma} &= \text{tr}\big((\sigma V + \gamma I)^{-1} - (\sigma V + \gamma I)^{-1} Z Z^T (\sigma V + \gamma I)^{-1}\big) \end{align}

Is there a way to solve $\frac{dL}{d\sigma}=0$ and $\frac{dL}{d\gamma}=0$ to get a closed-form solution for $\gamma$ and $\sigma$?
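The two trace expressions above can be sanity-checked numerically before attempting to solve them. The sketch below is only an illustration under stated assumptions: it takes the objective to be $L = \log\det S + \text{tr}(S^{-1}ZZ^T)$, which is one function whose differential is $(S^{-1} - S^{-1}ZZ^TS^{-1}) : dS$, and it uses random test data. The names `objective` and `gradient` are hypothetical helpers, not part of the original post.

```python
import numpy as np

def objective(sigma, gamma, V, Z):
    """Assumed objective: L = log det S + tr(S^{-1} Z Z^T), with S = sigma*V + gamma*I."""
    d = V.shape[0]
    S = sigma * V + gamma * np.eye(d)
    _, logdet = np.linalg.slogdet(S)
    return logdet + np.trace(np.linalg.solve(S, Z @ Z.T))

def gradient(sigma, gamma, V, Z):
    """Trace formulas from the derivation: (dL/dsigma, dL/dgamma)."""
    d = V.shape[0]
    S_inv = np.linalg.inv(sigma * V + gamma * np.eye(d))
    G = S_inv - S_inv @ Z @ Z.T @ S_inv      # dL/dS
    return np.trace(G.T @ V), np.trace(G)

# Random test data (assumption: V is symmetric positive semidefinite,
# so S is positive definite for positive sigma and gamma).
rng = np.random.default_rng(0)
d, n = 4, 50
B = rng.standard_normal((d, d))
V = B @ B.T
Z = rng.standard_normal((d, n)) / np.sqrt(n)
sigma, gamma = 0.7, 0.3

g_sigma, g_gamma = gradient(sigma, gamma, V, Z)

# Central finite differences as an independent check.
h = 1e-6
fd_sigma = (objective(sigma + h, gamma, V, Z) - objective(sigma - h, gamma, V, Z)) / (2 * h)
fd_gamma = (objective(sigma, gamma + h, V, Z) - objective(sigma, gamma - h, V, Z)) / (2 * h)

print("dL/dsigma:", g_sigma, "finite difference:", fd_sigma)
print("dL/dgamma:", g_gamma, "finite difference:", fd_gamma)
```

If the analytic and finite-difference values agree, the same `gradient` function could be passed to a numerical optimizer as a cross-check on any closed-form candidate.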

**Bumbble Comm** · Accepted Answer

You might be better off solving the gradient for $S$, i.e. $$\eqalign{ \frac{\partial L}{\partial S} &= \big(S^{-1} - S^{-1}ZZ^TS^{-1}\big) \;=\; 0 \cr S^{-1} &= S^{-1}ZZ^TS^{-1} \cr S &= ZZ^T \cr }$$ Then find the values of $(\sigma,\gamma)$ which yield (in a least-squares sense) this matrix. $$\eqalign{ \min_{\sigma,\gamma} \; \Big\|\,\sigma V + \gamma I - ZZ^T\Big\|^2_F \cr }$$ Start with an easy problem whose solution is well known. $$\eqalign{ \min_\alpha \|\alpha A-C\|^2_F \implies \alpha = \frac{A:C}{A:A} \cr }$$ Setting $\,\alpha A=\sigma V$ and $C=(ZZ^T-\gamma I),\,$ and then
setting $\,\alpha A=\gamma I\,$ and $C=(ZZ^T-\sigma V)\,$ yields the scalars. $$\eqalign{ \sigma = \frac{V:(ZZ^T-\gamma I)}{V:V},\quad \gamma = \frac{I:(ZZ^T-\sigma V)}{I:I} \cr }$$ Plug the $\gamma$-expression into the $\sigma$-expression (and vice versa) to obtain $$\eqalign{ \sigma &= \frac{(I:I)(V:ZZ^T) - (V:I)(I:ZZ^T)}{(I:I)(V:V) - (V:I)(I:V)} \cr \gamma &= \frac{(V:V)(I:ZZ^T) - (I:V)(V:ZZ^T)}{(V:V)(I:I) - (I:V)(V:I)} \cr }$$ Note that the formulas are conjugate to one another under the interchange $\,I\Longleftrightarrow V$.
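The closed-form expressions for $\sigma$ and $\gamma$ can be evaluated directly with Frobenius inner products. The following is a minimal sketch on random test data; `frob` and `fit_sigma_gamma` are hypothetical helper names, and the comparison against `numpy.linalg.lstsq` is added only as an independent check of the least-squares fit. It does not test whether the fitted pair also maximizes the likelihood under the constraint $S = \sigma V + \gamma I$.

```python
import numpy as np

def frob(A, B):
    """Frobenius inner product A : B = tr(A^T B)."""
    return np.sum(A * B)

def fit_sigma_gamma(V, C):
    """Least-squares fit of sigma*V + gamma*I to C using the closed-form formulas."""
    I = np.eye(V.shape[0])
    II, VV, VI = frob(I, I), frob(V, V), frob(V, I)
    VC, IC = frob(V, C), frob(I, C)
    # Shared denominator; it is zero when V is a scalar multiple of I.
    det = II * VV - VI * VI
    sigma = (II * VC - VI * IC) / det
    gamma = (VV * IC - VI * VC) / det
    return sigma, gamma

# Random test data (assumption: any symmetric V not proportional to I).
rng = np.random.default_rng(1)
d, n = 5, 40
B = rng.standard_normal((d, d))
V = (B + B.T) / 2
Z = rng.standard_normal((d, n))
C = Z @ Z.T

sigma_cf, gamma_cf = fit_sigma_gamma(V, C)

# Independent check: solve the same problem as an ordinary least-squares system
# with the vectorized matrices V and I as the two regressors.
A = np.column_stack([V.ravel(), np.eye(d).ravel()])
sigma_ls, gamma_ls = np.linalg.lstsq(A, C.ravel(), rcond=None)[0]

print("closed form:", sigma_cf, gamma_cf)
print("lstsq:      ", sigma_ls, gamma_ls)
```

Note that nothing in the formulas forces the fitted $\sigma$ and $\gamma$ to be positive, so the resulting $\sigma V + \gamma I$ should be checked for positive definiteness before it is used as a covariance matrix.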
