To estimate sparse precision matrices, there is a method called "penalised likelihood" which leads to the formula below.
Can someone write down the derivation? I do not understand how we arrive at this equation. Thanks!
The equation is: $$\hat\Theta = \operatorname{argmin}_{\Theta} \left\{ \operatorname{tr}(S\Theta) - \log\det\Theta + \lambda\sum_{i\neq j}|\Theta_{ij}| \right\}$$
The purpose is to find the best estimate of $\Theta$, the inverse of the covariance matrix, called the precision matrix. We maximize the likelihood of a Gaussian distribution with $\mu = 0$, $$ f(x)=\frac{1}{\sqrt{(2\pi)^n|\Sigma|}} \exp\left\{-\frac{1}{2} x^T \Sigma^{-1} x\right\},$$ subject to a sparsity constraint on the precision matrix. Substituting $\Theta$ for $\Sigma^{-1}$ makes the precision matrix the optimization variable.
As usual for exponential-family densities, we work with the logarithm of the likelihood. Since we want to find $\Theta$ rather than $\Sigma$, we minimize the negative log-likelihood as a function of $\Theta$.
That's where the first two terms show up.
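Concretely, the step goes like this. Given $N$ i.i.d. samples $x_1,\dots,x_N$, the log-likelihood in terms of $\Theta=\Sigma^{-1}$ (using $\log\det\Sigma^{-1} = -\log\det\Sigma$) is

$$\ell(\Theta)=\frac{N}{2}\log\det\Theta-\frac{1}{2}\sum_{k=1}^{N}x_k^T\Theta x_k+\text{const}.$$

Using the identity $x_k^T\Theta x_k=\operatorname{tr}(\Theta x_k x_k^T)$ and the sample covariance $S=\frac{1}{N}\sum_{k=1}^{N}x_k x_k^T$, this becomes

$$\ell(\Theta)=\frac{N}{2}\left[\log\det\Theta-\operatorname{tr}(S\Theta)\right]+\text{const},$$

so minimizing the negative log-likelihood, after dropping the factor $N/2$ and the additive constant (which do not change the minimizer), yields exactly $\operatorname{tr}(S\Theta)-\log\det\Theta$.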
The third term is the penalty term, with $\lambda$ playing the role of a Lagrange multiplier. Why is it penalized? Because we impose the sparsity constraint on the precision matrix by limiting the $\ell_1$ norm (the sum of the absolute values of the off-diagonal entries) of $\Theta$, which drives many entries to exactly zero.
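As an illustration, here is a minimal sketch of solving this penalised-likelihood problem with scikit-learn's `GraphicalLasso` (its `alpha` parameter is the $\lambda$ in the formula above; the toy tridiagonal precision matrix and variable names are my own choices):

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)

# True precision matrix: tridiagonal, hence sparse.
n = 5
Theta_true = np.eye(n) + 0.4 * (np.eye(n, k=1) + np.eye(n, k=-1))
Sigma_true = np.linalg.inv(Theta_true)

# Draw samples from N(0, Sigma_true), matching the mu = 0 assumption above.
X = rng.multivariate_normal(np.zeros(n), Sigma_true, size=2000)

# alpha plays the role of lambda: larger alpha -> sparser estimate of Theta.
model = GraphicalLasso(alpha=0.1).fit(X)
Theta_hat = model.precision_

# The l1 penalty drives many off-diagonal entries toward (or exactly to) zero.
print(np.round(Theta_hat, 2))
```

Increasing `alpha` trades likelihood fit for sparsity: at `alpha=0` the estimate is just $S^{-1}$ (dense in general), while large `alpha` forces a nearly diagonal $\hat\Theta$.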