In a publication I am currently studying, the following identity shows up, and I don't quite understand it. Reformulated slightly from the manuscript, the authors pose:
$$\min_{\Theta \succ 0}f(\Theta):= -\log \det(\Theta)+\operatorname{trace}(S\Theta) + \lambda||\Theta||_1 \tag{1}$$
where $\Theta$ is a target precision matrix (the inverse of a covariance matrix $\Sigma$) that we want to estimate, $S$ is a sample covariance matrix (an approximation to $\Sigma$), $\lambda$ is a scalar (a regularization penalty), and $||\Theta||_1$ is the sum of the absolute values of the entries of $\Theta$. The authors then state (slightly adapted) the following:
We use the framework of "normal equations". Using sub-gradient notation, we can write the optimality conditions (aka "normal equations") for Equation 1 as
$$-\Theta^{-1}+S+\lambda\Gamma = 0 \tag{2}$$
where $\Gamma$ is a matrix of component-wise signs of $\Theta$.
I am not particularly familiar with this field, and I don't see how this expression follows from the equation above. There is also a mention of "global stationarity conditions of Equation 1". Do you understand how the authors went from Equation 1 to Equation 2, and can you offer some intuition about the steps in between?
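For what it's worth, I did convince myself numerically that the gradient of the smooth part of the objective, $-\log\det(\Theta) + \operatorname{trace}(S\Theta)$, is $-\Theta^{-1} + S$, which would account for the first two terms of Equation 2. Here is the sketch I used (the matrices and the finite-difference check are of course my own, not from the manuscript):

```python
import numpy as np

# Check with finite differences that the gradient of
#   f0(Theta) = -log det(Theta) + trace(S @ Theta)
# is  -inv(Theta) + S  (the smooth part of Equation 1).
rng = np.random.default_rng(0)

def random_spd(n):
    """A random symmetric positive-definite matrix."""
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

n = 4
S = random_spd(n)       # stands in for a sample covariance matrix
Theta = random_spd(n)   # a feasible point, Theta > 0

def f0(T):
    _, logdet = np.linalg.slogdet(T)
    return -logdet + np.trace(S @ T)

analytic = -np.linalg.inv(Theta) + S

# Symmetric finite differences, one entry at a time.
eps = 1e-6
numeric = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        E = np.zeros((n, n))
        E[i, j] = eps
        numeric[i, j] = (f0(Theta + E) - f0(Theta - E)) / (2 * eps)

# Difference should be tiny, up to finite-difference error.
print(np.max(np.abs(numeric - analytic)))
```

So my confusion is really about where the $\lambda\Gamma$ term comes from, i.e. how the non-differentiable $\lambda||\Theta||_1$ term enters the stationarity condition.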