Why is the derivative zero at the optimum?


$\newcommand{\dd}{\partial}$When searching for the mean and variance of a Gaussian distribution, we are trying to minimise this equation: $$ \hat{\mu}, \hat{\sigma} = \arg\min_{\mu,\sigma} \sum_{i=1}^{n} \left\{\frac{(x_{i} - \mu)^{2}}{2\sigma^{2}} + \ln\sigma\right\}. $$
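The connection between this objective and the Gaussian density can be checked numerically. The sketch below (my own illustration, with hypothetical function names `neg_log_term` and `neg_log_density`) shows that each summand of the objective equals the exact negative log of the Gaussian density up to the constant $\tfrac{1}{2}\ln(2\pi)$, which does not depend on $\mu$ or $\sigma$ and therefore does not affect the minimiser:

```python
import math

def neg_log_term(x, mu, sigma):
    # One summand of the objective J: (x - mu)^2 / (2 sigma^2) + ln(sigma)
    return (x - mu) ** 2 / (2 * sigma ** 2) + math.log(sigma)

def neg_log_density(x, mu, sigma):
    # Exact negative log of the Gaussian density N(x; mu, sigma^2)
    return (x - mu) ** 2 / (2 * sigma ** 2) + math.log(sigma) + 0.5 * math.log(2 * math.pi)

# The two differ by the constant 0.5 * ln(2 * pi), independent of mu and sigma.
x, mu, sigma = 1.3, 0.5, 2.0
diff = neg_log_density(x, mu, sigma) - neg_log_term(x, mu, sigma)
print(abs(diff - 0.5 * math.log(2 * math.pi)) < 1e-12)  # True
```

So minimising the sum above is the same as maximising the log-likelihood of the data.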

However, why do we require these derivatives to be zero at the optimum?

How do we know that the point which has the derivative $0$ is a global minimum and not a global maximum?

  • At optimum, $$ \frac{\dd J}{\dd \mu} = 0 \to \hat{\mu},\qquad\qquad \frac{\dd J(\hat{\mu}, \sigma)}{\dd\sigma} = 0 \to \hat{\sigma}. $$
1 Answer

What those computations produce are the maximum likelihood estimators of $\mu$ and $\sigma$. The likelihood is maximised precisely when the expression you listed is minimised, because that expression is the negative log-likelihood up to an additive constant (the $\tfrac{n}{2}\ln(2\pi)$ term, which does not depend on $\mu$ or $\sigma$; this is not hard to check). To find the minimiser, take the partial derivatives and set them to zero. As for why the stationary point is a global minimum rather than a maximum: $J \to \infty$ as $\mu \to \pm\infty$, as $\sigma \to \infty$, and also as $\sigma \to 0^{+}$ (provided the $x_i$ are not all equal, the quadratic term dominates $\ln\sigma$), so the unique stationary point must be the global minimum.
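The claim above can be checked numerically: a minimal sketch, assuming the closed-form stationary point $\hat{\mu} = \frac{1}{n}\sum_i x_i$ and $\hat{\sigma}^2 = \frac{1}{n}\sum_i (x_i - \hat{\mu})^2$ obtained from setting the partial derivatives to zero, and using hypothetical names `J`, `mu_hat`, `sigma_hat`:

```python
import math
import random

def J(mu, sigma, xs):
    # The objective: sum over samples of (x - mu)^2 / (2 sigma^2) + ln(sigma)
    return sum((x - mu) ** 2 / (2 * sigma ** 2) + math.log(sigma) for x in xs)

random.seed(0)
xs = [random.gauss(3.0, 1.5) for _ in range(1000)]

# Closed-form stationary point from setting both partial derivatives to zero
mu_hat = sum(xs) / len(xs)
sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in xs) / len(xs))

# J increases at nearby perturbed parameters, consistent with a minimum
best = J(mu_hat, sigma_hat, xs)
print(all(J(mu_hat + dm, sigma_hat + ds, xs) > best
          for dm, ds in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]))  # True
```

Perturbing either parameter in either direction increases the objective, which is what distinguishes this stationary point from a maximum or a saddle.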