Partial derivative of the likelihood function with respect to $\sigma^2$

I am having trouble taking the partial derivative of the likelihood function, which is

$L(\mu,\sigma^2)=\frac{1}{(\sigma\sqrt{2\pi})^n}\times \exp{\left(-\frac{1}{2\sigma^2}\sum_{i=1}^n(x_i-\mu)^2\right)}$

The first part has already shown that $\hat{\mu}=\bar{x}$. After plugging this into $L(\mu,\sigma^2)$, how do I take the partial derivative with respect to $\sigma^2$? The given answer is $-\frac{n}{2\sigma^2}+\frac{1}{2\sigma^4}\sum(x_i-\mu)^2$.

I tried twice but did not get the above answer, so I guess I did something wrong.

Thank you, much appreciated!

Best answer

We generally work with the log likelihood. Since the log function is increasing, a maximizer of the likelihood is also a maximizer of the log likelihood.

$$ \log{L(\mu,\sigma^2)}=-\frac{n}{2}\log{\sigma^2}-\frac{1}{2\sigma^2}\sum\limits_{i=1}^n(x_i-\mu)^2+C $$

where $C$ is a constant term (that does not depend on $\sigma$ or $\mu$). This constant is generally dropped because it does not play any role in our maximization.

Also note that, to express everything in terms of $\sigma^2$ (rather than $\sigma$), I have used this "trick": $$ \log{\sigma}=\log{(\sigma^2)^{1/2}}=\frac{1}{2}\log{\sigma^2} $$ (valid because we assume $\sigma>0$).
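
As a sketch of where the log likelihood above comes from, assume the intended likelihood is the product of $n$ normal densities, that is, the prefactor is $(\sigma\sqrt{2\pi})^{-n}$. Taking the log term by term then gives:

$$ \log{L(\mu,\sigma^2)}=-n\log{\sigma}-\frac{n}{2}\log{(2\pi)}-\frac{1}{2\sigma^2}\sum\limits_{i=1}^n(x_i-\mu)^2 $$

$$ =-\frac{n}{2}\log{\sigma^2}-\frac{1}{2\sigma^2}\sum\limits_{i=1}^n(x_i-\mu)^2-\frac{n}{2}\log{(2\pi)} $$

Under that assumption the constant is $C=-\frac{n}{2}\log{(2\pi)}$.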

Now you can compute your partial derivatives:

$$ \frac{\partial}{\partial \mu}(\log{L(\mu,\sigma^2)})=\frac{1}{\sigma^2}\sum _{i=1}^n \left(x_i-\mu \right) $$

$$ \frac{\partial}{\partial \sigma^2}(\log{L(\mu,\sigma^2)})=\frac{\sum _{i=1}^n \left(x_i-\mu \right)^2}{2 \sigma ^4}-\frac{n}{2 \sigma ^2} $$
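
If you want to convince yourself that these two formulas are right, one option is to compare them with a numerical derivative. The following is a minimal sketch using only the Python standard library; the sample `xs`, the evaluation point `(mu, var)`, and the helper names are made up for illustration, and `var` stands for $\sigma^2$ treated as a single variable.

```python
import math

def log_likelihood(mu, var, xs):
    """Normal log likelihood, parameterized by var = sigma^2."""
    n = len(xs)
    ss = sum((x - mu) ** 2 for x in xs)
    return -0.5 * n * math.log(2 * math.pi) - 0.5 * n * math.log(var) - ss / (2 * var)

def d_mu(mu, var, xs):
    """Closed-form partial derivative with respect to mu."""
    return sum(x - mu for x in xs) / var

def d_var(mu, var, xs):
    """Closed-form partial derivative with respect to sigma^2."""
    n = len(xs)
    ss = sum((x - mu) ** 2 for x in xs)
    return ss / (2 * var ** 2) - n / (2 * var)

def central_diff(f, t, h=1e-6):
    """Central finite-difference approximation of f'(t)."""
    return (f(t + h) - f(t - h)) / (2 * h)

xs = [1.2, 0.7, 2.9, 1.8, 2.1]  # hypothetical sample
mu, var = 1.5, 0.8              # arbitrary evaluation point

num_mu = central_diff(lambda m: log_likelihood(m, var, xs), mu)
num_var = central_diff(lambda v: log_likelihood(mu, v, xs), var)

# Each line prints the numerical value next to the closed-form value.
print(num_mu, d_mu(mu, var, xs))
print(num_var, d_var(mu, var, xs))
```

The two numbers on each printed line should agree to several decimal places; a mismatch in the second line is the usual sign that the derivative was taken with respect to $\sigma$ instead of $\sigma^2$.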

Next, to get the estimates of $(\mu,\sigma^2)$, you must solve $\nabla\log{L(\mu,\sigma^2)}=\mathbf{0}$, that is:

$$ \sum _{i=1}^n \left(x_i-\mu \right) = 0 $$

$$ \sum _{i=1}^n \left(x_i-\mu \right)^2=n\sigma^2 $$

The solution is $\hat{\mu}=\bar{x}$ and $\hat{\sigma}^2=\frac{1}{n}\sum _{i=1}^n \left(x_i-\bar{x}\right)^2$.
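
As a quick illustration, the two equations above can be checked on a small sample. This is only a sketch with made-up data (`xs` is hypothetical); it uses `statistics.pvariance` from the standard library, which divides by $n$, for comparison.

```python
import statistics

xs = [1.2, 0.7, 2.9, 1.8, 2.1]  # hypothetical sample
n = len(xs)

# Estimates from the two stationarity equations.
mu_hat = sum(xs) / n
var_hat = sum((x - mu_hat) ** 2 for x in xs) / n

# Left-hand sides of the two equations, which should be (numerically) zero.
eq_mu = sum(x - mu_hat for x in xs)
eq_var = sum((x - mu_hat) ** 2 for x in xs) - n * var_hat

print(mu_hat, var_hat)
print(eq_mu, eq_var)
print(statistics.pvariance(xs))  # population variance, divides by n
```

Note that the variance estimate divides by $n$, not $n-1$, so it matches `statistics.pvariance` rather than `statistics.variance`.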


Update (see comment): a clarification concerning $$ \frac{\partial}{\partial \sigma^2}\left(-\frac{1}{2\sigma^2}\right)=\frac{1}{2\sigma^4} $$ which can be a little confusing.

Here "$\sigma^2$" must be interpreted purely as a symbol. The previous calculation should be thought of as follows:

To compute $$ \frac{\partial}{\partial \sigma^2}\left(-\frac{1}{2\sigma^2}\right) $$ you replace the "symbol" $\sigma^2$ by $x$: $$ \frac{\partial}{\partial x}\left(-\frac{1}{2x}\right) = \frac{1}{2x^2} $$ and then reintroduce $\sigma^2$. The complete story is: $$ \frac{\partial}{\partial \sigma^2}\left(-\frac{1}{2\sigma^2}\right) \equiv \frac{\partial}{\partial x}\left(-\frac{1}{2x}\right) = \frac{1}{2x^2} \equiv \frac{1}{2(\sigma^2)^2}= \frac{1}{2\sigma^4} $$


Another example (assuming that $\sigma>0$) would be: $$ \frac{\partial}{\partial \sigma^2}\frac{1}{\sigma}=\frac{\partial}{\partial \sigma^2}\frac{1}{\sqrt{\sigma^2}}\equiv\frac{\partial}{\partial x}\frac{1}{\sqrt{x}}=-\frac{1}{2 x^{3/2}}\equiv -\frac{1}{2 (\sigma^2)^{3/2}}=-\frac{1}{2\sigma^3} $$
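
Both substitution examples can also be checked numerically. The sketch below (standard library only; the value of `sigma` and the helper `central_diff` are arbitrary choices for illustration) differentiates with respect to `v`, which plays the role of the symbol $\sigma^2$, and compares against the closed forms written in terms of $\sigma$.

```python
import math

def central_diff(f, x, h=1e-6):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

sigma = 1.3      # arbitrary positive value
v = sigma ** 2   # v stands for the symbol sigma^2

# d/dv of -1/(2v), to compare with 1/(2 sigma^4)
d1 = central_diff(lambda x: -1 / (2 * x), v)
# d/dv of 1/sqrt(v), to compare with -1/(2 sigma^3)
d2 = central_diff(lambda x: 1 / math.sqrt(x), v)

print(d1, 1 / (2 * sigma ** 4))
print(d2, -1 / (2 * sigma ** 3))
```

The point of the design is that the function being differentiated never sees `sigma` at all, only `v`, which mirrors the "replace $\sigma^2$ by $x$" step.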