What is Fisher Information defined on?


The definition of Fisher information on Wikipedia is
\begin{equation} I(\theta) = E_\theta \left[\left(\dfrac{\partial}{\partial\theta}\ln p\left(X;\theta\right)\right)^2\right] \end{equation}
Here the function inside the logarithm is a probability density function.

But in many other places it is defined as
\begin{equation} I(\theta) = E_\theta \left[\left(\dfrac{\partial}{\partial\theta}\ln p\left(\theta;X\right)\right)^2\right] \end{equation}
Here the function inside the logarithm is the likelihood function.
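To make the two readings concrete, here is a minimal sketch assuming a Bernoulli model, which is chosen only for illustration and does not come from either source. The function names `log_p`, `log_lik`, `score` and `fisher_information` are hypothetical. In this sketch both readings evaluate the same expression and differ only in which argument is treated as the variable: the derivative is taken with respect to $\theta$, and the expectation is taken over $X$.

```python
import math

def log_p(x, theta):
    """Log of the Bernoulli pmf p(x; theta): a function of x for fixed theta."""
    return x * math.log(theta) + (1 - x) * math.log(1 - theta)

def log_lik(theta, x):
    """Log-likelihood: the same expression, read as a function of theta for fixed x."""
    return log_p(x, theta)

def score(theta, x, h=1e-6):
    """Central-difference approximation of d/dtheta of the log-likelihood."""
    return (log_lik(theta + h, x) - log_lik(theta - h, x)) / (2 * h)

def fisher_information(theta):
    """Expectation of the squared score over X ~ Bernoulli(theta).

    X takes only the values 0 and 1, so the expectation is an exact
    two-term sum rather than a Monte Carlo estimate.
    """
    return (1 - theta) * score(theta, 0) ** 2 + theta * score(theta, 1) ** 2

theta = 0.3  # assumed parameter value, for illustration only
print(fisher_information(theta))   # numerical value
print(1 / (theta * (1 - theta)))   # closed form for the Bernoulli model
```

The two printed numbers should agree up to the finite-difference error, whichever name is used for the function inside the logarithm.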

I was always warned to be careful about the difference between the likelihood function and the probability density function. So why do they seem totally equivalent here?

Even if they are equivalent, I find the likelihood version easier to understand, because the variable is always $\theta$ and it links directly to maximum likelihood estimation (MLE). With the probability density version I cannot really see what it is telling me. Why would people define Fisher information with the probability density function at all? What does that reveal? Or do these two definitions make no difference, and people just write whichever they prefer?