Alternatives to Fisher information


The Fisher information matrix is defined as the following:

$$\mathcal{I}(\theta)=E\left[\left(\frac{\partial \log f(x;\theta)}{\partial \theta}\right)\left(\frac{\partial \log f(x;\theta)}{\partial \theta}\right)^T\right]=-E\left[\frac{\partial^2 \log f(x;\theta)}{\partial \theta\, \partial \theta^T}\right]$$

where $f(x;\theta)$ is the probability density function (pdf) of some random (vector) variable $x$, parameterized by the (vector) parameter $\theta$. For an unbiased estimator $\hat{\theta}$, the inverse $\mathcal{I}(\theta)^{-1}$ is a lower bound on its covariance matrix (the Cramér–Rao bound), and hence lower-bounds the MSE of estimating $\theta$.
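As a quick numerical illustration (a sketch, not part of the question: it assumes the mean parameter of a Gaussian, for which $\mathcal{I}(\theta)=1/\sigma^2$ in closed form), one can estimate the Fisher information as the second moment of the score by Monte Carlo:

```python
import numpy as np

rng = np.random.default_rng(0)
theta, sigma, n_samples = 2.0, 1.5, 200_000

# Draw samples from N(theta, sigma^2)
x = rng.normal(theta, sigma, size=n_samples)

# Score for the mean parameter: d/d theta log f(x; theta) = (x - theta) / sigma^2
score = (x - theta) / sigma**2

# Fisher information as the second moment of the score (the score's mean is ~0)
fisher_mc = np.mean(score**2)
fisher_exact = 1.0 / sigma**2  # closed form for a Gaussian mean

print(fisher_mc, fisher_exact)
```

The Monte Carlo estimate agrees with $1/\sigma^2$, and by the Cramér–Rao bound no unbiased estimator of $\theta$ from $n$ i.i.d. draws can have variance below $\sigma^2/n$.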

The reason Fisher chose this definition for the information measure is intuitive: we are interested in relative changes in $f(x;\theta)$, i.e. $\frac{\partial f(x;\theta)/\partial \theta}{f(x;\theta)}=\frac{\partial \log f(x;\theta)}{\partial \theta}$.

In particular, we are interested in the average magnitude of this quantity, regardless of its sign. Squaring achieves this, so $E[(\frac{\partial \log f(x;\theta)}{\partial \theta})^2]$ is a natural and usually tractable choice (it has a closed form in many cases).
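For instance (a standard worked example, assuming the mean parameter of a Gaussian with known variance $\sigma^2$), the closed form is immediate:

$$f(x;\theta)=\frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{(x-\theta)^2}{2\sigma^2}} \implies \frac{\partial \log f(x;\theta)}{\partial \theta}=\frac{x-\theta}{\sigma^2}$$

$$\mathcal{I}(\theta)=E\left[\frac{(x-\theta)^2}{\sigma^4}\right]=\frac{\sigma^2}{\sigma^4}=\frac{1}{\sigma^2}$$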

Here is my specific question: Are there alternatives to this definition?

For example, $\mathcal{I}_a(\theta)=E[|\frac{\partial \log f(x;\theta)}{\partial \theta}|]$, is the first that comes to my mind.
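To see how this candidate behaves (again a sketch assuming the Gaussian-mean example, where the score is distributed $N(0, 1/\sigma^2)$, so $\mathcal{I}_a(\theta)=\sqrt{2/\pi}/\sigma$ while $\mathcal{I}(\theta)=1/\sigma^2$), one can compare the two moments numerically:

```python
import numpy as np

rng = np.random.default_rng(1)
theta, sigma, n_samples = 0.0, 2.0, 500_000

x = rng.normal(theta, sigma, size=n_samples)
score = (x - theta) / sigma**2  # score for a Gaussian mean parameter

# Proposed absolute-moment measure vs. the usual second moment
info_abs = np.mean(np.abs(score))  # I_a(theta) = E[|score|]
info_fisher = np.mean(score**2)    # I(theta)   = E[score^2]

# Closed forms: score ~ N(0, 1/sigma^2), so
# E[|score|] = sqrt(2/pi)/sigma  and  E[score^2] = 1/sigma^2
print(info_abs, np.sqrt(2 / np.pi) / sigma)
print(info_fisher, 1 / sigma**2)
```

Note that the two measures even scale differently in $\sigma$ (first vs. second moment of the score), so $\mathcal{I}_a$ cannot simply be plugged into the Cramér–Rao bound in place of $\mathcal{I}$.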

Are there other established formulations, or even totally different approaches that address the problem of lower-bounding the error of parameter estimation?

Thanks!