Estimating Fisher information of multidimensional PDF with direct sampling


Let $f_\theta({\bf x})$ be the probability density of a vector ${\bf x}$ with $N$ elements, parameterized by $\theta$. I do not know the explicit form of $f_\theta$, but I am able to sample reasonably efficiently from $f_\theta$ for any given $\theta$.

I wish to estimate the Fisher information for the distribution $f_\theta$ with respect to the parameter $\theta$, namely,

$$ I(\theta) = \mathbb{E}_\theta \left\{ \left[ \frac{\partial}{\partial \theta}\ln f_\theta({\bf x}) \right]^2 \right\} $$

I would like to do that with a number of samples of ${\bf x}$ that does not grow exponentially in $N$. The issue is that I do not know how to handle the derivative. Does someone know a good way to do this?
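For reference, when the score $\partial_\theta \ln f_\theta$ *is* available in closed form, the expectation above is just a Monte Carlo average over samples, and the cost is independent of any binning of the space. A minimal sketch, assuming a toy density $f_\theta = \mathcal{N}(\theta\,\mathbf{1},\, \sigma^2 I_N)$ (not the asker's model), for which the true value is $N/\sigma^2$:

```python
import numpy as np

rng = np.random.default_rng(0)
N, sigma, theta = 10, 2.0, 1.5   # dimension, noise scale, parameter
M = 200_000                      # number of Monte Carlo samples

# sample x ~ Normal(theta * 1_N, sigma^2 I_N)
x = rng.normal(theta, sigma, size=(M, N))

# analytic score for this toy Gaussian: d/dtheta ln f = sum_i (x_i - theta) / sigma^2
score = np.sum(x - theta, axis=1) / sigma**2

# Fisher information = E[score^2], estimated by the sample mean
I_hat = np.mean(score**2)
print(I_hat)   # close to N / sigma**2 = 2.5
```

The difficulty in the question is precisely that this score is not available, which is what the finite-difference and KL-based ideas below try to work around.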

My best idea so far is to

  1. pick two parameters $\theta_1$ and $\theta_2$ that are close to each other;
  2. sample ${\bf x}$ a finite number of times from both $f_{\theta_1}$ and $f_{\theta_2}$;
  3. use the samples to estimate the Kullback-Leibler divergence between $f_{\theta_1}$ and $f_{\theta_2}$;
  4. use the relationship between the KL divergence and the Fisher information to estimate the Fisher information.

However, I'm not yet sure that step 3 gets rid of the dimensionality problem. The entire problem is that I cannot build the histograms of $f_\theta({\bf x})$ in a reasonable time as $N$ increases. And I need these histograms to estimate the KL divergence.
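One way to sidestep histograms entirely is a $k$-nearest-neighbour KL estimator (in the style of Wang, Kulkarni, and Verdú), which works directly on the samples and whose cost does not blow up with $N$ the way binning does. A sketch of steps 1–4 with such an estimator, again assuming a toy sampler $f_\theta = \mathcal{N}(\theta\,\mathbf{1},\, I_N)$ so that the true Fisher information is $N$; it uses $\mathrm{KL}(f_{\theta_1}\,\|\,f_{\theta_2}) \approx \tfrac{1}{2} I(\theta_1)\,(\theta_2-\theta_1)^2$:

```python
import numpy as np
from scipy.spatial import cKDTree

def kl_knn(x, y, k=5):
    """k-NN Kullback-Leibler estimator (Wang-Kulkarni-Verdu style).
    Needs only samples x ~ p and y ~ q; no histograms, no densities.
    Assumes k >= 2."""
    n, d = x.shape
    m = len(y)
    # distance to the k-th neighbour within x (k+1 because the point itself is hit)
    rho = cKDTree(x).query(x, k + 1)[0][:, -1]
    # distance to the k-th neighbour among the y samples
    nu = cKDTree(y).query(x, k)[0][:, -1]
    return d * np.mean(np.log(nu / rho)) + np.log(m / (n - 1))

rng = np.random.default_rng(1)
N = 5                        # dimension of x
theta1, theta2 = 0.0, 0.5    # step 1: two nearby parameter values
n = 5000

# step 2: toy sampler x ~ Normal(theta * 1_N, I_N); true Fisher information is N
x1 = rng.normal(theta1, 1.0, size=(n, N))
x2 = rng.normal(theta2, 1.0, size=(n, N))

# step 3: sample-based KL estimate
kl = kl_knn(x1, x2)

# step 4: KL(theta1 || theta2) ~ (1/2) I(theta1) (theta2 - theta1)^2
I_hat = 2.0 * kl / (theta2 - theta1) ** 2
print(I_hat)   # roughly N = 5
```

The estimate carries both the Monte Carlo error and an $O(\Delta\theta)$ bias from the quadratic KL expansion, so in practice one would repeat this for a few step sizes and extrapolate.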

Edit: I found this, which might offer some answers.

Edit 2: I think it can be done when $f_\theta$ describes a Markov chain for ${\bf x}$. In that case the value of $f_\theta({\bf x})$ can be computed efficiently for each sample (e.g. via the forward algorithm for hidden Markov models), and the KL divergence can then be estimated efficiently. I'll let you know if it works out.

Edit 3: It appears that the use of derivatives can be avoided provided that the probability density belongs to an exponential family with respect to the parameter $\theta$.
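The exponential-family shortcut can be made concrete: if $f_\theta({\bf x}) = h({\bf x}) \exp(\theta\, T({\bf x}) - A(\theta))$ with $\theta$ the natural parameter, the score is $T({\bf x}) - A'(\theta)$, so $I(\theta) = \mathrm{Var}_\theta[T({\bf x})]$, a plain sample variance with no derivatives at all. A sketch on the toy family $x \sim \mathcal{N}(\theta\,\mathbf{1},\, I_N)$, for which $T({\bf x}) = \sum_i x_i$ and $I(\theta) = N$:

```python
import numpy as np

rng = np.random.default_rng(3)
N, theta, M = 8, 0.3, 100_000

# toy exponential family: x ~ Normal(theta * 1_N, I_N) has natural parameter
# theta and sufficient statistic T(x) = sum_i x_i, with A(theta) = N theta^2 / 2
x = rng.normal(theta, 1.0, size=(M, N))
T = x.sum(axis=1)

# Fisher information = Var_theta[T(x)]: a derivative-free sample variance
I_hat = np.var(T)
print(I_hat)   # close to N = 8
```

Note this identity is for the natural parameterization; for a general parameter $\eta$ with $\theta = \theta(\eta)$, the usual reparameterization factor $(\mathrm{d}\theta/\mathrm{d}\eta)^2$ reappears.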