Calculating the Kullback-Leibler Information Criterion (Quasi-Maximum Likelihood Method)


Let the observed random variables $X_1,X_2$ be independent and normally distributed with zero mean and variances $\sigma_1^2$ and $\theta \sigma_1^2$, respectively, where $\theta \in \mathbb{R}$.

The model I am considering consists of the set of probability densities given by $$\left \{ \frac{1}{\sqrt{2 \pi X_1^2}} \exp \left( \frac{-1}{2 } \right) \frac{1}{\sqrt{2 \pi \beta X_1^2}} \exp \left( \frac{-X_2^2}{2 \beta X_1^2} \right) \Bigg| \beta \in \mathbb{R} \right \}$$

I want to minimize the Kullback-Leibler information criterion, which is given by

$$E\left[ \log \left( \frac{\frac{1}{\sqrt{2 \pi \sigma_1^2}} \exp \left( \frac{-X_1^2}{2 \sigma_1^2} \right)\frac{1}{\sqrt{2 \pi \theta \sigma_1^2}} \exp \left( \frac{-X_2^2}{2 \theta \sigma_1^2} \right)}{ \frac{1}{\sqrt{2 \pi X_1^2}} \exp \left( \frac{-1}{2 } \right) \frac{1}{\sqrt{2 \pi \beta X_1^2}} \exp \left( \frac{-X_2^2}{2 \beta X_1^2} \right)} \right) \right]$$

with respect to $\beta$. The resulting $\beta^*$ should be the value that minimizes the "distance" between the model and the true distribution. I would expect $\beta^* = \theta$ since $X_1^2$ is an unbiased estimator for $\sigma_1^2$.

The problem is quickly reduced to minimizing

$$E\left[ \frac{1}{2} \log \beta + \frac{X_2^2}{2 \beta X_1^2} \right] = \frac{1}{2} \log \beta + \theta \sigma_1^2 E\left[\frac{1}{2 \beta \sigma_1^2 Y_1^2}\right]$$

where $Y_1 = X_1 / \sigma_1$ is distributed as a standard normal random variable (the independence of $X_1$ and $X_2$ has been used here). The expectation of an inverse chi-squared random variable is $\frac{1}{\nu -2}$ (where $\nu$ is the number of degrees of freedom; in our case $\nu = 1$), so we obtain

$$ \frac{1}{2} \log \beta - \frac{\theta}{2 \beta} $$

Taking the derivative with respect to $\beta$ and setting it equal to zero, one obtains

$$\frac{1}{\beta} + \frac{\theta}{\beta^2} = 0 \implies \beta^* = - \theta$$

Unfortunately, this makes no sense. Have I made a sign mistake, or a deeper mistake in my understanding of the Kullback-Leibler criterion? A good writeup on the Kullback-Leibler information criterion is available in *The Quasi-Maximum Likelihood Method: Theory* by Chung-Ming Kuan.
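As a concrete sanity check, one can minimize the empirical analogue of the criterion: for a fixed sample, setting the derivative of $\frac{1}{2}\log\beta + \frac{1}{n}\sum_i \frac{X_{2,i}^2}{2\beta X_{1,i}^2}$ to zero gives the minimizer $\hat\beta = \frac{1}{n}\sum_i X_{2,i}^2/X_{1,i}^2$. A short simulation (the values $\sigma_1 = 1$, $\theta = 2$ are assumed purely for illustration) shows this sample mean is wildly unstable:

```python
import numpy as np

# Sketch: the empirical KL criterion (1/2) log(beta) + mean(X2^2 / (2 beta X1^2))
# is minimized, for fixed data, at beta_hat = mean(X2^2 / X1^2).
# sigma1 and theta below are illustrative choices, not values from the problem.

rng = np.random.default_rng(0)
sigma1, theta = 1.0, 2.0

def beta_hat(n):
    x1 = rng.normal(0.0, sigma1, n)
    x2 = rng.normal(0.0, np.sqrt(theta) * sigma1, n)
    return np.mean(x2**2 / x1**2)  # minimizer of the empirical criterion

estimates = [beta_hat(100_000) for _ in range(5)]
print(estimates)  # values vary wildly across repetitions; no convergence to theta
```

Rather than stabilizing near $\theta = 2$, the repeated estimates jump over several orders of magnitude, which already hints that something is wrong with the expectation being taken.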


2 Answers


The mistake is that $E\left\{\frac{1}{X_1^2}\right\}$ does not exist. Please refer to this answer for details.
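A quick simulation makes the nonexistence visible (this is an illustrative sketch, not part of the original answer): the running sample mean of $1/X_1^2$, with $X_1$ standard normal, never settles down, because an inverse chi-squared variable with $\nu = 1$ has no mean.

```python
import numpy as np

# Running sample mean of 1/X1^2 for X1 ~ N(0, 1).
# If E[1/X1^2] existed, the running mean would converge by the law of
# large numbers; instead it drifts and jumps without stabilizing.

rng = np.random.default_rng(42)
x1 = rng.normal(size=1_000_000)
running_mean = np.cumsum(1.0 / x1**2) / np.arange(1, x1.size + 1)

for n in [10**k for k in range(2, 7)]:
    print(n, running_mean[n - 1])
# the printed values keep changing by orders of magnitude as n grows
```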


The mistake in your reasoning is that the model you have specified doesn't make sense. Your model is not a valid family of probability density functions, because it is not normalizable: under your model, the integral $$\int f(x_1,x_2\mid\beta)\,\mathrm d x_1 \,\mathrm d x_2$$ does not converge.

This is also the reason why the term $$\mathbf{E}\left[\frac{X_2^2}{X_1^2}\right]$$ appears in the KL divergence: it does not exist, since $X_2/X_1$ is a (scaled) Cauchy random variable, and a Cauchy distribution has no second moment.
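This can be checked numerically (a hedged sketch of my own, with $\sigma_1 = 1$, $\theta = 2$ assumed for illustration): the ratio $X_2/X_1$ behaves like $\sqrt{\theta}$ times a standard Cauchy, so about half the draws of $|X_2/X_1|$ fall below $\sqrt{\theta}$, while the sample mean of $X_2^2/X_1^2$ blows up.

```python
import numpy as np

# X1 ~ N(0, sigma1^2), X2 ~ N(0, theta * sigma1^2) independent,
# so X2 / X1 is sqrt(theta) times a standard Cauchy variable.
# sigma1 and theta are illustrative choices, not values from the problem.

rng = np.random.default_rng(1)
sigma1, theta = 1.0, 2.0
x1 = rng.normal(0.0, sigma1, 200_000)
x2 = rng.normal(0.0, np.sqrt(theta) * sigma1, 200_000)
ratio = x2 / x1

# Cauchy sanity check: P(|C| <= 1) = 1/2 for a standard Cauchy, so about
# half of the scaled ratios should land inside [-sqrt(theta), sqrt(theta)].
inside = np.mean(np.abs(ratio) <= np.sqrt(theta))
print(inside)             # close to 0.5

print(np.mean(ratio**2))  # "sample mean" of X2^2/X1^2: huge and unstable
```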