Let the observed random variables $X_1, X_2$ be independent and normally distributed with zero mean and variances $\sigma_1^2$ and $\theta \sigma_1^2$ respectively, where $\theta > 0$.
The model I am considering consists of the set of probability densities $$\left \{ \frac{1}{\sqrt{2 \pi X_1^2}} \exp \left( \frac{-1}{2 } \right) \frac{1}{\sqrt{2 \pi \beta X_1^2}} \exp \left( \frac{-X_2^2}{2 \beta X_1^2} \right) \Bigg| \beta > 0 \right \}$$
I want to minimize the Kullback-Leibler information criterion, which is given by
$$E\left[ \log \left( \frac{\frac{1}{\sqrt{2 \pi \sigma_1^2}} \exp \left( \frac{-X_1^2}{2 \sigma_1^2} \right)\frac{1}{\sqrt{2 \pi \theta \sigma_1^2}} \exp \left( \frac{-X_2^2}{2 \theta \sigma_1^2} \right)}{ \frac{1}{\sqrt{2 \pi X_1^2}} \exp \left( \frac{-1}{2 } \right) \frac{1}{\sqrt{2 \pi \beta X_1^2}} \exp \left( \frac{-X_2^2}{2 \beta X_1^2} \right)} \right) \right]$$
with respect to $\beta$. The resulting $\beta^*$ should be the value that minimizes the "distance" between the model and the true distribution. I would expect $\beta^* = \theta$ since $X_1^2$ is an unbiased estimator for $\sigma_1^2$.
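The unbiasedness claim above can be checked with a quick simulation (a minimal sketch; the values $\sigma_1 = 2$ and the sample size are arbitrary choices for illustration):

```python
import numpy as np

# Monte Carlo check that X_1^2 is an unbiased estimator of sigma_1^2.
# sigma1 = 2.0 is an arbitrary illustrative choice.
rng = np.random.default_rng(0)
sigma1 = 2.0
x1 = rng.normal(0.0, sigma1, size=1_000_000)

estimate = np.mean(x1 ** 2)  # Monte Carlo estimate of E[X_1^2]
print(estimate)              # should be close to sigma1**2 = 4.0
```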
The problem is quickly reduced to minimizing
$$E\left[ \frac{1}{2} \log \beta + \frac{X_2^2}{2 \beta X_1^2} \right] = \frac{1}{2} \log \beta + \theta \sigma_1^2 E\left[\frac{1}{2 \beta \sigma_1^2 Y_1^2}\right]$$
where $Y_1 = X_1 / \sigma_1$ is a standard normal random variable (independence has been used here). The expectation of an inverse chi-squared is derived here and equals $\frac{1}{\nu - 2}$ (where $\nu$ is the number of degrees of freedom; in our case $\nu = 1$), so we obtain
$$ \frac{1}{2} \log \beta - \frac{\theta}{2 \beta} $$
Taking the derivative with respect to $\beta$ and setting it equal to zero gives
$$\frac{1}{\beta} + \frac{\theta}{\beta^2} = 0 \implies \beta^* = - \theta$$
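As a sanity check on the inverse chi-squared expectation formula used above, here is a small Monte Carlo sketch at $\nu = 6$ (an arbitrary choice, for which $\frac{1}{\nu - 2} = \frac{1}{4}$):

```python
import numpy as np

# Monte Carlo check of E[1/chi2_nu] = 1/(nu - 2) at nu = 6,
# where the expectation should be 1/(6 - 2) = 0.25.
rng = np.random.default_rng(0)
nu = 6
samples = rng.chisquare(nu, size=1_000_000)

estimate = np.mean(1.0 / samples)
print(estimate)  # close to 0.25
```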
Unfortunately this makes no sense. Have I made a sign mistake, or a deeper mistake in my understanding of the Kullback-Leibler criterion? A good writeup on the Kullback-Leibler information criterion is available in *The Quasi-Maximum Likelihood Method: Theory* by Chung-Ming Kuan.
The mistake is that $E\left[\frac{1}{X_1^2}\right]$ does not exist: the formula $\frac{1}{\nu - 2}$ only holds for $\nu > 2$, and for $\nu = 1$ the expectation diverges (near zero the density of $Y_1^2$ behaves like $y^{-1/2}$, so $\int_0 y^{-1} \, y^{-1/2} \, dy = \infty$). Please refer to this answer for details.
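The non-existence can also be seen numerically: for standard normal $Y$, running sample means of $1/Y^2$ never settle down as the sample size grows (a minimal sketch):

```python
import numpy as np

# For Y ~ N(0, 1), 1/Y^2 is inverse chi-squared with nu = 1, whose mean
# does not exist: the running sample means below jump around at ever
# larger scales instead of converging.
rng = np.random.default_rng(0)
y = rng.normal(size=1_000_000)
inv_sq = 1.0 / y ** 2

for n in (10**3, 10**4, 10**5, 10**6):
    print(n, np.mean(inv_sq[:n]))  # no stabilization as n grows
```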