Why are cross expectations zero in MLE?


Let $z_i = (y_i, x_i')'$ be an i.i.d. sample of $N$ observations and let $z_i$ have density of the form: $$f(z_i \mid \theta_0) = f_1(y_i \mid x_i, \theta_0) f_2(x_i \mid \theta_0)$$ Consider the joint MLE estimator: $$\widehat{\theta}_J = \operatorname*{arg\,max}_{\theta \in \Theta} \frac{1}{N} \sum_{i=1}^{N} \ln(f_1(y_i \mid x_i, \theta) f_2(x_i \mid \theta))$$ Now consider the information matrix $I_J$, which is equal to: $$I_J = -\mathbb{E}\left[\frac{\partial^2 \ln\left(f_1(y_i \mid x_i, \theta_0)f_2(x_i \mid \theta_0) \right)}{\partial \theta \, \partial \theta'} \right] \\ = \mathbb{E}\left[\frac{\partial \ln\left(f_1(y_i \mid x_i, \theta_0)f_2(x_i \mid \theta_0)\right) }{\partial \theta} \frac{\partial \ln\left(f_1(y_i \mid x_i, \theta_0)f_2(x_i \mid \theta_0)\right)}{\partial \theta'} \right] $$ One can further decompose the previous expression as: $$I_J = \mathbb{E}\left[\left(\frac{\partial \ln\left(f_1(y_i \mid x_i, \theta_0)\right)}{\partial \theta} +\frac{\partial \ln\left(f_2(x_i \mid \theta_0)\right)}{\partial \theta}\right) \left(\frac{\partial \ln\left(f_1(y_i \mid x_i, \theta_0)\right)}{\partial \theta'} +\frac{\partial \ln\left(f_2(x_i \mid \theta_0)\right)}{\partial \theta'}\right)\right] \\ = \mathbb{E}\left[\frac{\partial \ln\left(f_1(y_i \mid x_i, \theta_0)\right)}{\partial \theta} \frac{\partial \ln\left(f_1(y_i \mid x_i, \theta_0)\right)}{\partial \theta'} \right] + \mathbb{E}\left[\frac{\partial \ln\left(f_1(y_i \mid x_i, \theta_0)\right)}{\partial \theta} \frac{\partial \ln\left(f_2(x_i \mid \theta_0)\right)}{\partial \theta'}\right] \\ {} + \mathbb{E}\left[\frac{\partial \ln\left(f_2(x_i \mid \theta_0)\right)}{\partial \theta} \frac{\partial \ln\left(f_1(y_i \mid x_i, \theta_0)\right)}{\partial \theta'}\right] + \mathbb{E}\left[\frac{\partial \ln\left(f_2(x_i \mid \theta_0)\right)}{\partial \theta} \frac{\partial \ln\left(f_2(x_i \mid \theta_0)\right)}{\partial \theta'} \right] $$ I am told that the cross-product expectations are
equal to zero, that is, $$\mathbb{E}\left[\frac{\partial \ln\left(f_1(y_i \mid x_i, \theta_0)\right)}{\partial \theta} \frac{\partial \ln\left(f_2(x_i \mid \theta_0)\right)}{\partial \theta'}\right]= \mathbb{E}\left[\frac{\partial \ln\left(f_2(x_i \mid \theta_0)\right)}{\partial \theta} \frac{\partial \ln\left(f_1(y_i \mid x_i, \theta_0)\right)}{\partial \theta'}\right]= 0 $$ Why is this the case?


Best answer

The argument below relies on some regularity conditions on the score function of $f_1(y \mid x, \theta)$; these are stated in the edit at the end.

We want to show that: \begin{equation} \mathbb{E}\left[\frac{\partial \ln\left(f_1(y_i \mid x_i, \theta_0)\right)}{\partial \theta} \frac{\partial \ln\left(f_2(x_i \mid \theta_0)\right)}{\partial \theta'}\right]= \mathbb{E}\left[\frac{\partial \ln\left(f_2(x_i \mid \theta_0)\right)}{\partial \theta} \frac{\partial \ln\left(f_1(y_i \mid x_i, \theta_0)\right)}{\partial \theta'}\right]= 0 \end{equation}

Written out in terms of the densities, the two cross terms are:

\begin{equation} \mathbb{E}\left[\frac{1}{f_1(y \mid x,\theta)}\frac{\partial}{\partial \theta}f_1(y \mid x,\theta)\,\frac{1}{f_2(x \mid \theta)}\frac{\partial}{\partial \theta'}f_2(x \mid \theta)\right] \hspace{2mm} \& \hspace{2mm} \mathbb{E}\left[\frac{1}{f_2(x \mid \theta)}\frac{\partial}{\partial \theta}f_2(x \mid \theta)\,\frac{1}{f_1(y \mid x,\theta)}\frac{\partial}{\partial \theta'}f_1(y \mid x,\theta)\right] \end{equation}

with everything evaluated at $\theta = \theta_0$.

Taking the expectation of a function $h(x,y)$ means evaluating $\int_{\mathcal{X} \times \mathcal{Y}} h(x,y) f_1(y \mid x,\theta) f_2(x \mid \theta) \, dy \, dx$. The $f_2(x \mid \theta)$ supplied by the expectation cancels against the one in the denominator of the score of $f_2$, leaving:

\begin{equation} \int_{\mathcal{X}}\int_{\mathcal{Y}}\frac{\frac{\partial}{\partial \theta}f_1(y \mid x,\theta)}{f_1(y \mid x,\theta)}\frac{\partial}{\partial \theta'}f_2(x \mid \theta)\, f_1(y \mid x,\theta)\, dy\, dx \hspace{2mm} \& \hspace{2mm} \int_{\mathcal{X}}\int_{\mathcal{Y}}\frac{\frac{\partial}{\partial \theta'}f_1(y \mid x,\theta)}{f_1(y \mid x,\theta)}\frac{\partial}{\partial \theta}f_2(x \mid \theta)\, f_1(y \mid x,\theta)\, dy\, dx \end{equation} Taking first the integral with respect to $y$:

\begin{equation} \int_{\mathcal{X}}\frac{\partial}{\partial \theta'}f_2(x \mid \theta)\left[\int_{\mathcal{Y}}\frac{\partial \ln f_1(y \mid x,\theta)}{\partial \theta}\, f_1(y \mid x,\theta)\, dy\right] dx \hspace{2mm} \& \hspace{2mm} \int_{\mathcal{X}}\frac{\partial}{\partial \theta}f_2(x \mid \theta)\left[\int_{\mathcal{Y}}\frac{\partial \ln f_1(y \mid x,\theta)}{\partial \theta'}\, f_1(y \mid x,\theta)\, dy\right] dx \end{equation}

The inner integral over $\mathcal{Y}$ is the expectation of the score function of $f_1$ conditional on $x$, which, under the regularity conditions given in the edit below, equals zero. Hence both cross terms vanish.
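For reference, the same conclusion can be stated compactly via the law of iterated expectations (an equivalent restatement, not part of the step-by-step calculation above):

\begin{equation}
\mathbb{E}\left[\frac{\partial \ln f_1(y_i \mid x_i, \theta_0)}{\partial \theta}\,\frac{\partial \ln f_2(x_i \mid \theta_0)}{\partial \theta'}\right]
= \mathbb{E}\left[\mathbb{E}\left[\frac{\partial \ln f_1(y_i \mid x_i, \theta_0)}{\partial \theta}\,\middle|\, x_i\right]\frac{\partial \ln f_2(x_i \mid \theta_0)}{\partial \theta'}\right] = 0,
\end{equation}

since the conditional expectation of the score of $f_1$ given $x_i$ is zero.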

The same trick does not make the two own-product terms vanish: when you evaluate those derivatives you obtain the product of two score functions of the same density, so the inner integral is no longer of the form $\int \frac{\partial \ln f}{\partial \theta} f \, dy$.
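As a numerical sanity check (an added sketch, not from the original argument), consider a toy model in which both densities share a scalar parameter: $x_i \sim N(\theta_0, 1)$ and $y_i \mid x_i \sim N(\theta_0 x_i, 1)$. The scores at $\theta_0$ are $s_1 = x_i(y_i - \theta_0 x_i)$ for $f_1$ and $s_2 = x_i - \theta_0$ for $f_2$, and the sample cross-moment of the two scores should be near zero:

```python
import numpy as np

rng = np.random.default_rng(0)
theta0 = 0.5
N = 200_000

# Toy model: x ~ N(theta0, 1), y | x ~ N(theta0 * x, 1)
x = rng.normal(theta0, 1.0, N)
y = rng.normal(theta0 * x, 1.0)

# Scores evaluated at the true parameter theta0
s1 = x * (y - theta0 * x)   # d/dtheta ln f1(y | x, theta) at theta0
s2 = x - theta0             # d/dtheta ln f2(x | theta) at theta0

# Monte Carlo estimate of the cross expectation E[s1 * s2];
# should be 0 up to O(1/sqrt(N)) sampling noise
cross = np.mean(s1 * s2)
print(cross)
```

By contrast, `np.mean(s1 * s1)` and `np.mean(s2 * s2)` are bounded away from zero, matching the remark above about the own-product terms.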

Edit: regularity conditions for the expected score to equal 0.

  1. For all $\theta \in \Theta$, where $\Theta$ denotes the parameter space, the density is integrable.
  2. The score exists almost surely, i.e., the log-likelihood is almost surely differentiable in $\theta$.
  3. There is an integrable function $g(y)$ such that, for all $\theta$ in a neighborhood of $\theta_0$, $\left|\frac{\partial}{\partial \theta} f_1(y \mid x, \theta)\right| \leq g(y)$ almost surely; this domination is what justifies differentiating under the integral sign.
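A quick numerical illustration of the expected-score identity (an added sketch, not from the original answer): for $f(y \mid \theta) = N(\theta, 1)$ the score is $y - \theta$, and $\int \frac{\partial \ln f}{\partial \theta} f \, dy$ evaluates to zero. A simple Riemann sum on a wide grid confirms this:

```python
import numpy as np

theta0 = 0.3
y = np.linspace(-12.0, 12.0, 100_001)   # grid wide enough that the tails are negligible
dy = y[1] - y[0]

f = np.exp(-(y - theta0) ** 2 / 2) / np.sqrt(2 * np.pi)  # N(theta0, 1) density
score = y - theta0                                        # d/dtheta ln f at theta0

total_mass = np.sum(f) * dy              # sanity check: density integrates to 1
expected_score = np.sum(score * f) * dy  # Riemann-sum approximation of E[score]
print(total_mass, expected_score)
```

The interchange of differentiation and integration that makes this identity exact is precisely what condition 3 guarantees.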

I was trying to give more explicit primitive conditions under which the third property is satisfied, but was having some trouble. Its role is to justify interchanging differentiation and integration (by dominated convergence), so that $\int \frac{\partial}{\partial \theta} f_1 \, dy = \frac{\partial}{\partial \theta} \int f_1 \, dy = \frac{\partial}{\partial \theta} 1 = 0$.