Intuitively, if we know part of the true parameter values, we should obtain a more efficient estimator (the restricted MLE, "REML" below) than the unrestricted MLE. I am having trouble proving mathematically that the difference of the two asymptotic variance matrices is positive semidefinite. The setup is as follows:
Let $\bar{\ell}_{T}(\theta)=T^{-1} \sum_{t=1}^{T} \ell_{t}(\theta)$ be the sample average of the log-likelihood. The MLE is $$ \hat{\theta} \equiv \underset{\theta \in \Theta}{\operatorname{argmax}} \bar{\ell}_{T}(\theta), $$ while the restricted MLE is $$ \hat{\theta}_{R} \equiv \underset{\theta \in \Theta:\, \theta_{2}=\theta_{02}}{\operatorname{argmax}} \bar{\ell}_{T}(\theta), $$ where the true parameter is $\theta_{0}=\left(\theta_{01}, \theta_{02}\right)$, $\theta \in \mathbb{R}^{k}$, and $\theta_{02}$ is a known $m$-dimensional subvector.
Now I would like to show that REML is asymptotically efficient relative to MLE. That is, supposing $\sqrt{T}\left(\hat{\theta}_{R}-\theta_{0}\right) \stackrel{d}{\rightarrow} N\left(0, V_{R}\right)$ and $\sqrt{T}\left(\hat{\theta}-\theta_{0}\right) \stackrel{d}{\rightarrow} N(0, V)$, I need to show that $V-V_{R}$ is positive semidefinite.
The difference will involve the two Fisher information matrices, but how do I show it is positive semidefinite?
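One route that seems promising, assuming the usual regularity conditions so that $V=\mathcal{I}^{-1}$ for the Fisher information $\mathcal{I}=\mathcal{I}(\theta_0)$, partitioned conformably with $\theta=(\theta_1,\theta_2)$:
$$
\mathcal{I}=\begin{pmatrix}\mathcal{I}_{11} & \mathcal{I}_{12}\\ \mathcal{I}_{21} & \mathcal{I}_{22}\end{pmatrix},
\qquad
V=\mathcal{I}^{-1},
\qquad
V_{R}=\begin{pmatrix}\mathcal{I}_{11}^{-1} & 0\\ 0 & 0\end{pmatrix},
$$
since the restricted estimator fixes $\theta_2=\theta_{02}$ exactly (so that block of $V_R$ is zero). With the selection matrix $R=(I_{k-m},\,0)^{\top}$ one can write $V_R=R\left(R^{\top}\mathcal{I}R\right)^{-1}R^{\top}=\mathcal{I}^{-1/2}\,P\,\mathcal{I}^{-1/2}$, where $P=\mathcal{I}^{1/2}R\left(R^{\top}\mathcal{I}R\right)^{-1}R^{\top}\mathcal{I}^{1/2}$ is an orthogonal projection; then $P\preceq I_k$ gives $V_R\preceq \mathcal{I}^{-1}=V$, i.e. $V-V_R$ is positive semidefinite. Is this argument valid, in particular the claimed block form of $V_R$?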
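As a numerical sanity check of the claim (a sketch only: a random positive definite matrix stands in for the Fisher information, with hypothetical dimensions $k=4$, $m=2$):

```python
import numpy as np

rng = np.random.default_rng(0)
k, m = 4, 2          # total parameters, known (restricted) parameters
p = k - m            # free parameters under the restriction

# Random symmetric positive definite matrix playing the role of I(theta_0)
A = rng.standard_normal((k, k))
I_fisher = A @ A.T + k * np.eye(k)

# Unrestricted MLE: V = I^{-1}
V = np.linalg.inv(I_fisher)

# Restricted MLE: theta_2 = theta_02 is fixed, so only the free block varies;
# its asymptotic variance is I_11^{-1}, and the known block has variance 0.
I11 = I_fisher[:p, :p]
V_R = np.zeros((k, k))
V_R[:p, :p] = np.linalg.inv(I11)

# V - V_R should be positive semidefinite: all eigenvalues >= 0 up to rounding
eigvals = np.linalg.eigvalsh(V - V_R)
print(eigvals.min())
```

In repeated draws the smallest eigenvalue of $V-V_R$ stays at zero up to floating-point error, consistent with positive semidefiniteness (the difference has rank $m$, so exact zeros appear).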