Let $T(X)$ be an efficient estimator of the parameter $\theta$, i.e., its variance attains the Cramér-Rao bound. A theorem in my course states that $T(X)$ is then the unique maximum likelihood estimator of $\theta$.
The proof states that
$$ \frac{\partial}{\partial \theta}\ln p(x,\theta) = I_{X}(\theta)\big[T(X) - \theta \big], $$
and the partial derivative is $0$ if $T(X) = \theta$. But where does this identity come from?
Proving that an efficient estimator is the maximum likelihood estimator
Asked by Bumbble Comm (https://math.techqa.club/user/bumbble-comm/detail)
Assuming the distribution of $X$ belongs to a one-parameter exponential family satisfying all the regularity conditions of the Cramér-Rao inequality, the condition you have stated is the equality condition of the CR inequality, namely
$$\frac{\partial}{\partial\theta}\ln f_{\theta}(x)=k(\theta)\,(T(x)-g(\theta)),\tag{1}$$
where $f_{\theta}$ is the pdf/pmf of $X$, $T(x)$ is an estimator of the parametric function $g(\theta)$, and $k$ is some non-zero function of $\theta$. That is to say, the variance of $T$ attains the Cramér-Rao lower bound for $g(\theta)$.
In the context of maximum likelihood estimation, $\ln f_{\theta}(x)$ is just the log-likelihood given the data $x$.
So, roughly speaking, when you set $\frac{\partial}{\partial\theta}\ln f_{\theta}(x)=0$ to find the MLE of $g(\theta)$, condition $(1)$ implies that the MLE must be $\widehat{g(\theta)}=T(x)$.
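To make this concrete, here is a worked example (my addition, using the standard normal-location model, which is not discussed in the original answer): let $X_1,\dots,X_n$ be i.i.d. $N(\theta,\sigma^2)$ with $\sigma^2$ known. The log-likelihood is $\ln f_{\theta}(x)=-\frac{n}{2}\ln(2\pi\sigma^2)-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\theta)^2$, so
$$\frac{\partial}{\partial\theta}\ln f_{\theta}(x)=\frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i-\theta)=\frac{n}{\sigma^2}\,(\bar{x}-\theta),$$
which is exactly form $(1)$ with $k(\theta)=n/\sigma^2=I_X(\theta)$, $T(x)=\bar{x}$, and $g(\theta)=\theta$. Setting the score to zero gives the MLE $\hat\theta=\bar{x}$, whose variance $\sigma^2/n$ attains the Cramér-Rao bound.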
The short (by reference) answer is that the Schwarz inequality is used to prove the Cramér-Rao bound, and this condition is required for the "equals" part of the Schwarz inequality to hold (and thus guarantee that the estimator is efficient). See Van Trees, Part I, Section 2.4, pp. 66-67 for a detailed development: this is Equation (187). I was able to find the text online here.
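As a quick numerical illustration (my addition, not from Van Trees): in the normal-location model with known variance, the sample mean is both the MLE and the efficient estimator, so its variance should match the Cramér-Rao bound $\sigma^2/n$. A short simulation checks this:

```python
# Sanity check (assumed model, not from the original answer):
# X_1, ..., X_n i.i.d. N(theta, sigma^2) with sigma known.
# The sample mean is the efficient estimator, so its variance
# should match the Cramer-Rao bound sigma^2 / n.
import numpy as np

rng = np.random.default_rng(0)
theta, sigma, n, reps = 2.0, 3.0, 50, 200_000

samples = rng.normal(theta, sigma, size=(reps, n))
mle = samples.mean(axis=1)        # MLE = sample mean = efficient estimator

empirical_var = mle.var()
cr_bound = sigma**2 / n           # 1 / I_X(theta) for this model

print(empirical_var, cr_bound)    # the two numbers should be close
```

With 200,000 replications, the empirical variance of $\bar{x}$ agrees with $\sigma^2/n = 0.18$ to within sampling noise.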