I'm reading 'Probability' by Shiryayev, in particular the section 'Estimating the probability of success in the Bernoulli scheme'. The author wants to determine whether the estimator of the probability $\theta$ of success in a single trial, defined by:
$$S_n/n=\frac{\text{number of successes in } n \text{ trials}}{n}$$
is efficient, in the sense that:
$$V_\theta (S_n/n)=\inf _{T_n}V_\theta (T_n)$$
where $V_\theta(\cdot)$ is the variance (which itself depends on $\theta$) and the infimum is taken over the class of (unbiased) estimators $T_n$.
In his calculations he writes:
$$p_\theta(\omega)= \theta^{\sum_{i=1}^n a_i}(1-\theta)^{n-\sum_{i=1}^na_i}=\prod_{i=1}^n\theta^{a_i}(1-\theta)^{1-a_i}$$
where $\omega=(a_1,a_2,...,a_n)$ (and $a_i\in\{0,1\}$) is an elementary event of the space $\Omega$. Then, he defines:
$$L_\theta(\omega)=\log p_\theta(\omega)$$
and calculates:
$$1=\mathbb{E}_\theta(1)=\sum_{\omega}p_\theta(\omega)$$ $$\theta=\mathbb{E}_\theta(S_n/n)=\sum_{\omega}p_\theta(\omega)S_n(\omega)/n$$ $$\Downarrow$$ $$0=\sum_{\omega}\frac{\partial p_\theta(\omega)}{\partial\theta} =\sum_{\omega}\frac{\frac{\partial p_\theta(\omega)}{\partial\theta} }{p_\theta(\omega)}p_\theta(\omega)=\mathbb{E}_\theta\left( \frac{\partial L_\theta(\omega)}{\partial\theta}\right) $$ $$1=\sum_{\omega}S_n(\omega)/n\frac{\partial}{\partial\theta}p_\theta(\omega)=\mathbb{E}_\theta\left( S_n/n\cdot\frac{\partial L_\theta(\omega)}{\partial\theta}\right)$$
From this, he concludes that:
$$1=\mathbb{E}_\theta\left( (S_n/n-\theta)\cdot\frac{\partial L_\theta(\omega)}{\partial\theta}\right)$$
but I can't see why: the only thing I can obtain from the two equations above, subtracting side by side, is
$$1-0=\mathbb{E}_\theta\left( S_n/n\cdot\frac{\partial L_\theta(\omega)}{\partial\theta}\right)-\mathbb{E}_\theta\left( \frac{\partial L_\theta(\omega)}{\partial\theta}\right)=\mathbb{E}_\theta\left( (S_n/n-1)\cdot\frac{\partial L_\theta(\omega)}{\partial\theta}\right)$$
Where is the problem here?
Thanks.
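P.S. For what it's worth, a brute-force numerical check (in Python; $n$ and $\theta$ chosen arbitrarily) does confirm the identity the author states, by enumerating all $2^n$ elementary events $\omega$, so the problem must be in my algebra:

```python
from itertools import product

# Brute-force check of the book's identity
#   E_theta[(S_n/n - theta) * dL_theta/dtheta] = 1
# by summing over all 2^n elementary events omega = (a_1, ..., a_n).
n = 5
theta = 0.3

total = 0.0
for omega in product((0, 1), repeat=n):
    s = sum(omega)                          # S_n(omega)
    p = theta**s * (1 - theta)**(n - s)     # p_theta(omega)
    # dL/dtheta = d/dtheta log p_theta(omega) = s/theta - (n - s)/(1 - theta)
    dL = s / theta - (n - s) / (1 - theta)
    total += (s / n - theta) * dL * p

print(total)  # numerically equal to 1
```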
If you subtract a zero anyway, why not take that zero in the form you need? Say, $$ 1-0=\mathbb{E}_\theta\left( S_n/n\cdot\frac{\partial L_\theta(\omega)}{\partial\theta}\right)-{\color{red}\theta}\cdot\mathbb{E}_\theta\left( \frac{\partial L_\theta(\omega)}{\partial\theta}\right)=\mathbb{E}_\theta\left( (S_n/n-{\color{red}\theta})\cdot\frac{\partial L_\theta(\omega)}{\partial\theta}\right) $$ By the way, the equality that you write at the end is right too; indeed, if we replace $1$ by any other constant, it still remains valid, precisely because $\mathbb{E}_\theta\left(\partial L_\theta(\omega)/\partial\theta\right)=0$. But $\theta$ is the most suitable constant, since we further intend to apply the Cauchy–Bunyakovsky–Schwarz inequality and obtain the variance of $S_n/n$ exactly as the second moment of $S_n/n-\theta$.
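To see numerically that the Cauchy–Bunyakovsky–Schwarz step then holds with *equality*, which is exactly why $S_n/n$ attains the lower bound, here is a small enumeration sketch (variable names and the values of $n$, $\theta$ are my own choices):

```python
from itertools import product

# Check that Cauchy-Bunyakovsky-Schwarz holds with equality here:
#   1 = E[(S_n/n - theta) dL/dtheta]^2 <= Var(S_n/n) * E[(dL/dtheta)^2],
# so the product of the two factors on the right equals 1 exactly.
n = 4
theta = 0.7

var_est = 0.0   # Var_theta(S_n/n) = second moment of S_n/n - theta
fisher = 0.0    # E_theta[(dL/dtheta)^2]
for omega in product((0, 1), repeat=n):
    s = sum(omega)
    p = theta**s * (1 - theta)**(n - s)
    dL = s / theta - (n - s) / (1 - theta)
    var_est += (s / n - theta) ** 2 * p
    fisher += dL**2 * p

print(var_est * fisher)                  # = 1: the bound is attained
print(var_est, theta * (1 - theta) / n)  # Var(S_n/n) = theta(1-theta)/n
```

The product being exactly $1$ shows $V_\theta(S_n/n)$ equals the infimum, i.e. $S_n/n$ is efficient.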