The problem: given the model $$Y_i=ib+ e_i \quad; \, i=1,2,3$$
where the errors $e_i$ have mean $0$ and variances $\sigma$, $2\sigma$, $3\sigma$ respectively, what is the best linear unbiased estimator (BLUE) for $b$?
The answer given in the book is $(y_1+y_2+y_3)/6$. I know it is unbiased, but is it correct?
The weighted OLS (WOLS) estimator is the BLUE by the Gauss–Markov theorem (in its generalized, Aitken form for non-constant variance); it is given by the normal equations, weighted by the inverse of the noise covariance matrix:
$\hat{b}=(X^T\Sigma^{-1}X)^{-1}X^T\Sigma^{-1}y$
Here we have
$X= \left[ {\begin{array}{cc} 1 & 1 \\ 1 & 2 \\ 1 & 3 \\ \end{array} } \right] $
, $y= \left[ {\begin{array}{c} y_1 \\ y_2 \\ y_3 \\ \end{array} } \right] $
and the noise covariance matrix
$\Sigma= \left[ {\begin{array}{ccc} \sigma & 0 & 0 \\ 0 & 2\sigma & 0 \\ 0 & 0 & 3\sigma \\ \end{array} } \right]$ (by the independent-noise assumption, all the off-diagonal entries are zero; the noise is heteroscedastic, so the diagonal entries differ)
so that
$X^T\Sigma^{-1}X= \left[ {\begin{array}{cc} \frac{11}{6\sigma} & \frac{3}{\sigma} \\ \frac{3}{\sigma} & \frac{6}{\sigma} \\ \end{array} } \right] $
$(X^T\Sigma^{-1}X)^{-1}= \left[ {\begin{array}{cc} 3\sigma & -\frac{3\sigma}{2} \\ -\frac{3\sigma}{2} & \frac{11\sigma}{12} \\ \end{array} } \right] $
so, we have
$(X^T\Sigma^{-1}X)^{-1}(X^T\Sigma^{-1}y)= \left[ {\begin{array}{cc} 3\sigma & -\frac{3\sigma}{2} \\ -\frac{3\sigma}{2} & \frac{11\sigma}{12} \\ \end{array} } \right] \left[ {\begin{array}{c} \frac{y_1}{\sigma}+\frac{y_2}{2\sigma}+\frac{y_3}{3\sigma} \\ \frac{y_1}{\sigma}+\frac{y_2}{\sigma}+\frac{y_3}{\sigma} \\ \end{array} } \right] = \left[ {\begin{array}{c} \frac{1}{2}(3y_1 -y_3) \\ \frac{1}{12}(-7y_1+ 2y_2 + 5y_3) \\ \end{array} } \right] $
so that $\hat{b}_0=\frac{1}{2}(3y_1 -y_3)$ (intercept) and $\hat{b}_1=\frac{1}{12}(-7y_1+ 2y_2 + 5y_3)$ (slope).
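To double-check the algebra, here is a minimal sympy sketch (my addition, not part of the book's solution; the symbol names `sigma`, `y1`..`y3` are mine):

```python
import sympy as sp

sigma = sp.symbols('sigma', positive=True)
y1, y2, y3 = sp.symbols('y1 y2 y3')

X = sp.Matrix([[1, 1], [1, 2], [1, 3]])   # design matrix with intercept column
y = sp.Matrix([y1, y2, y3])
Sigma = sp.diag(sigma, 2*sigma, 3*sigma)  # diagonal (independent) noise covariance

# WOLS/GLS estimator: (X^T Sigma^-1 X)^-1 X^T Sigma^-1 y
b_hat = sp.simplify((X.T * Sigma.inv() * X).inv() * (X.T * Sigma.inv() * y))
print(b_hat)  # Matrix([[3*y1/2 - y3/2], [-7*y1/12 + y2/6 + 5*y3/12]])
```

Note that $\sigma$ cancels out of the estimator, as it should: the BLUE only depends on the *relative* variances.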
The above derivation was for the WOLS coefficients in the general setting (with an intercept term). If we drop the intercept column from $X$, i.e., take $X=\left[ {\begin{array}{c} 1 \\ 2 \\ 3 \\ \end{array} } \right]$, we get
$\hat{b} = (X^T\Sigma^{-1}X)^{-1}(X^T\Sigma^{-1}y)=\frac{\sigma}{6}\cdot\frac{y_1+y_2+y_3}{\sigma}=\frac{y_1+y_2+y_3}{6}$
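Again a quick sympy check (same caveats as above), now with the single-column design:

```python
import sympy as sp

sigma = sp.symbols('sigma', positive=True)
y1, y2, y3 = sp.symbols('y1 y2 y3')

X = sp.Matrix([1, 2, 3])                  # no intercept: single regressor column
y = sp.Matrix([y1, y2, y3])
Sigma = sp.diag(sigma, 2*sigma, 3*sigma)

b_hat = sp.simplify((X.T * Sigma.inv() * X).inv() * (X.T * Sigma.inv() * y))
print(b_hat)                              # Matrix([[y1/6 + y2/6 + y3/6]])
```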
Alternatively, to fit WOLS without an intercept in 1D, we need to minimize the weighted SSE $E = \sum\limits_i e_i^2 = \sum\limits_i\frac{(y_i-\hat{y}_i)^2}{\sigma_i^2}=\sum\limits_i\frac{(y_i-ib)^2}{\sigma_i^2}$ w.r.t. $b$, where $\sigma_i^2 = i\sigma$ and $e_i$ denotes the standardized residual.
(Here the errors have zero mean, $\mathbb{E}[e_i] = 0$; normality of $\textbf{e}$ is not needed for the BLUE property, only the first two moments.)
Setting $\frac{\partial E}{\partial b} = 2\sum\limits_i\frac{(y_i-ib )(-i)}{\sigma_i^2}=0$,
we have $\hat{b}=\frac{\sum\limits_{i=1}^{3}\frac{iy_i}{\sigma_i^2}}{\sum\limits_{i=1}^{3}\frac{i^2}{\sigma_i^2}}=\frac{\frac{y_1}{\sigma}+\frac{2y_2}{2\sigma}+\frac{3y_3}{3\sigma}}{\frac{1^2}{\sigma}+\frac{2^2}{2\sigma}+\frac{3^2}{3\sigma}}=\frac{y_1+y_2+y_3}{6}$
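As a final sanity check (my addition): the estimator is unbiased, since $\mathbb{E}[y_i]=ib$, and its variance matches the GLS formula,
$$\mathbb{E}\left[\frac{y_1+y_2+y_3}{6}\right]=\frac{b+2b+3b}{6}=b, \qquad \operatorname{Var}\left(\frac{y_1+y_2+y_3}{6}\right)=\frac{\sigma+2\sigma+3\sigma}{36}=\frac{\sigma}{6}=(X^T\Sigma^{-1}X)^{-1},$$
so the book's answer $(y_1+y_2+y_3)/6$ is indeed the BLUE for this model.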