Suppose $X$ and $Y$ are random variables and let $x_1, \dots, x_n$ be observed values from a random sample of $X$. Assume that $Y_i = 2\alpha x_i + \alpha + \beta_i$, where $\alpha$ is unknown and $\beta_1, \dots, \beta_n$ are i.i.d. $N(0, \sigma^2)$ with $\sigma^2$ unknown. (Equivalently, assume that the conditional expectation of $Y$ depends linearly on $X$ and that the slope of the line is double the $y$-intercept.)
(i) Determine the maximum likelihood estimators for $\alpha$ and $\sigma^2$.
(ii) You take $3$ samples and observe $(x_1, y_1) = (1, 4)$, $(x_2, y_2) = (0, 1)$, and $(x_3, y_3) = (3, 6)$. Find the point estimate for $\alpha$ using the MLE you found in (i).
(iii) What are the residuals of this model and what do they measure?
Hint: The MLE of $\sigma^2$ is the average of the squares of the residuals.
My attempt:
(i) I used this likelihood function
$$L(\alpha, \sigma^2) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi \sigma^2}} \exp\left(\frac{-(y_i - 2\alpha x_i - \alpha)^2}{2\sigma^2}\right)$$
The negative of the natural log of this is
$$-\ln(L(\alpha, \sigma^2)) = \frac{n}{2}\ln(2\pi \sigma^2) + \sum_{i=1}^{n}\frac{(y_i - 2\alpha x_i - \alpha)^2}{2\sigma^2}$$
We let the latter half of the RHS be another function $G$ so that:
$$G(\alpha) = \sum_{i=1}^{n}\frac{(y_i - 2\alpha x_i - \alpha)^2}{2\sigma^2}$$
Minimizing this w.r.t. $\alpha$ (dropping the constant factor $\frac{1}{2\sigma^2}$, which does not affect where the derivative vanishes) gives
$$G' = \sum_{i=1}^{n} 2(y_i - 2 \alpha x_i - \alpha)(-2x_i - 1) = 0$$
And so the MLE for $\alpha$ is
$$\hat{\alpha} = \frac{\sum_{i = 1}^{n}(4y_ix_i + 2y_i)}{\sum_{i = 1}^{n}(8x_i^2 + 8x_i + 2)}$$
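(One intermediate step that may help: dividing $G' = 0$ by $-2$ gives
$$\sum_{i=1}^{n}(y_i - 2\alpha x_i - \alpha)(2x_i + 1) = 0 \iff \sum_{i=1}^{n}(2x_i+1)y_i = \alpha\sum_{i=1}^{n}(2x_i+1)^2,$$
and solving for $\alpha$ then multiplying numerator and denominator by $2$ yields the expression above.)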
We check the second partial derivative:
$$G'' = \sum_{i=1}^{n} (8x_i^2 + 8x_i + 2)$$
Since this is positive, $\hat{\alpha}$ is indeed a minimizer.
Now, minimizing $-\ln(L)$ w.r.t. $\sigma^2$ gives
$$\frac{\partial}{\partial \sigma^2}\left(-\ln(L(\alpha, \sigma^2))\right) = \frac{n}{2\sigma^2} - \frac{1}{2\sigma^4}\sum_{i=1}^{n}(y_i - 2\alpha x_i - \alpha)^2 = 0$$
And so the MLE for $\sigma^2$ is
$$\hat{\sigma}^2 = \frac{1}{n}\sum_{i = 1}^{n} (y_i - 2\hat{\alpha} x_i - \hat{\alpha})^2$$
We check the second partial:
$$\frac{\partial^2}{\partial (\sigma^2)^2}\left(-\ln(L(\alpha, \sigma^2))\right) = \frac{-n}{2\sigma^4} + \frac{1}{\sigma^6}\sum_{i=1}^{n}(y_i - 2\alpha x_i - \alpha)^2$$
This is greater than $0$ $\iff$ $\sum_{i=1}^{n}(y_i - 2\alpha x_i - \alpha)^2 > n \sigma^2$
However, I am not sure how to show that this is true.
(ii) The MLE from (i) was
$$\hat{\alpha} = \frac{\sum_{i = 1}^{n}(4y_ix_i + 2y_i)}{\sum_{i = 1}^{n}(8x_i^2 + 8x_i + 2)}$$
So
$$\hat{\alpha} = \frac{(4 \cdot 4 \cdot 1 + 2\cdot 4) + (4\cdot 1\cdot 0 + 2\cdot 1) + (4\cdot 6\cdot 3 + 2\cdot 6)}{(8 \cdot 1^2 + 8\cdot 1 + 2) + (8 \cdot 0^2 + 8 \cdot 0 + 2) + (8 \cdot 3^2 + 8\cdot 3 + 2) } = \frac{110}{118} = \frac{55}{59}$$
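As a sanity check, the arithmetic can be verified with a short script using exact rational arithmetic via Python's `fractions`; the data and formula are those from parts (i) and (ii):

```python
from fractions import Fraction

# Observed sample from part (ii)
xs = [1, 0, 3]
ys = [4, 1, 6]

# MLE from part (i): alpha_hat = sum(4*y*x + 2*y) / sum(8*x^2 + 8*x + 2)
num = sum(4 * y * x + 2 * y for x, y in zip(xs, ys))
den = sum(8 * x**2 + 8 * x + 2 for x in xs)
alpha_hat = Fraction(num, den)

print(num, den, alpha_hat)  # 110 118 55/59
```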
(iii) Going by the hint, the $i$th residual is $y_i - 2\hat{\alpha} x_i - \hat{\alpha}$. It is the difference between the observed value and the value fitted by the model, and the residuals are used to judge how well the proposed line (or curve) describes the data.
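To make this concrete with the data from part (ii), here is a short sketch that computes the residuals at the point estimate $\hat{\alpha} = 55/59$ and then $\hat{\sigma}^2$ via the hint (the average of the squared residuals):

```python
from fractions import Fraction

xs = [1, 0, 3]
ys = [4, 1, 6]
a = Fraction(55, 59)  # point estimate of alpha from part (ii)

# i-th residual: observed y_i minus fitted value 2*a*x_i + a
residuals = [y - (2 * a * x + a) for x, y in zip(xs, ys)]

# Hint: the MLE of sigma^2 is the average of the squared residuals
sigma2_hat = sum(r**2 for r in residuals) / len(residuals)

print(residuals)   # [Fraction(71, 59), Fraction(4, 59), Fraction(-31, 59)]
print(sigma2_hat)  # 34/59
```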
Any assistance especially with the last part of (i) is appreciated. Was my explanation about residuals correct?
(i) Plug in the MLE of $\sigma^2$
\begin{align} -\ln(L(\alpha, \sigma^2))'' &= \frac{-n}{2\sigma^4} + \frac{1}{\sigma^6}\sum_{i=1}^{n}(y_i - 2\alpha x_i - \alpha)^2\\ & = \frac{-n}{2\hat{\sigma}^4} + \frac{n}{\hat\sigma^6}\cdot\frac{1}{n}\sum_{i=1}^{n}(y_i - 2\hat\alpha x_i - \hat\alpha)^2\\ & = \frac{-n}{2\hat{\sigma}^4} + \frac{n}{\hat \sigma^4} \\ & = \frac{n}{2\hat \sigma^4} > 0. \end{align}
Generally, yes. The $i$th residual is, in some sense, an estimator of $\beta_i$, since $\beta_i = y_i - 2\alpha x_i - \alpha$; hence by replacing $\alpha$ with its MLE you obtain the MLE of $\beta_i$.