Assume our estimator has the form:
$$\hat{Y} = aX+b$$
Find the coefficients $a$ and $b$ that minimize the mean squared error of the estimate $\hat{Y}$, that is, minimize $\text{MSE}(\hat{Y})$:
$$e = \text{MSE}(\hat{Y}) = E\Big[(Y-\hat{Y})^2\Big]$$
OK, so I gave it a try:
$$\frac{de}{da} = \frac{d}{da} E\Big[(Y-\hat{Y})^2\Big] = 0\tag{1}$$
$$\frac{de}{db} = \frac{d}{db} E\Big[(Y-\hat{Y})^2\Big] = 0\tag{2}$$
Solving for (1):
$$\text{Let}~U = Y - aX -b$$
$$E\Big[\frac{d}{da} U^2\Big] = 0$$
Applying chain rule:
$$E[2U\frac{dU}{da}]=0$$
$$E[U\frac{dU}{da}]=0$$
$$E[(Y-aX-b)(-X)] = 0$$
$$E[X^2]a + E[X]b = E[XY]\tag{3}$$
Solving for (2):
$$\text{Let}~U = Y - aX -b$$
$$E\Big[\frac{d}{db} U^2\Big] = 0$$
Applying chain rule:
$$E[2U\frac{dU}{db}]=0$$
$$E[U\frac{dU}{db}]=0$$
$$E[(Y-aX-b)(-1)]=0$$
$$E[Y-aX-b]=0$$
$$E[X]a+b=E[Y]\tag{4}$$
(3) and (4) give me a system of equations that I can solve for $a$ and $b$:
$$E[X^2]a + E[X]b = E[XY]\tag{3}$$
$$E[X]a+b=E[Y]\tag{4}$$
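As a quick sanity check that (3) and (4) are independent in practice, here is a minimal NumPy sketch that solves the 2x2 system directly. The synthetic data (true slope 3, intercept -1) and all variable names are assumptions made for illustration, and sample means stand in for the expectations.

```python
import numpy as np

# Hypothetical synthetic data: Y = 3X - 1 + noise
rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.5, size=10_000)
y = 3.0 * x - 1.0 + rng.normal(scale=0.5, size=x.size)

# Sample moments standing in for E[X], E[Y], E[X^2], E[XY]
Ex, Ey = x.mean(), y.mean()
Exx, Exy = (x * x).mean(), (x * y).mean()

# (3): E[X^2] a + E[X] b = E[XY]
# (4): E[X]   a +      b = E[Y]
A = np.array([[Exx, Ex],
              [Ex, 1.0]])
rhs = np.array([Exy, Ey])

# The determinant is E[X^2] - (E[X])^2, the variance of X,
# so the system has a unique solution whenever X is not constant.
det = np.linalg.det(A)
a, b = np.linalg.solve(A, rhs)

print(f"det = {det:.4f}, a = {a:.4f}, b = {b:.4f}")
```

The determinant being the variance of $X$ suggests the system is only degenerate when $X$ is constant.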
Rearrange (4):
$$b=E[Y]-E[X]a\tag{5}$$
Substitute (5) into (3):
$$E[X^2]a + E[X](E[Y]-E[X]a) = E[XY]$$
$$E[X^2]a + E[X]E[Y]-E[X]E[X]a = E[XY]$$
$$E[X^2]a + E[XY]-E[X^2]a = E[XY]$$
$$0 = 0$$
Now I know something is wrong because the book says the answer is:
$$a=\frac{E[XY] - E[X]E[Y]}{E[X^2]-(E[X])^2}$$
$$b=E[Y] - a E[X]$$
I'm wondering why my system of equations appears to be linearly dependent, and why I can't solve it to get the book's answer.
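Your equations (3) and (4) are correct. The error is in the substitution step, where $E[X]E[Y]$ was replaced by $E[XY]$ and $E[X]E[X]$ by $E[X^2]$; these are not equal in general. Solve (4) for $b$: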
$$b = E[Y] - a E[X]\tag{5}$$
Substitute (5) into (3):
$$E[X^2]a + E[X]\Big[E[Y]-aE[X]\Big] = E[XY]$$
$$E[X^2]a + E[X]E[Y] - a E[X]E[X] = E[XY]$$
$$E[X^2]a + E[X]E[Y] - a (E[X])^2 = E[XY]$$
$$a\Big[E[X^2]- (E[X])^2\Big] = E[XY] - E[X]E[Y]$$
$$a = \frac{E[XY] - E[X]E[Y]}{\Big[E[X^2]- (E[X])^2\Big]}$$
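The closed form above can be checked numerically. The sketch below is only an illustration under assumed synthetic data: the helper name `linear_mmse_coeffs` is hypothetical, sample means replace the expectations, and the result is compared against NumPy's least-squares line fit `np.polyfit`.

```python
import numpy as np

def linear_mmse_coeffs(x, y):
    """Return (a, b) from the moment formulas, using sample means."""
    Ex, Ey = x.mean(), y.mean()
    # a = (E[XY] - E[X]E[Y]) / (E[X^2] - (E[X])^2)
    a = ((x * y).mean() - Ex * Ey) / ((x * x).mean() - Ex**2)
    # b = E[Y] - a E[X], i.e. equation (5)
    b = Ey - a * Ex
    return a, b

# Hypothetical synthetic data: Y = 0.5 X + 2 + noise
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 4.0, size=5_000)
y = 0.5 * x + 2.0 + rng.normal(scale=0.3, size=x.size)

a, b = linear_mmse_coeffs(x, y)
slope, intercept = np.polyfit(x, y, 1)  # ordinary least-squares line

# E[XY] and E[X]E[Y] differ unless X and Y are uncorrelated
gap = (x * y).mean() - x.mean() * y.mean()

print(f"moment formulas: a = {a:.4f}, b = {b:.4f}")
print(f"np.polyfit:      a = {slope:.4f}, b = {intercept:.4f}")
print(f"E[XY] - E[X]E[Y] = {gap:.4f}")
```

The nonzero gap between $E[XY]$ and $E[X]E[Y]$ is the covariance of $X$ and $Y$, which is exactly the numerator of $a$; treating the two as equal is what collapses the system to $0 = 0$.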