How to calculate MSE?


I've got to solve the following problem:

Bob fitted a linear regression and figured out that his predicted value is 0.5 more than the actual one for 400 points of the test data set and 0.7 less than the actual one for 100 points of the test data set. Thus, there are 500 observations in total. Calculate Bob's MSE.

At the same time, Anna claims that Bob's model is wrong. She thinks that the quality of the model can be improved by shifting all the predicted values by some constant. Calculate Anna's MSE, assuming she found the lowest MSE possible under this constraint.

So I decided to calculate MSE using this formula: $$ \text{MSE} = \frac{1}{N} \sum (y_i-\hat{y}_i)^2 $$
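That formula is straightforward to check numerically. As a quick sketch (not part of the original question), here is a minimal MSE helper in Python, assuming NumPy is available:

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error: the average of the squared residuals y - y_hat."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2)

# Toy check: residuals of -1 and +1 give MSE = 1.
print(mse([0.0, 0.0], [1.0, -1.0]))  # 1.0
```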

As a result, in case of Bob, I got \begin{align} & \frac{1}{400} \sum_1^{400} (y_i - (y_i + 0.5))^2 + \frac{1}{100} \sum_1^{100} (y_i - (y_i - 0.7))^2 \\[8pt] = {} & \frac{1}{400} \cdot 400 \cdot (-0.5)^2 + \frac{1}{100} \cdot 100 \cdot (0.7)^2 = 0.74 \end{align}

In case of Anna, I thought it should be $$ \frac{1}{400} \sum_1^{400} (y_i - (y_i + 0.5 + a))^2 + \frac{1}{100} \sum_1^{100} (y_i - (y_i - 0.7 + a))^2 .$$ I took the derivative and got $ a = 0.1 $, which makes $ \text{MSE} = 0.72 $.

However, I was told that the solution is incorrect. I can't seem to figure out where I went wrong. I would really appreciate it if someone could help me with that!


There are 2 solutions below.


You shouldn't be scaling the grouped terms individually. The MSE formula sums all the squared residuals and divides the total by the full sample size $N = 500$, not each group by its own size.

For Bob: $$ \frac{1}{500} \left( \sum_1^{400} (y_i - (y_i + 0.5))^2 + \sum_1^{100} (y_i - (y_i - 0.7))^2 \right) = \frac{1}{500} \left( 400(-0.5)^2 + 100(0.7)^2 \right) = 0.298$$
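A quick numerical sanity check (a sketch in Python with NumPy; the actual $y_i$ values cancel out of the residuals, so only the residuals themselves are needed):

```python
import numpy as np

# Residuals y - y_hat: -0.5 for 400 points, +0.7 for 100 points.
residuals = np.concatenate([np.full(400, -0.5), np.full(100, 0.7)])
bob_mse = np.mean(residuals ** 2)
print(bob_mse)  # ≈ 0.298, i.e. 149/500
```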

For Anna: Using the same approach as you, first verify that $\frac{d^2}{da^2}MSE(a) = 2 > 0$, so finding the critical point is indeed the way to find the minimum. With the correct $\frac{1}{500}$ normalization, $$MSE(a) = \frac{400(0.5+a)^2 + 100(0.7-a)^2}{500} = a^2 + 0.52a + 0.298,$$ so setting $\frac{d}{da} MSE(a) = 2(a + 0.26) = 0$ gives $a = -0.26$. Anna's MSE then becomes $0.298 - 0.26^2 = 0.2304$.
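This matches a well-known fact: shifting all predictions by a constant $c$ turns each residual $r$ into $r - c$, and the MSE is minimized at $c = \bar r$, leaving the variance of the residuals. A hedged numerical check in Python (NumPy assumed, zeros as placeholder $y$ values):

```python
import numpy as np

# Bob's residuals: -0.5 for 400 points, +0.7 for 100 points.
residuals = np.concatenate([np.full(400, -0.5), np.full(100, 0.7)])

# Optimal constant shift is the mean residual; the remaining MSE is the variance.
c_opt = residuals.mean()
anna_mse = np.mean((residuals - c_opt) ** 2)
print(c_opt, anna_mse)  # ≈ -0.26, ≈ 0.2304
```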


In simple linear regression, in which one fits only a slope and an intercept, the sum of the $N$ squared residuals is often divided by $N-2$ rather than by $N$ to obtain what is then called the mean squared error. That choice of divisor changes the numerical answer to the first question, but not the bottom line of the second.

$$ \text{MSE} = \frac{400(-0.5)^2 + 100(0.7)^2} {500-2} = \frac{149}{498} $$

It should be noted that ordinary least squares would not actually produce the residuals reported here, since in ordinary least squares (with an intercept) the residuals sum to zero.

Now look at what Anna does:

\begin{align} & \text{sum of squares of residuals} \\[8pt] = {} & 400(a-0.5)^2 + 100(a+0.7)^2 \\[8pt] = {} & 500a^2 - 260a + 149 \\[8pt] = {} & 500\left( a^2 - 2\cdot\frac{13}{50} a + \left( \frac{13}{50} \right)^2 \right) + 149 - 500\left( \frac{13}{50} \right)^2 \\[10pt] = {} & 500\left( a - \frac{13}{50} \right)^2 + 149 - 33.8 \end{align} This is minimized by $a = \dfrac{13}{50} = 0.26,$ making the first term $0$ and leaving the rest, $149 - 33.8 = 115.2,$ as the new sum of squares of residuals, which is smaller than $149.$
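The vertex found by completing the square can be confirmed with a brute-force grid search, sketched here in Python with NumPy (not part of the original answer):

```python
import numpy as np

# Evaluate S(a) = 400(a-0.5)^2 + 100(a+0.7)^2 on a fine grid
# and locate its minimum numerically.
a = np.linspace(-1.0, 1.0, 20001)  # step 1e-4, so a = 0.26 lies on the grid
S = 400 * (a - 0.5) ** 2 + 100 * (a + 0.7) ** 2
i = np.argmin(S)
print(a[i], S[i])  # ≈ 0.26, ≈ 115.2
```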