In the book ESL (The Elements of Statistical Learning), the author introduces the EPE (Expected Prediction Error) and the MSE (Mean Squared Error). I know that the EPE is defined as:
$$EPE(f)=E(Y-f(X))^2$$
which is the expected value taken over all the different training data sets.
But what about the MSE? The author defines the MSE as:
$$MSE(x_0) = E_T\big[(f(x_0)-\hat{y}_0)^2\big]$$
and I really don't see the difference. The question is: basically speaking, what's the difference between EPE and MSE?
The first one is calculating the error of the prediction function over all possible values of $(X,Y)$, with the commonly used loss function $L(Y,f(X))=(Y-f(X))^2$. So,
$$EPE(f)=E[L(Y,f(X))]=E[(Y-f(X))^2]$$ This loss function is chosen by convention, and we could choose a different one if it suited our needs. For example, we could choose the absolute-error loss $L(Y,f(X))=|Y-f(X)|$, and then $$EPE(f)=E[L(Y,f(X))]=E[|Y-f(X)|]$$
So, in some way, the term "expected prediction error" depends on the loss function.
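To make this concrete, here is a small Monte Carlo sketch (not from the book) that approximates the EPE under both loss functions. The data-generating model $Y = \sin(X) + \varepsilon$ with $\varepsilon \sim N(0, 0.5^2)$ is an assumption chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data-generating process for illustration: Y = f(X) + noise,
# where f is the (normally unknown) true regression function.
def f(x):
    return np.sin(x)

n = 1_000_000                       # large sample to approximate the expectation
x = rng.uniform(-np.pi, np.pi, n)   # X ~ Uniform(-pi, pi)
y = f(x) + rng.normal(0.0, 0.5, n)  # Y = f(X) + N(0, 0.5^2)

# EPE under squared-error loss: E[(Y - f(X))^2]
epe_squared = np.mean((y - f(x)) ** 2)

# EPE under absolute-error loss: E[|Y - f(X)|]
epe_absolute = np.mean(np.abs(y - f(x)))

print(epe_squared)   # ~ 0.25 (the noise variance)
print(epe_absolute)  # ~ 0.5 * sqrt(2/pi) ≈ 0.399
```

The two numbers differ because they answer different loss-weighted versions of the same question, which is exactly the sense in which EPE depends on the loss function.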
The second one is the expected squared error of the estimate $\hat{y}_0$ at a fixed input point $x_0$, where the expectation is taken over training sets $T$:
$$MSE(x_0) = E_T\big[(f(x_0)-\hat{y}_0)^2\big]$$
So now you have the expected squared error (averaged over all training sets) of the prediction at the point $x_0$.
The term "Mean Squared Error" is always the average of the sum of squared errors, independent of anything (as per definition of MSE).