Root-mean-square error (RMSE) is frequently used to measure the error between a predicted value and the actual value. The formula for RMSE is given below:
$\mathrm{RMSE} = \sqrt{\frac{\sum_{t=1}^{n}{(y_t - \hat{y}_t)^2}}{n}}$
My question is: why do we raise the absolute error to the second power (and then take the square root of the whole thing), rather than to some other power (e.g., 3 or 4)? Is it just a convention, or is there a mathematical explanation for it? Thanks.
You can use other powers; see the power mean: http://en.wikipedia.org/wiki/Power_mean (I know, Wikipedia, but this article looks pretty good). Two was chosen because it has nice properties (e.g. the standard deviation is the RMS of the deviations from the mean, and that worked out well).
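To make the "other powers" point concrete, here is a minimal sketch that applies a power mean with exponent p to the absolute errors. The function name `power_mean_error` and the sample data are made up for illustration; with p = 2 it reduces to the RMSE formula from the question, and with p = 1 it is the mean absolute error.

```python
import math


def power_mean_error(actual, predicted, p=2.0):
    """Power mean (exponent p > 0) of the absolute errors |y_t - y_hat_t|.

    Hypothetical helper for illustration: p=1 gives the mean absolute
    error, p=2 gives the RMSE.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if p <= 0:
        raise ValueError("this sketch only handles p > 0")
    abs_errors = [abs(y - y_hat) for y, y_hat in zip(actual, predicted)]
    # Raise each absolute error to the p-th power, average, take the p-th root.
    return (sum(e ** p for e in abs_errors) / len(abs_errors)) ** (1.0 / p)


# Made-up example values; the absolute errors are 1, 0, 2 and 3.
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 4.0]

mae = power_mean_error(actual, predicted, p=1)
rmse = power_mean_error(actual, predicted, p=2)
p4 = power_mean_error(actual, predicted, p=4)
print(mae, rmse, p4)
```

On the same errors, a larger p gives a value that is at least as large, because higher powers put more weight on the biggest errors.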