What is the right way to compare the accuracy of statistical models that predict a scalar?


Suppose you are comparing the performance of two machine learning models trained to regress a scalar quantity from some input data. You might compute the mean squared error (MSE), $E[(\hat{x} - x)^2]$, of each model over the same validation dataset. Most likely the two MSEs will be different. How can you tell whether the difference is statistically significant?
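To make the setup concrete, here is a minimal sketch of one possible approach, not a definitive answer: a paired bootstrap on the per-example difference in squared error. It assumes both models are scored on the same validation examples and that those examples are independent. The function name `paired_bootstrap_mse_diff` and its parameters are hypothetical, chosen only for illustration.

```python
import numpy as np

def paired_bootstrap_mse_diff(y_true, pred_a, pred_b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap interval for MSE(A) - MSE(B) on a shared validation set.

    Assumption: validation examples are independent and both models were
    evaluated on exactly the same examples, in the same order.
    """
    y_true = np.asarray(y_true, dtype=float)
    pred_a = np.asarray(pred_a, dtype=float)
    pred_b = np.asarray(pred_b, dtype=float)

    # Per-example difference in squared error; its mean is MSE(A) - MSE(B).
    d = (pred_a - y_true) ** 2 - (pred_b - y_true) ** 2

    rng = np.random.default_rng(seed)
    n = d.size
    # Resample examples with replacement, keeping each (A, B) pair together.
    idx = rng.integers(0, n, size=(n_boot, n))
    boot_means = d[idx].mean(axis=1)

    lo, hi = np.quantile(boot_means, [alpha / 2, 1 - alpha / 2])
    return d.mean(), lo, hi


# Synthetic usage example: model A is noisier than model B by construction.
rng = np.random.default_rng(42)
x = rng.normal(size=500)
pred_a = x + rng.normal(scale=1.0, size=500)
pred_b = x + rng.normal(scale=0.5, size=500)
diff, lo, hi = paired_bootstrap_mse_diff(x, pred_a, pred_b)
print(f"MSE(A) - MSE(B) = {diff:.3f}, 95% interval = [{lo:.3f}, {hi:.3f}]")
```

If the interval excludes zero, the observed difference is unlikely to be explained by sampling noise in the validation set alone. Pairing matters here: because both models see the same examples, their errors are typically correlated, and treating the two MSEs as independent samples would usually overstate the uncertainty.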