I fit a linear regression model y ~ x in R by running:
lr <- lm(y ~ x, data = train)
p <- predict(lr, newdata = test)
# prediction error on the test set (predicted minus observed)
error <- p - test$y
The test-set error does not have zero mean (in fact, mean(error) = -5), it is right-skewed, and it does not resemble any distribution I recognize.
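For anyone who wants to reproduce this kind of check, here is a minimal sketch of how the mean and skewness of the error could be summarized. The data are simulated stand-ins (my real train and test sets are not shown here), and the names dat, mean_error, and skew_error are hypothetical, introduced only for illustration.

```r
# Simulated stand-in data (an assumption; not the real train/test sets)
set.seed(1)
n <- 500
x <- runif(n, 0, 10)
# Skewed, zero-mean noise: rexp(rate = 0.5) has mean 2, so 2 - rexp(...) has mean 0
y <- 2 + 3 * x + (2 - rexp(n, rate = 0.5))
dat <- data.frame(x = x, y = y)
train <- dat[1:350, ]
test  <- dat[351:500, ]

lr <- lm(y ~ x, data = train)
p <- predict(lr, newdata = test)
error <- p - test$y

# Mean of the test-set error
mean_error <- mean(error)
# Moment-based sample skewness (third standardized moment)
skew_error <- mean((error - mean_error)^3) / mean((error - mean_error)^2)^1.5
print(c(mean = mean_error, skewness = skew_error))

# Visual checks: histogram and normal Q-Q plot of the error
hist(error, breaks = 30, main = "Test-set prediction error")
qqnorm(error)
qqline(error)
```

A positive skewness value corresponds to a right-skewed error, and the Q-Q plot shows how far the error departs from a normal distribution.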
Furthermore, I tried replacing lm with rlm from the MASS package, but the RMSE and MAE of the predictions do not improve.
What are some alternative approaches to explore in this situation? I suspect that the error term (\epsilon) is not independent of x, but I am not sure how to model that.
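To make the suspicion concrete, here is a rough sketch of one way the dependence of the error on x could be checked: regress the squared residuals on x and look at the slope. This is only an informal auxiliary regression in the spirit of a heteroscedasticity test, not a definitive diagnostic. The data are simulated with noise whose spread grows with x (an assumption made purely for illustration), and res, res2, aux, and aux_p are hypothetical names.

```r
# Simulated data where the noise standard deviation grows with x (assumed)
set.seed(2)
n <- 500
x <- runif(n, 1, 10)
y <- 2 + 3 * x + rnorm(n, sd = 0.5 * x)
train <- data.frame(x = x, y = y)

lr <- lm(y ~ x, data = train)
res <- residuals(lr)

# Auxiliary regression: do the squared residuals vary with x?
train$res2 <- res^2
aux <- lm(res2 ~ x, data = train)
aux_p <- summary(aux)$coefficients["x", "Pr(>|t|)"]
print(aux_p)

# Residuals against x: a funnel shape suggests the error spread depends on x
plot(train$x, res, xlab = "x", ylab = "residual")
abline(h = 0, lty = 2)
```

A small p-value for the slope in the auxiliary regression would suggest that the error variance changes with x; the residual plot gives the same information visually.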
Thanks