Prediction Interval from Markov Chains


Thank you for taking the time to look at my question.

Short, less involved question: How do you find Prediction Intervals with non-Gaussian residuals?

Here is the situation: I have built a model that predicts a student's grade at the twelve-week mark using a discrete-time Markov chain. It uses a stochastic (transition) matrix estimated empirically from 80 observed transitions. It takes the dot product of the estimated probability vector, P[12], and the percentages associated with each grade to return a scalar percentage estimate.
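For concreteness, the setup can be sketched roughly as below. This is only a minimal illustration, not my actual code: the function names, the integer state encoding, the assumption of one transition per week, and the handling of states with no observed transitions are all hypothetical choices made for the example.

```python
def transition_matrix(transitions, n_states):
    """Estimate a row-stochastic matrix from observed (from_state, to_state) pairs."""
    counts = [[0] * n_states for _ in range(n_states)]
    for i, j in transitions:
        counts[i][j] += 1
    matrix = []
    for i, row in enumerate(counts):
        total = sum(row)
        if total == 0:
            # Assumption: a state never observed as a source is treated as absorbing.
            matrix.append([1.0 if j == i else 0.0 for j in range(n_states)])
        else:
            matrix.append([c / total for c in row])
    return matrix


def step(dist, matrix):
    """Advance a probability (row) vector by one transition: dist @ matrix."""
    n = len(dist)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def predict_percentage(initial, matrix, grade_percentages, n_steps=12):
    """Propagate the initial distribution n_steps times, then take the dot
    product with the percentage attached to each grade state."""
    dist = list(initial)
    for _ in range(n_steps):
        dist = step(dist, matrix)
    return sum(p * g for p, g in zip(dist, grade_percentages))
```

With two hypothetical grade states worth 90% and 50%, `predict_percentage([1.0, 0.0], matrix, [90.0, 50.0])` returns the expected percentage after twelve steps.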

When testing the model against held-out data, I found the residuals to be heavily skewed. In a histogram binned with the Freedman-Diaconis rule, they appeared to fit a triangular distribution much better than a normal distribution. I also verified this with Anderson-Darling and Kolmogorov-Smirnov tests.
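For reference, the Freedman-Diaconis rule sets the bin width to 2·IQR / n^(1/3). A small sketch of that calculation follows; the function names are made up for the example, and the quartile convention (the standard library's "inclusive" method) is an assumption, since other conventions give slightly different widths.

```python
import math
import statistics


def fd_bin_width(residuals):
    """Freedman-Diaconis bin width: 2 * IQR / n**(1/3)."""
    n = len(residuals)
    # Quartile convention is an assumption; "inclusive" interpolates within the sample.
    q1, _, q3 = statistics.quantiles(residuals, n=4, method="inclusive")
    return 2.0 * (q3 - q1) / n ** (1.0 / 3.0)


def fd_bin_count(residuals):
    """Number of histogram bins implied by the Freedman-Diaconis width."""
    width = fd_bin_width(residuals)
    if width == 0:
        return 1  # degenerate case: IQR is zero, so fall back to a single bin
    return math.ceil((max(residuals) - min(residuals)) / width)
```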

The only methods I have read about for constructing prediction intervals (which I planned to use to account for the prediction error) assume normally distributed residuals. What are alternative methods of constructing prediction intervals for non-Gaussian residuals? I am currently using:

Lower Bound: $$\text{Predicted Value} - \text{RMSD}\left| a + \sqrt{\frac{\alpha (b-a)(c-a)}{2}} \right|$$

Upper Bound: $$\text{Predicted Value} + \text{RMSD}\left| b + \sqrt{\frac{\alpha (b-a)(b-c)}{2}} \right|$$

The terms to the right of the +/- are the critical values for a triangular distribution at significance level $\alpha$, with min = a, max = b, mode = c, where $a \neq c$ and $b \neq c$. I'm sure there are better methods that account for the variance of the residuals and the degrees of freedom.
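As a cross-check on these critical values, here is a minimal sketch of the standard inverse CDF of a triangular distribution. It assumes residuals are defined as observed minus predicted and that a, b, c were fitted to the raw residuals, so the quantiles are added to the prediction directly rather than scaled by the RMSD as in my bounds above; the function names are hypothetical. At p = α/2 the lower branch reduces to the a + √(α(b−a)(c−a)/2) term in my lower bound, while the textbook upper-tail quantile at p = 1 − α/2 is b − √(α(b−a)(b−c)/2), provided α/2 lies below the CDF value at the mode and 1 − α/2 lies at or above it.

```python
import math


def triangular_quantile(p, a, b, c):
    """Inverse CDF of a triangular distribution with min a, max b, mode c."""
    if not (a < b and a <= c <= b):
        raise ValueError("require a < b and a <= c <= b")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    cdf_at_mode = (c - a) / (b - a)
    if p < cdf_at_mode:
        # Rising side of the density, left of the mode.
        return a + math.sqrt(p * (b - a) * (c - a))
    # Falling side of the density, right of the mode.
    return b - math.sqrt((1.0 - p) * (b - a) * (b - c))


def triangular_interval(predicted, a, b, c, alpha=0.05):
    """Equal-tailed (1 - alpha) interval: prediction plus residual quantiles.

    Assumes residual = observed - predicted, with a, b, c fitted to residuals.
    """
    lower = predicted + triangular_quantile(alpha / 2.0, a, b, c)
    upper = predicted + triangular_quantile(1.0 - alpha / 2.0, a, b, c)
    return lower, upper
```

For a skewed fit the interval is asymmetric around the prediction, which is the behaviour a symmetric normal-theory interval cannot capture.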

Also, if anyone can recommend statistics textbooks or articles that emphasize or teach inference without the normality assumption, I'd really appreciate that!