How do I calculate the prediction interval for a data set in Python?

I have a data set taken from real measurements that I have modeled with a simple univariate linear regression in Python using scipy.stats.linregress, so the model is of the form $$y=\beta_0+\beta_1x$$ My goal is to predict the response $\hat{y}$ for a new measured value $\hat{x}$. Besides plugging $\hat{x}$ into my model to get a prediction, I would like to include the prediction interval to indicate the uncertainty in that prediction.

I have found an equation for the prediction interval, but I’m not sure I’m implementing it correctly: the value I get seems larger than I would expect from my understanding of the prediction interval. In particular, the t-statistic confuses me.
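
For reference, the standard textbook prediction interval for a simple linear regression, which appears to be what the code below is attempting, is usually written as follows. This assumes independent, normally distributed errors with constant variance; the symbols $s^2$, $\bar{x}$, $n$ and $\alpha$ are introduced here for illustration:

$$\hat{y} \pm t_{1-\alpha/2,\,n-2}\,\sqrt{s^2\left(1+\frac{1}{n}+\frac{(\hat{x}-\bar{x})^2}{\sum_{i=1}^{n}(x_i-\bar{x})^2}\right)},\qquad s^2=\frac{1}{n-2}\sum_{i=1}^{n}\left(y_i-\beta_0-\beta_1x_i\right)^2$$

Here $\hat{y}=\beta_0+\beta_1\hat{x}$ is the point prediction, $\bar{x}$ is the mean of the measured $x_i$, and $t_{1-\alpha/2,\,n-2}$ is the critical value of Student's t distribution with $n-2$ degrees of freedom for a confidence level of $1-\alpha$. It is looked up from the distribution rather than computed from $\hat{x}$.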

If there is a package that can do the calculation for me, please point me in the right direction; I didn’t find anything in the libraries I’m familiar with, though I am still a Python novice.

Code:

import scipy.stats as sps
import numpy as np

x = np.array([0.506, 0.55, 0.479, 0.637, 0.558, 0.685, 0.508, 0.573, 0.612, 0.263, 0.366, 0.437,
 0.668, 0.506, 0.42, 0.341, 0.35, 0.544, 0.528, 0.513, 0.515, 0.399, 0.585, 0.499,
 0.488, 0.415, 0.512, 0.514, 0.468, 0.464, 0.35, 0.516, 0.459, 0.443, 0.497, 0.506,
 0.525, 0.408, 0.509, 0.285, 0.436, 0.509, 0.489])
y = np.array([134., 134.8, 128.8, 148.9, 140.7, 155.1, 132.5, 141.3, 146.5, 90.9, 117.7, 128.1,
 152.6, 134.5, 119.6, 101.1, 108.9, 137.7, 134., 130.6, 130.9, 115.4, 140.9, 132.6,
 129.1, 117.8, 137.4, 134.7, 130., 128.2, 116.3, 134.3, 127.1, 124.6, 129.2, 133.6,
 136.9, 115.9, 136.2,  98.6, 123.5, 129., 130.6])

model = sps.linregress(x,y)

new_x = 0.55

y_predictions = model[0] * x + model[1]
mse = np.mean((y - y_predictions)**2)
tss = ((x - np.mean(x))**2).sum()
x_mean = np.mean(x)
n = len(x)
t = (x_mean - new_x) / (np.std(x) / np.sqrt(n))

prediction_interval = t * np.sqrt(mse * (1 + 1.0 / n + (new_x - x_mean)**2 / tss))

print(prediction_interval) # = -14.1659827752476
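
For comparison, here is a minimal sketch of the textbook calculation, assuming a two-sided interval and normally distributed residuals. The function name `prediction_interval` and its `alpha` parameter are illustrative, not part of scipy. It differs from the code above in two ways: the t value is a critical value taken from `scipy.stats.t.ppf` with `n - 2` degrees of freedom, and the residual variance is divided by `n - 2` rather than `n`.

```python
import numpy as np
import scipy.stats as sps


def prediction_interval(x, y, new_x, alpha=0.05):
    """Return (y_hat, lower, upper) for a new observation at new_x.

    Hypothetical helper: assumes a simple linear model with independent,
    normally distributed errors and a two-sided (1 - alpha) interval.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)

    model = sps.linregress(x, y)
    y_hat = model.intercept + model.slope * new_x

    # Residual variance with n - 2 degrees of freedom (two fitted parameters).
    residuals = y - (model.intercept + model.slope * x)
    s2 = np.sum(residuals ** 2) / (n - 2)

    # Sum of squared deviations of x about its mean.
    x_mean = np.mean(x)
    sxx = np.sum((x - x_mean) ** 2)

    # Critical value of Student's t; it depends only on alpha and n,
    # not on the new x value.
    t_crit = sps.t.ppf(1 - alpha / 2, df=n - 2)

    half_width = t_crit * np.sqrt(s2 * (1 + 1 / n + (new_x - x_mean) ** 2 / sxx))
    return y_hat, y_hat - half_width, y_hat + half_width
```

With the arrays from the question, `y_hat, lower, upper = prediction_interval(x, y, 0.55)` would give the point prediction together with the bounds of a 95% interval. Because the critical value is positive, the half-width is always positive, and it grows as `new_x` moves away from the mean of `x`.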