Making a prediction on next value from a previous set

I'm looking for some guidance on how to approach this problem mathematically. I'm not even sure what to Google to find the answer.

I have a set of values from the past 30 days - [2,5,8,5,9,15,20,12,etc]. I would like to extrapolate the next several values that might occur in the sequence. At its simplest, taking an average of the 30 entries and having that as the prediction could work, but I'd like to notice growth/change and extend accordingly, to get a prediction of value in 2 days or 10 days. I'm not really looking for pattern matching, just a trend line extrapolation.

I thought I could graph the data, draw a line of best fit, and simply extend it. The issue is that I can't find much reference to the math one would use to calculate it. Does one have to create a graph and read the figures off the line of best fit, or could one write a function in, say, PHP to calculate the next few predicted values from the trend line? Basically, I'm looking for an extended trend line, like Excel can do.

Many many thanks

Sam

There are 3 best solutions below

0
On

I guess you are looking for regression. With regression, you fit a line through your data, which you can use to predict future values of your series. The simplest form of regression is linear regression, which tries to fit a straight line through your data points, to represent a trend.
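As a rough illustration of the straight-line fit described above, here is a minimal sketch in plain Python (the same arithmetic carries over to PHP). It assumes the values are equally spaced, one per day, and indexed as t = 0, 1, ..., n-1. The function names `fit_line` and `extrapolate` are made up for this example, and `history` holds only the eight values actually listed in the question.

```python
def fit_line(values):
    """Least-squares line y = intercept + slope * t, with t = 0, 1, ..., n-1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    # slope = covariance(t, y) / variance(t)
    s_ty = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    s_tt = sum((t - t_mean) ** 2 for t in range(n))
    slope = s_ty / s_tt
    intercept = y_mean - slope * t_mean
    return intercept, slope


def extrapolate(values, days_ahead):
    """Value of the fitted line `days_ahead` days after the last observation."""
    intercept, slope = fit_line(values)
    last_t = len(values) - 1
    return intercept + slope * (last_t + days_ahead)


history = [2, 5, 8, 5, 9, 15, 20, 12]  # the values shown in the question
print(extrapolate(history, 2))   # trend-line value 2 days out
print(extrapolate(history, 10))  # trend-line value 10 days out
```

This is essentially what a linear trend line in a spreadsheet does: it fits the intercept and slope once, then evaluates the line at later time indices.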

You can find more literature on this by searching for time series analysis.

0
On

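Let's say that, for some reason, you decided to fit $x_t = \beta_0 + \beta_1 x_{t-1}+\epsilon_t$. To perform the least-squares (LS) optimization manually, you have to construct the response vector $$ y=(x_t, x_{t-1},\dots,x_{t-28})' $$
and the design matrix $$ X = \begin{pmatrix} 1 & x_{t-1}\\ 1 & x_{t-2}\\ \vdots & \vdots \\ 1 & x_{t-29} \end{pmatrix}. $$ With 30 observations $x_{t-29},\dots,x_t$, both have 29 rows, because the oldest value has no predecessor. Hence your model can be written as $y=X\beta+\epsilon$ with $\beta=(\beta_0,\beta_1)'$. The OLS estimator of the coefficients is then $$ \hat{\beta} = (X'X)^{-1}X'y, $$ and the one-step-ahead prediction is $\hat{x}_{t+1} = \hat{\beta}_0 + \hat{\beta}_1 x_t$.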
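The matrix formula above can be written out by hand, since $X'X$ is only a 2x2 matrix here. The following is a minimal sketch in plain Python, not a reference implementation: the function names `fit_ar1` and `forecast_ar1` are invented for this example, and forecasts beyond one step are obtained by feeding each prediction back in as the next lagged value, which is an assumption about how you would want to extend the model.

```python
def fit_ar1(series):
    """OLS estimates (b0, b1) for x_t = b0 + b1 * x_{t-1} + e_t."""
    if len(series) < 3:
        raise ValueError("need at least three observations")
    x_prev = series[:-1]  # second column of X (lagged values)
    y = series[1:]        # response vector
    n = len(y)
    # Entries of X'X = [[n, s_x], [s_x, s_xx]] and X'y = [s_y, s_xy]
    s_x = sum(x_prev)
    s_xx = sum(v * v for v in x_prev)
    s_y = sum(y)
    s_xy = sum(a * b for a, b in zip(x_prev, y))
    det = n * s_xx - s_x ** 2
    if det == 0:
        raise ValueError("lagged values are all identical, so X'X is singular")
    # (X'X)^{-1} X'y written out for the 2x2 case
    b0 = (s_xx * s_y - s_x * s_xy) / det
    b1 = (n * s_xy - s_x * s_y) / det
    return b0, b1


def forecast_ar1(series, steps):
    """Iterate the fitted recursion `steps` times past the last observation."""
    b0, b1 = fit_ar1(series)
    predictions = []
    last = series[-1]
    for _ in range(steps):
        last = b0 + b1 * last
        predictions.append(last)
    return predictions


history = [2, 5, 8, 5, 9, 15, 20, 12]  # the values shown in the question
print(forecast_ar1(history, 3))
```

Note that this model regresses each value on the previous value rather than on time, so it does not draw a straight trend line through the data. Its multi-step forecasts follow the fitted recursion instead.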

0
On

I believe what you are trying to do, in general, is to come up with a 'best fit' model for your data and use it to make predictions. Speaking generally, this is far from easy. Without any additional knowledge about the process that generates the data, you have no real ability to predict what it will do in the future. You have to make some assumptions, and the question is: which assumptions make sense in the case you are interested in? This comes down to using what you know (or suspect, if knowledge is scant) about the process that is generating the data. Maybe it makes sense that the data would roughly follow a linear trend (in which case linear regression would make sense), but maybe not. It is often very helpful to plot the data you have so that you can see any patterns in it, which in turn may inform your choice of model to fit to the data.

For the present case, unless we can see your data (a plot would probably suffice) or you share what you know about the process responsible for it, we cannot really offer any specific guidance beyond the above.
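One rough way to check which assumption suits a particular series is to hold back the last few observations, fit each candidate on the rest, and see which one predicts the held-back values better. The sketch below compares the two candidates mentioned in the question, a plain average and a linear trend. It is only an illustration: the function names are invented, mean absolute error is one arbitrary choice of score, and a few held-out points are weak evidence at best.

```python
def mean_forecast(train, horizon):
    """Predict the training average for every future step."""
    avg = sum(train) / len(train)
    return [avg] * horizon


def trend_forecast(train, horizon):
    """Extend a least-squares line fitted on t = 0, 1, ..., n-1."""
    n = len(train)
    t_mean = (n - 1) / 2
    y_mean = sum(train) / n
    s_ty = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(train))
    s_tt = sum((t - t_mean) ** 2 for t in range(n))
    slope = s_ty / s_tt
    intercept = y_mean - slope * t_mean
    return [intercept + slope * (n + h) for h in range(horizon)]


def mean_abs_error(predicted, actual):
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def compare_on_holdout(values, holdout):
    """Fit on all but the last `holdout` points, then score both forecasts."""
    if holdout < 1 or len(values) - holdout < 2:
        raise ValueError("need at least two training points and one held-out point")
    train, test = values[:-holdout], values[-holdout:]
    return {
        "mean": mean_abs_error(mean_forecast(train, holdout), test),
        "trend": mean_abs_error(trend_forecast(train, holdout), test),
    }


history = [2, 5, 8, 5, 9, 15, 20, 12]  # the values shown in the question
print(compare_on_holdout(history, 2))
```

A lower error for one candidate suggests, but does not prove, that its assumption is the better description of the data.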