I'm writing code to calculate the following formula from http://people.duke.edu/~rnau/411l696.htm
2. Use lagged versions of the variables in the regression model.
This allows varying amounts of recent history to be brought into the forecast
Lagging of independent variables is often necessary in order for the regression model to be able to predict the future--i.e., to predict what will happen in period t based on knowledge of what happened up to period t-1
Example: instead of regressing Y on X, regress Y on LAG(X,1) and LAG(Y,1)
The regression equation is now Ŷ(t) = a + bX(t-1) + cY(t-1)
I know that the a is the y intercept, the b is the coefficient for the independent variable but I'm not sure what the c is and how to calculate it. Any thoughts?
In this model, $c$ is the coefficient on the lagged response $y_{t-1}$: it measures how much of last period's value of $Y$ carries over into the forecast for the current period. The simplest way to estimate $c$ (simultaneously with $a$ and $b$) is ordinary least squares (OLS). In practice, you create a design matrix $\mathbf{X}$ of size $(n-1) \times 3$ and a response vector $\mathbf{y} = (y_2, \dots, y_n)^T$. The length of $\mathbf{y}$ is $n-1$ because the first observation that has a lagged predecessor is $y_2$. The design matrix $\mathbf{X}$ contains three columns: the first is all $1$s for the intercept, the second holds the lagged $x$'s, and the third holds the lagged $y$'s. For example, the top row of $\mathbf{X}$ is $(1, x_{1}, y_{1})$, paired with the response $y_2$, and the bottom row is $(1, x_{n-1}, y_{n-1})$, paired with $y_n$. To find the OLS estimate of $\mathbf{w} = (a, b, c)^T$, just compute $$ \hat{\mathbf{w}} = \mathbf{(X^TX)^{-1}X^Ty} $$
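To make the construction concrete, here is a minimal sketch in Python with NumPy. The function name `fit_lagged_ols` and the synthetic data are assumptions made for illustration; they are not from the linked page. It also uses `np.linalg.lstsq` rather than forming $(\mathbf{X}^T\mathbf{X})^{-1}$ explicitly: that solves the same least-squares problem but tends to be better behaved numerically.

```python
import numpy as np

def fit_lagged_ols(x, y):
    """Estimate (a, b, c) in y[t] = a + b*x[t-1] + c*y[t-1] by OLS.

    Hypothetical helper for illustration only.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or x.shape != y.shape:
        raise ValueError("x and y must be 1-D arrays of equal length")
    if len(y) < 4:
        raise ValueError("need at least 4 observations to fit 3 coefficients")

    # Design matrix: a column of ones, lagged x, lagged y  -> shape (n-1, 3)
    X = np.column_stack([np.ones(len(y) - 1), x[:-1], y[:-1]])
    # Response vector starts at the second observation    -> length n-1
    target = y[1:]

    # Least-squares solution; equivalent to (X^T X)^{-1} X^T y when X has full rank
    w, *_ = np.linalg.lstsq(X, target, rcond=None)
    return w

# Synthetic, noise-free data generated from known coefficients (assumed values)
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 1.0 + 2.0 * x[t - 1] + 0.5 * y[t - 1]

a, b, c = fit_lagged_ols(x, y)
print(a, b, c)
```

Because the synthetic series has no noise term, the fit should recover the generating coefficients up to floating-point error; with real data the estimates would of course carry sampling error.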