Appending a regression equation, as more data becomes available.

33 Views Asked by Bumbble Comm At 29 Mar 2026 - 4:00

I am working on a project where I use multilinear regression on many large data sets. Each data point is formatted as (time, data). Let's say I calculate a regression polynomial based on the past month of data, then a week passes and I want to compute an updated regression polynomial. Can I use the past week's data and the old regression polynomial to come up with the updated polynomial? Would this be algorithmically faster than redoing the multilinear regression on the full month-plus-week of data?

Considering the amount of data I'm processing, any processor time I can save is essential.

For more background, I am planning on just using typical least-squares regression. I'm not sure whether this makes a difference.



Bumbble Comm On 02 Jul 2017 - 9:28 BEST ANSWER

Say you're using the OLS estimator $\hat\beta=(X^T X)^{-1}X^T y$, where $X\in\mathbb R^{n\times p}$. There are two ways that updating for a new point is faster than fitting the entire thing again:

1. Keep the sufficient statistics $\hat\Sigma=X^T X$ and $\hat\gamma = X^T y$. For a new point $(x_{\text{new}}, y_{\text{new}})$, updating $\hat\gamma \leftarrow \hat\gamma + x_{\text{new}}\, y_{\text{new}}$ clearly takes $O(p)$ time. Updating $\hat\Sigma$ and $\hat\Sigma^{-1}$ is trickier: $\hat\Sigma \leftarrow \hat\Sigma + x_{\text{new}} x_{\text{new}}^T$ is a rank-one change, so the inverse can be updated with the Sherman-Morrison formula. Overall, each new point costs $O(p^2)$ time, compared to $O(np^2+p^3)$ for refitting from scratch.
2. Suppose you're fitting the OLS model with gradient descent or one of its variants. Then, assuming the model does not drift too much over time, you can warm start: use the last fitted coefficients as the initial point for the new fit. The optimizer then starts close to the new solution and typically needs fewer iterations.
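To make the first approach concrete, here is a minimal NumPy sketch, not a definitive implementation. The names `IncrementalOLS` and `sherman_morrison_update` are made up for illustration, and the synthetic quadratic data (400 "old" rows standing in for the month, 100 "new" rows for the week) is an assumption, not something from the question. It accumulates $X^T X$ and $X^T y$ batch by batch and checks that the result matches a full refit.

```python
import numpy as np


class IncrementalOLS:
    """OLS that keeps the running statistics X^T X and X^T y (hypothetical helper)."""

    def __init__(self, n_features):
        self.sigma = np.zeros((n_features, n_features))  # running X^T X
        self.gamma = np.zeros(n_features)                # running X^T y

    def update(self, X_new, y_new):
        """Fold a batch of k new rows into the statistics in O(k p^2) time."""
        X_new = np.atleast_2d(X_new)
        y_new = np.atleast_1d(y_new)
        self.sigma += X_new.T @ X_new
        self.gamma += X_new.T @ y_new

    def coefficients(self):
        """Solve Sigma beta = gamma in O(p^3) time, independent of n."""
        return np.linalg.solve(self.sigma, self.gamma)


def sherman_morrison_update(sigma_inv, x):
    """Return (Sigma + x x^T)^{-1} from Sigma^{-1} in O(p^2) time.

    Assumes Sigma is symmetric, which holds for Sigma = X^T X.
    """
    v = sigma_inv @ x
    return sigma_inv - np.outer(v, v) / (1.0 + x @ v)


# Synthetic example (assumed data): quadratic trend in time plus small noise.
rng = np.random.default_rng(0)
t = rng.uniform(0, 1, size=500)
X = np.vander(t, 3, increasing=True)  # design matrix with columns 1, t, t^2
y = X @ np.array([1.0, -2.0, 0.5]) + 0.01 * rng.standard_normal(500)

model = IncrementalOLS(n_features=3)
model.update(X[:400], y[:400])  # the "past month"
model.update(X[400:], y[400:])  # the "new week", without touching old rows
beta_incremental = model.coefficients()

# Reference: refit on all 500 rows at once.
beta_full = np.linalg.lstsq(X, y, rcond=None)[0]
```

Here `beta_incremental` and `beta_full` should agree up to floating-point error. If $p$ is small, re-solving the $p \times p$ system after each batch is usually cheap enough that the Sherman-Morrison step is only worth it when you need updated coefficients after every single point. One caveat: working with $X^T X$ squares the condition number of $X$, so for high-degree polynomials it may help to rescale the time variable first.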
