How to fit a cumulative time series?


Assume we have a time series of $N$ distinct points $(t_i,y_i)$, for $i \in \{1,\ldots,N\}$. The number $N$ is "small" (say $3<N<10$, just to give an idea), so we want to make good use of all the information in the data.

Define now a cumulative time series, namely

$$(t_1,y_1),(t_2,y_1+y_2),\ldots ,(t_N,y_1+y_2+\cdots +y_N).$$

I would like to estimate the slope of this "cumulated" series of data (and the uncertainty on this slope), which can be naively estimated as

$$\dfrac{y_2+\cdots +y_N}{t_N-t_1} \qquad \text{or}\qquad \dfrac{y_1+\cdots +y_{N-1}}{t_N-t_1} $$
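To make the two naive estimates concrete, here is a minimal Python sketch. The function name `naive_slopes` and the sample data are made up for illustration; only the two formulas come from the expressions above.

```python
from itertools import accumulate


def naive_slopes(t, y):
    """Return the two naive slope estimates of the cumulative series."""
    if len(t) != len(y) or len(t) < 2:
        raise ValueError("need at least two points and len(t) == len(y)")
    span = t[-1] - t[0]
    cumulative = list(accumulate(y))  # y1, y1+y2, ..., y1+...+yN
    # (y2 + ... + yN) / (tN - t1): rise between first and last cumulative point
    drop_first = (cumulative[-1] - cumulative[0]) / span
    # (y1 + ... + y_{N-1}) / (tN - t1): same span, but leaving out yN instead
    drop_last = cumulative[-2] / span
    return drop_first, drop_last


# Hypothetical data, for illustration only.
t = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [2.0, 1.0, 3.0, 2.0, 4.0]
first, last = naive_slopes(t, y)
print(first, last)
```

The two numbers differ by $(y_N - y_1)/(t_N - t_1)$, which is exactly the ambiguity I am asking about.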

However, apart from not knowing which of the two options is the better estimator of the slope (they both look legit), we also have no idea of the associated uncertainty. One option is to run a linear regression on the points of the cumulative series. But that way, the information provided by $y_1$ (or $y_N$) is lost!
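For reference, the plain regression on the cumulative points could be sketched as follows. This is only the textbook ordinary least squares slope, written out by hand; the helper name `ols_slope` and the data are hypothetical. I deliberately do not compute the usual standard error here, because its formula assumes independent errors of equal variance, which may not hold for cumulative data.

```python
from itertools import accumulate


def ols_slope(t, s):
    """Ordinary least squares slope of s against t (no weighting)."""
    if len(t) != len(s) or len(t) < 2:
        raise ValueError("need at least two points and len(t) == len(s)")
    n = len(t)
    t_mean = sum(t) / n
    s_mean = sum(s) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (si - s_mean) for ti, si in zip(t, s))
    return sxy / sxx


# Hypothetical data, for illustration only.
t = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [2.0, 1.0, 3.0, 2.0, 4.0]
cumulative = list(accumulate(y))  # the "cumulated" series
slope = ols_slope(t, cumulative)
print(slope)
```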

What is the best (or standard) way to perform a linear regression in this case (or to estimate the slope of the cumulated data set)? Moreover, is it even possible to use the usual least squares method? (I guess not, since the cumulated data are not homoscedastic.)
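To make that guess concrete, here is a short sketch under an assumption that is not part of the problem statement: suppose each $y_i$ carries an independent error with the same variance $\sigma^2$. Writing $S_k = y_1 + \cdots + y_k$ for the cumulative values, one gets

$$\operatorname{Var}(S_k) = k\,\sigma^2, \qquad \operatorname{Cov}(S_j,S_k) = \min(j,k)\,\sigma^2 .$$

So, under this assumption, the errors of the cumulative points grow along the series and are correlated with each other, which breaks both the equal-variance and the independence assumptions behind the usual least squares error estimates.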

Answer to the question: This question is closely related to "How to infer the average speed of a frog?". After some research, I concluded that a satisfying answer can be found in "Fitting a Straight Line to Certain Types of Cumulative Data" (1957), "Parameter Estimation with Cumulative Errors" (1974), and the methods described in "Statistical estimates of the pulsar glitch activity" (2021).

On BEST ANSWER

This really should be a comment, but I don't have enough reputation. Try Holt's method (double exponential smoothing).

The formulation looks like it will fit your problem domain. In the absence of a trend, even simple exponential smoothing will work pretty well. A lot of statistical packages have built-in methods for this.
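A rough sketch of what Holt's method does, written by hand rather than taken from any package. Everything here is an assumption for illustration: the function name `holt_linear`, the smoothing constants `alpha` and `beta`, the simple initialization, and above all equally spaced time points (the trend is "per step", not per unit of $t$).

```python
def holt_linear(values, alpha=0.5, beta=0.5):
    """Holt's linear (double exponential) smoothing.

    Returns the final (level, trend). Assumes equally spaced observations;
    the trend is the smoothed change per step.
    """
    if len(values) < 2:
        raise ValueError("need at least two observations")
    # Simple initialization (an assumption): first value and first difference.
    level = values[0]
    trend = values[1] - values[0]
    for x in values[1:]:
        previous_level = level
        # Blend the new observation with the one-step-ahead prediction.
        level = alpha * x + (1 - alpha) * (previous_level + trend)
        # Blend the newly observed change in level with the old trend.
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return level, trend


# Hypothetical cumulative series, for illustration only.
cumulative = [2.0, 3.0, 6.0, 8.0, 12.0]
level, trend = holt_linear(cumulative)
print(level, trend)
```

The final `trend` plays the role of a smoothed slope estimate. With so few points it depends noticeably on `alpha`, `beta` and the initialization, and this sketch gives no uncertainty on it.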