What does it really mean to take correlation between time series?


I have a conceptual problem when we extend correlation to time series. I understand probability and statistics as a two-way route. Either I begin from a random variable (r.v.) $X$ and sample from its probability distribution, or I observe a sample of values $x_1,\dots,x_n$ drawn from that distribution and infer something about the distribution of the r.v. $X$.

When I have two r.v. $X$ and $Y$, I can define the correlation between them theoretically, in terms of expected values. If I have paired observations (samples) $x_1,\dots,x_n$ and $y_1,\dots,y_n$, I can estimate that correlation with the usual sample formula.
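For concreteness, the formulas I have in mind are the standard Pearson ones (this is my assumption about which definition is meant; $\mu$, $\sigma$ and $\bar{x}$, $\bar{y}$ denote the usual means, standard deviations and sample means):

$$\rho_{X,Y} = \frac{E\big[(X-\mu_X)(Y-\mu_Y)\big]}{\sigma_X\,\sigma_Y}, \qquad \hat{\rho} = \frac{\sum_{k=1}^{n}(x_k-\bar{x})(y_k-\bar{y})}{\sqrt{\sum_{k=1}^{n}(x_k-\bar{x})^2}\,\sqrt{\sum_{k=1}^{n}(y_k-\bar{y})^2}}$$

The estimator $\hat{\rho}$ makes sense to me here because the pairs $(x_k, y_k)$ are all draws from the same joint distribution of $(X, Y)$.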

When I have a time series, though, I have time-indexed observations $x_1,\dots,x_n$, which are not a sample, because at each time I observe a different r.v. A random process is a function $X(\omega, t)$, and for every state $\omega$ I have a whole trajectory. How am I supposed to conceptualize a correlation between two trajectories?

Here lies the problem: I get that autocorrelation can give me information about the dependence between each time point and the next (at lag one), but I don't really get how to interpret this in general, or even whether we should call it by the same name. I can't grasp the probabilistic meaning of taking the sample correlation of two time series when no single r.v. is being observed (unlike the usual correlation in a static context). Or does this only have probabilistic meaning under the assumption of strong stationarity, under which $X_t$ has the same distribution at every time point?

To illustrate a little further, I will reference my previous question along the same lines, about the meaning of generating two correlated Brownian paths: "Understanding correlation in the context of a time series, simulation and brownian motion". I also want to point to simulation, because it shows that only when I simulate multiple scenarios (taking trajectories as the columns of a matrix) am I able to estimate the correlation of two r.v. $X_{t_i}$ and $X_{t_j}$ (by taking the correlation between rows $i$ and $j$), because only then do I have a sample of each r.v.
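Here is a minimal sketch of what I mean by the simulation remark, assuming NumPy and a standard Brownian motion with unit time step. The function names `simulate_brownian_paths` and `ensemble_corr` are made up for this illustration, and the number of scenarios is an arbitrary choice. Each column is one trajectory, so a row holds many independent draws of the single r.v. $X_{t_i}$.

```python
import numpy as np


def simulate_brownian_paths(n_steps, n_scenarios, dt=1.0, seed=0):
    """Return an (n_steps, n_scenarios) array; each column is one trajectory."""
    rng = np.random.default_rng(seed)
    # Independent Gaussian increments with variance dt.
    increments = rng.normal(0.0, np.sqrt(dt), size=(n_steps, n_scenarios))
    # Cumulative sum along the time axis gives the paths.
    return np.cumsum(increments, axis=0)


def ensemble_corr(paths, i, j):
    """Sample correlation between rows i and j, taken across scenarios."""
    return np.corrcoef(paths[i], paths[j])[0, 1]


paths = simulate_brownian_paths(n_steps=100, n_scenarios=20000)

# Row index k corresponds to time t = (k + 1) * dt.
i, j = 24, 99
estimate = ensemble_corr(paths, i, j)

# For standard Brownian motion, Corr(X_s, X_t) = sqrt(s / t) when s <= t.
theory = np.sqrt((i + 1) / (j + 1))

print(f"ensemble estimate: {estimate:.3f}, theoretical value: {theory:.3f}")
```

With many scenarios the row-wise estimate should land close to the theoretical value, which is what I would expect from an ordinary sample of the pair $(X_{t_i}, X_{t_j})$. With a single column I have only one draw of each $X_{t_i}$, which is exactly the situation I am asking about.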

Thanks in advance.