Correlation between a vector and its one-period lag, both generated from AR(1) data


Suppose that $C_0$ is $100$ and $\{e_t\}_{t\geq 1}$ is a sequence of i.i.d. standard normal random variables. We generate $C_t=C_{t-1}+e_t$ for $t\geq 1$ and set $$ x_t=C_t^2-C^2_{t-1},\quad z_t=C^2_{t-1}-C^2_{t-2}\quad\text{for}\quad t\geq 2. $$ For fixed $T\geq 2$, let us define $X=(x_2,\ldots,x_T)$ and $Z=(z_2,\ldots,z_T)$. How do we show that the sample correlation between $X$ and $Z$ $$ \rho_T\equiv\frac{\sum_{t=2}^T(x_t-\bar{X})(z_t-\bar{Z})}{\sqrt{\sum_{t=2}^T(x_t-\bar{X})^2\sum_{t=2}^T(z_t-\bar{Z})^2}} $$ converges in probability to $0$ as $T\to\infty$?

This claim appears in Nelson and Startz's 1990 article in the Journal of Business. I can't prove it, nor do I understand the intuition behind it. I have, however, written a simple Matlab script below whose output seems consistent with the claim. Thank you very much for your help. (If you have the Parallel Computing Toolbox, you can replace the inner `for` with `parfor`.)

clc; clear;
tic;
L = [10,100,200,300,400,500,600,700,800,900,1000]; % length of AR process
l = length(L);
results = zeros(l,1);
n = 10000;                         % no. of rep's per length choice
c0 = 100;                          % initial value

for i = 1:l
    T = L(i);                      % current length
    temp1 = zeros(n,1);            % store correlations for the n reps
    for j = 1:n
        e = randn(T,1);            % shocks
        c = filter(1,[1 -1],e,c0); % AR(1) process initial c0
        d = c.^2;                  % store squares of c's       
        x = d(2:T) - d(1:T-1);     % x_t=c_t^2-c_{t-1}^2
        z = [d(1) - c0^2;x(1:T-2)];% z_t=x_{t-1}
        temp1(j) = corr(x,z);      % correlation for rep j
    end
    results(i) = mean(temp1);      % average correlation across n reps
end
plot(L,results);
toc;


Here is a partial answer that just looks at correlations. We have: \begin{align} x_t &= e_t^2 + 2e_tC_{t-1} \\ z_t &= e_{t-1}^2 + 2e_{t-1}C_{t-2} \end{align} First note that $E[x_t]=E[z_t]=1$ for all $t \geq 2$. Now we show that $\{x_2, x_3, x_4, \ldots\}$ are pairwise uncorrelated. Fix $t>n\geq 2$. Then: $$ E[x_tx_n] = E[(e_t^2 + 2e_tC_{t-1})(e_n^2 + 2e_nC_{n-1})] = E[e_t^2e_n^2] = E[x_t]E[x_n], $$ where the cross terms vanish because $e_t$ and $e_n$ have mean zero and each is independent of every variable with a smaller index. So these are indeed pairwise uncorrelated. Likewise, $\{z_2, z_3, z_4, \ldots\}$ are pairwise uncorrelated.
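The pairwise-uncorrelatedness claim is easy to sanity-check by Monte Carlo (a sketch in standard-library Python; I take $C_0=0$ here purely to keep the simulation noise small, since the argument does not depend on $C_0$):

```python
import random

def cov_x(t, n, reps=100_000, c0=0.0, seed=2):
    """Monte Carlo estimate of Cov(x_t, x_n), where x_t = C_t^2 - C_{t-1}^2
    and C_t = C_{t-1} + e_t with i.i.d. standard normal shocks."""
    rng = random.Random(seed)
    sx = sn = sxn = 0.0
    steps = max(t, n)
    for _ in range(reps):
        c = [c0]
        for _ in range(steps):
            c.append(c[-1] + rng.gauss(0.0, 1.0))
        xt = c[t] ** 2 - c[t - 1] ** 2
        xn = c[n] ** 2 - c[n - 1] ** 2
        sx += xt
        sn += xn
        sxn += xt * xn
    return sxn / reps - (sx / reps) * (sn / reps)

print(cov_x(6, 3))  # should be close to 0
```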

Notice that:
$$ x_tz_t = e_t^2e_{t-1}^2 + 2e_t^2e_{t-1}C_{t-2} + 2e_tC_{t-1}e_{t-1}^2 + 4e_te_{t-1}C_{t-1}C_{t-2}. $$ So: $$ E[x_tz_t] = E[e_t^2]E[e_{t-1}^2] = 1 = E[x_t]E[z_t], $$ since every other term contains a lone mean-zero shock that is independent of the remaining factors. So $x_t$ and $z_t$ are in fact uncorrelated for each $t$. Now look at correlations between $x_tz_t$ and $x_nz_n$ for $t \neq n$. They are indeed uncorrelated whenever $|n-t|\geq 2$: taking $t>n+1$ and integrating out first $e_t$ and then $e_{t-1}$ (each has mean zero and is independent of every variable with a smaller index) gives $E[x_tz_t\,x_nz_n]=E[z_t\,x_nz_n]=E[x_nz_n]=E[x_tz_t]E[x_nz_n]$.
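The factorization $E[x_tz_t]=1$ can also be checked by simulation (again standard-library Python; $C_0=0$ keeps the Monte Carlo noise down, and the identity holds for any deterministic $C_0$):

```python
import random

def mean_xz(t, reps=200_000, c0=0.0, seed=3):
    """Monte Carlo estimate of E[x_t * z_t], where x_t = C_t^2 - C_{t-1}^2
    and z_t = C_{t-1}^2 - C_{t-2}^2; by the expansion above it equals 1."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(reps):
        c = [c0]
        for _ in range(t):
            c.append(c[-1] + rng.gauss(0.0, 1.0))
        acc += (c[t] ** 2 - c[t - 1] ** 2) * (c[t - 1] ** 2 - c[t - 2] ** 2)
    return acc / reps

print(mean_xz(4))  # should be close to 1
```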


Pairwise uncorrelated means the variance of the sum is the sum of the variances. The mean of $\overline{X}_T$ is $1$ for all $T$, but its variance does not converge to zero as $T\rightarrow\infty$. I think, however, that the variance of $\frac{1}{T^a}\overline{X}_T$ converges to $0$ for any $a>0$. I suspect the end result holds because the variance of the numerator in $\rho_T$ grows much more slowly than the variance of the denominator as $T\rightarrow\infty$.
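To make the variance claim concrete, here is a sketch of the computation; it uses only $E[e_t^2]=1$, $E[e_t^4]=3$, the independence of $e_t$ from $C_{t-1}$, and $E[C_{t-1}^2]=C_0^2+(t-1)$, which follows from $C_{t-1}=C_0+\sum_{s=1}^{t-1}e_s$: $$ E[x_t^2]=E[e_t^4]+4E[e_t^3]E[C_{t-1}]+4E[e_t^2]E[C_{t-1}^2]=3+4\left(C_0^2+t-1\right), $$ so $\operatorname{Var}(x_t)=2+4(C_0^2+t-1)$, which grows linearly in $t$. Pairwise uncorrelatedness then gives $$ \operatorname{Var}(\overline{X}_T)=\frac{1}{(T-1)^2}\sum_{t=2}^{T}\operatorname{Var}(x_t)=\frac{2+4C_0^2}{T-1}+\frac{2T}{T-1}\longrightarrow 2, $$ so the variance of $\overline{X}_T$ indeed stays bounded away from zero.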