Is an AR(1) process with Bernoulli errors mixing or ergodic?


Before turning to the $\text{AR}(1)$ model, first look at a simpler example

$$y_t=\rho^t y_0+\epsilon_t$$

where $0<\rho<1$ and $\epsilon_t\overset{\text{i.i.d.}}{\sim} \text{Bernoulli} \left(\frac{1}{2} \right)$, i.e. $\epsilon_t$ has a $50\%$ chance of being $1$ and a $50\%$ chance of being $0$. Similarly, $y_0$ also follows the distribution $\text{Bernoulli} \left(\frac{1}{2}\right)$, independently of the errors. For finite $t$, $y_t$ takes one of four values, each with probability $25\%$: $0$, $1$, $\rho^t$ and $\rho^t+1$, and

\begin{align} P(y_t=1,y_0=1) &= P(y_0=1)P(y_t=1\mid y_0=1) \\ &= \frac{1}{2} \times 0 \\ &= 0 \\ &\neq \frac{1}{8} = \frac{1}{2}\times\frac{1}{4} = P(y_0=1)P(y_t=1) \end{align}
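A quick Monte Carlo sketch (numpy; $\rho$, $t$ and the sample size are arbitrary choices) agrees with this: the marginal probability of $\{y_t=1\}$ is about $\frac{1}{4}$, while the joint event with $\{y_0=1\}$ never occurs.

```python
# Monte Carlo check of the four-outcome distribution of y_t = rho^t * y0 + eps_t.
import numpy as np

rng = np.random.default_rng(0)
rho, t, n = 0.9, 5, 200_000          # arbitrary illustrative values

y0 = rng.integers(0, 2, size=n)      # Bernoulli(1/2)
eps_t = rng.integers(0, 2, size=n)   # Bernoulli(1/2), independent of y0
y_t = rho**t * y0 + eps_t

# Each of the four outcomes {0, 1, rho^t, rho^t + 1} has probability 1/4.
print(np.mean(np.isclose(y_t, 1.0)))               # ~ 0.25
# {y_t = 1} forces y_0 = 0, so the joint event with {y_0 = 1} is impossible.
print(np.mean(np.isclose(y_t, 1.0) & (y0 == 1)))   # 0.0
```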

Obviously, $y_t$ converges to $\epsilon_t$ almost surely, in the sense that $y_t-\epsilon_t=\rho^t y_0\to 0$ with probability one (since $P(\lim_{t\to\infty}\rho^t y_0=0)=1$); $\epsilon_t$ is independent of $y_0$, and $y_t-\epsilon_t=O_{a.s.}(\rho^t)$. It seems true to me that

$$\lim_{t\to\infty}[P(y_t=1,y_0=1)-P(y_0=1)P(y_t=1)]=\frac{1}{4}-\frac{1}{4}=0$$

and

$$\lim_{t\to\infty}[P(y_t=1|y_0=1)-P(y_t=1)]=0$$

Hence the sequence looks mixing (strong and uniform). I am not sure about the size of the mixing coefficients or the rate of convergence, i.e.

$$P(y_t=1,y_0=1) - P(y_0=1)P(y_t=1) = \mathcal{O}(?)$$


Now, an $\text{AR}(1)$ model with Bernoulli errors takes the form below:

$$y_t=\rho y_{t-1}+\epsilon_t$$

where $y_0=0$ and $\epsilon_t\overset{\text{i.i.d.}}{\sim}\text{Bernoulli} \left(\frac{1}{2}\right)$. For this model, one can have

$$y_{t+n}=\rho^n y_{t}+\sum_{i=1}^{n}\rho^{n-i}\epsilon_{t+i}$$

Similar to the first example, this sequence also looks uniform/strong mixing, though I am not sure how to show it. Could anyone shed some light on this?
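To make the setup concrete, here is a small numpy sketch (arbitrary parameter values) that simulates the model and checks the $n$-step decomposition above numerically:

```python
# Simulate the AR(1) with Bernoulli errors and verify the decomposition
# y_{t+n} = rho^n * y_t + sum_{i=1}^n rho^(n-i) * eps_{t+i}. Values are arbitrary.
import numpy as np

rng = np.random.default_rng(1)
rho, T = 0.5, 50
eps = rng.integers(0, 2, size=T + 1).astype(float)   # eps[1..T]; eps[0] unused

y = np.zeros(T + 1)                                  # y_0 = 0
for k in range(1, T + 1):
    y[k] = rho * y[k - 1] + eps[k]

t, n = 10, 7
direct = y[t + n]
decomposed = rho**n * y[t] + sum(rho**(n - i) * eps[t + i] for i in range(1, n + 1))
print(np.isclose(direct, decomposed))    # True
```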

Answer 1

The sequence is ergodic, but not strong mixing, at least for $\rho \in (0, 1/2]$.

Ergodicity

To see that it is ergodic, note that any strictly stationary causal solution of the $\text{AR}(1)$ equation $$x_t = \rho x_{t-1} + \epsilon_t$$ where $\epsilon_t \sim \mathrm{IID}(0, \sigma^2)$ is ergodic: $$x_t = \sum_{k=0}^\infty \rho^k \epsilon_{t-k} = f(\epsilon_t, \epsilon_{t-1}, \ldots)$$ is a measurable function of an $L^2$ IID (hence ergodic) sequence, and ergodicity is preserved under measurable maps (cf. Billingsley, Theorem 36.4), so $x_t$ is ergodic.

In your example, to get to this form, write the Bernoulli errors as $\epsilon_t = \frac{1}{2} + \eta_t$ where $\eta_t \sim \mathrm{Unif}(\{\pm 1/2\})$, set $\mu = \frac{1}{2(1-\rho)}$, and let $x_t = y_t - \mu$. Then $x_t$ satisfies the $\text{AR}(1)$ dynamics $x_t = \rho x_{t-1} + \eta_t$ with mean-zero i.i.d. errors, implying that it (and thus $y_t$) is ergodic.
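As a sanity check (not a proof), one can verify numerically that the sample mean of $y_t$ settles near $\mu = \frac{1}{2(1-\rho)}$, as the ergodic theorem predicts. A small numpy sketch with arbitrary parameter values:

```python
# Numerical illustration (not a proof): by the ergodic theorem, the sample mean
# of y_t should settle near mu = 1/(2*(1 - rho)). Parameter choices are arbitrary.
import numpy as np

rng = np.random.default_rng(2)
rho, T = 0.5, 200_000
eps = rng.integers(0, 2, size=T).astype(float)   # Bernoulli(1/2) errors

y = np.empty(T)
y[0] = eps[0]                    # y_0 = 0, so y_1 = eps_1
for t in range(1, T):
    y[t] = rho * y[t - 1] + eps[t]

mu = 1.0 / (2.0 * (1.0 - rho))   # stationary mean; equals 1 for rho = 1/2
print(y.mean())                  # ~ 1.0
```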

Mixing

That $y_t$ is not strongly mixing for $\rho \in (0, 1/2]$ is precisely the content of Andrews (1984), cited below. Andrews conjectures that the restriction on $\rho$ may not be necessary, but I don't know what happens for $\rho > 1/2$.


References

Billingsley, P. (1995). Probability and Measure. Wiley, New York.

Andrews, D. W. K. (1984). Non-strong mixing autoregressive processes. Journal of Applied Probability 21(4), 930–934.

Answer 2 (follow-up from the asker)

Many thanks for your answer, Jose. I have looked at Andrews' paper, and it has helped me understand the related concepts better.

But I still have some questions related to strong mixing, weak mixing and ergodicity. In Andrews' paper, he defines $A=\{X_t\in (0,\rho)\}$ and $B_s=\{X_{t+s}\in \bigcup_{j=1}^{2^s}(w_j,w_j+\rho^{s+1})\}$, where the $w_j$ are the $2^s$ possible values of $\sum_{l=0}^{s-1}\rho^l\epsilon_{t+s-l}$, namely $0,\rho^{s-1},\rho^{s-2},\dots,\rho,\dots,1,\dots,1+\rho+\dots+\rho^{s-1}$, each occurring with the same probability. It is clear that $\mu(B_s)=(2\rho)^s\rho$, where $\mu(\cdot)$ denotes Lebesgue measure. If $\rho>\frac{1}{2}$, $\mu(B_s)$ increases with $s$; but if $\rho$ gets too large, $P(A)$ becomes small. If $0<\rho\leq\frac{1}{2}$, then $\mu(B_s)\leq\rho$ for every $s\geq 1$. Note that $X_{t+s} \in [0,\frac{1}{1-\rho})$ with $\frac{1}{1-\rho}>\rho$. Therefore $P(B_s)<1$ and $P(A)(P(B_s|A)-P(B_s))=P(A)(1-P(B_s))>0$ for every $s$, which means an AR(1) model with Bernoulli errors is not strong mixing when $\rho\in (0,\frac{1}{2}]$.

It is not weak mixing either, since $\lim_{n\to\infty}\frac{1}{n}\sum_{s=1}^{n}P(A)|P(B_s|A)-P(B_s)|\neq 0$. For example, suppose $t$ is sufficiently large. When $\rho=\frac{1}{2}$, $X_t$ is approximately $U[0,2)$ (the possible outcomes are $0,\frac{1}{2^{t-1}},\frac{2}{2^{t-1}},\dots,\frac{2^t-1}{2^{t-1}}=2-\left(\frac{1}{2}\right)^{t-1}$). For any $s$, $X_{t+s}$ is also uniformly distributed between $0$ and $2$, hence $P(B_s)=\frac{1}{4}$ and $\lim_{n\to\infty}\frac{1}{n}\sum_{s=1}^{n}P(A)|P(B_s|A)-P(B_s)|=\frac{3}{16}$.

To my confusion, on p. 931 of Andrews' paper he writes "All $L^2$ AR(1) processes are ordinary mixing; see Hannan (1970), Chapter IV, Theorem 3". I looked at Hannan's theorem, and it seems to me that Hannan's definition of mixing, (2.1) on p. 202, is strong mixing.
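A small numpy simulation (an illustrative sketch, not a proof; sample sizes and the lag $s$ are arbitrary) is consistent with these numbers for $\rho=\frac{1}{2}$: $P(B_s)\approx\frac{1}{4}$ while $P(B_s|A)=1$.

```python
# Sketch of Andrews' sets for rho = 1/2 (illustrative simulation, not a proof).
# A = {X_t in (0, rho)}; B_s is the union of 2^s open intervals of length
# rho^(s+1) anchored at the possible values w_j of sum_{l=0}^{s-1} rho^l eps_{t+s-l}.
import itertools
import numpy as np

rng = np.random.default_rng(3)
rho, s, burn, n = 0.5, 6, 60, 100_000

# Draw n independent (approximately stationary) pairs (X_t, X_{t+s}).
X = np.zeros(n)
for _ in range(burn):
    X = rho * X + rng.integers(0, 2, size=n)
X_t = X.copy()
for _ in range(s):
    X = rho * X + rng.integers(0, 2, size=n)
X_ts = X

# Enumerate the 2^s anchor points w_j.
w = np.array([sum(rho**l * b[l] for l in range(s))
              for b in itertools.product((0.0, 1.0), repeat=s)])

in_A = (X_t > 0) & (X_t < rho)
in_B = np.any((X_ts[:, None] > w) & (X_ts[:, None] < w + rho**(s + 1)), axis=1)

print(in_B.mean())        # ~ 0.25, i.e. P(B_s)
print(in_B[in_A].mean())  # ~ 1.0,  i.e. P(B_s | A) = 1, so the gap never vanishes
```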

Jose, I am not sure what you mean by ergodic. Do you mean that $\frac{1}{T}\sum_{t=1}^T X_t$ converges in probability to $E(X_t)$? When I looked at this paper (p. 2), ergodicity means $\lim_{n\to\infty}\frac{1}{n}\sum_{s=1}^{n}P(A)(P(B_s|A)-P(B_s))=0$. Obviously, for $\rho=\frac{1}{2}$, the AR(1) model is not ergodic either under this definition.

For the first example, $y_t=\rho^t y_0+\epsilon_t$, one can take $A=\{y_0=1\}$ and $B_t=\{y_t\in\{\rho^t,1+\rho^t\}\}$, so that $P(B_t|A)=1$. The sequence does not appear to be strong mixing, since $P(A)(P(B_t|A)-P(B_t))=\frac{1}{4}$ for every $t$: given any small $v>0$, it is impossible to find $T(v)$ such that $|P(A)(P(B_t|A)-P(B_t))|<v$ for $t>T(v)$. Hence the sequence in the first example is neither strong nor weak mixing, nor ergodic. Please correct me if I am wrong; it seems different people use the same terms differently in this literature.
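A small numpy simulation (a sketch; $\rho$, $t$ and the sample size are arbitrary) is consistent with this: the gap stays at $\frac{1}{4}$ no matter how large $t$ is.

```python
# Checking the sets A = {y_0 = 1} and B_t = {y_t in {rho^t, 1 + rho^t}} for the
# first example y_t = rho^t * y0 + eps_t. Parameter values are arbitrary.
import numpy as np

rng = np.random.default_rng(4)
rho, t, n = 0.9, 20, 200_000

y0 = rng.integers(0, 2, size=n)
eps_t = rng.integers(0, 2, size=n)
y_t = rho**t * y0 + eps_t

in_A = (y0 == 1)
in_B = np.isclose(y_t, rho**t) | np.isclose(y_t, 1 + rho**t)

print(in_B[in_A].mean())                                # 1.0: P(B_t | A) = 1
print(in_A.mean() * (in_B[in_A].mean() - in_B.mean()))  # ~ 0.25, for any t
```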

References

Hannan, E. J. (1970). Multiple Time Series. Wiley, New York.