mathematics of backward shift operator


I am reading 'Spectral Analysis and Time Series' by M. B. Priestley. Chapter 3 discusses autoregressive processes, and I am having difficulty understanding the use of the backshift operator. For example, a second-order autoregressive process may be written as

$X_t+a_1X_{t-1}+a_2X_{t-2}=\epsilon_t$

where $X_t$ is the stochastic process and $\epsilon_t$ is white noise. The above equation can alternatively be written as

$(1+a_1B+a_2B^2)X_t=\epsilon_t$

I understand that $BX_t$ and $B^2X_t$ mean $X_{t-1}$ and $X_{t-2}$. Priestley further assumes that the polynomial $1+a_1B+a_2B^2$ can be factored as $(1-\mu_1B)(1-\mu_2B)$ for two numbers $\mu_1$ and $\mu_2$, and writes the above equation as

$X_t=\frac{1}{(1-\mu_1B)(1-\mu_2B)}\epsilon_t$

$X_t=\frac{1}{\mu_1-\mu_2}[\frac{\mu_1}{1-\mu_1B}-\frac{\mu_2}{1-\mu_2B}]\epsilon_t$

$X_t=\frac{1}{\mu_1-\mu_2}[\sum_{s=0}^{\infty}(\mu_1^{s+1}-\mu_2^{s+1})B^s]\epsilon_t$.

How are the last three equations derived? Specifically, I don't understand why we can put the term $(1-\mu_1B)(1-\mu_2B)$ in the denominator. And why can $\frac{1}{1-\mu_1B}$ be written as $\sum_{s=0}^{\infty}\mu_1^sB^s$ unless we know that $|\mu_1B|$ is less than 1?

I have not used the backshift operator before and only recently read about it in the Wikipedia article.


There are 3 best solutions below

1
On BEST ANSWER

This answer tries to shine some operator-theoretic light on the issue. I make two key assumptions, which can probably be verified by reading the text you are referencing.

Let's consider the operator $(1+a_1B+a_2B^2)$. If we (or the author) assume that there exist solutions $\mu_1,\mu_2$ to $a_2=\mu_1 \mu_2$ and $a_1 = -\mu_1 - \mu_2$, then we can write $$(1+a_1B+a_2B^2)=(1-\mu_1B)(1-\mu_2B).$$
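To see where these two conditions come from, one can expand the product and compare coefficients. This is only a short worked step, treating $B$ as a formal symbol:

$$(1-\mu_1B)(1-\mu_2B) = 1-(\mu_1+\mu_2)B+\mu_1\mu_2B^2.$$

Matching this against $1+a_1B+a_2B^2$ gives $a_1=-(\mu_1+\mu_2)$ and $a_2=\mu_1\mu_2$. Equivalently, $\mu_1$ and $\mu_2$ are the two roots of $z^2+a_1z+a_2=0$, which always exist over $\mathbb{C}$.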

If furthermore $\|\mu_i B\| < 1$ (this is an operator norm), then $(I-\mu_i B)$ is an invertible operator and $(I-\mu_i B)^{-1}=\sum_{k=0}^\infty (\mu_i B)^k$. This is the Neumann series, a generalization of the geometric series to operators. Writing the inverse as a fraction is somewhat sloppy notation.
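As a rough numerical illustration of the Neumann series (a minimal sketch, not taken from the book), the Python snippet below inverts $(1-\mu B)$ in two ways on a short sequence: by the recursion $y_t = x_t + \mu y_{t-1}$ and by the sum $\sum_k \mu^k B^k x_t$. The function names, the test sequence, and the convention that the sequence is zero before $t=0$ are all assumptions made for the example.

```python
def apply_inverse_recursive(mu, x):
    """Solve (1 - mu*B) y = x, i.e. y_t = x_t + mu*y_{t-1}, assuming y_{-1} = 0."""
    y = []
    prev = 0.0
    for x_t in x:
        prev = x_t + mu * prev
        y.append(prev)
    return y


def apply_neumann_sum(mu, x, n_terms):
    """Apply sum_{k=0}^{n_terms-1} (mu*B)^k to x, assuming x_t = 0 for t < 0."""
    y = []
    for t in range(len(x)):
        total = 0.0
        # B^k x_t = x_{t-k}; terms with t - k < 0 vanish by assumption.
        for k in range(min(n_terms, t + 1)):
            total += (mu ** k) * x[t - k]
        y.append(total)
    return y


mu = 0.6
x = [1.0, -2.0, 0.5, 3.0, 0.0, 1.5]

exact = apply_inverse_recursive(mu, x)
approx = apply_neumann_sum(mu, x, n_terms=len(x))

# Largest disagreement between the two ways of inverting (1 - mu*B).
max_gap = max(abs(a - b) for a, b in zip(exact, approx))

# Applying (1 - mu*B) to the result should give back x.
recovered = [exact[t] - mu * (exact[t - 1] if t > 0 else 0.0) for t in range(len(x))]
max_residual = max(abs(r - v) for r, v in zip(recovered, x))
```

On a sequence with a finite past the sum terminates, so the two results agree for any $\mu$. The norm condition only starts to matter when the sequence extends infinitely far into the past, as it does for a stationary process.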

Furthermore, for $\mu_1 \neq \mu_2$, the first resolvent identity provides us with $(I-\mu_1 B)^{-1}(I-\mu_2 B)^{-1} = \frac{1}{\mu_1 - \mu_2}(\mu_1(I-\mu_1B)^{-1} - \mu_2(I-\mu_2B)^{-1}).$
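This partial-fraction formula can also be checked by hand, assuming $\mu_1\neq\mu_2$ and that both inverses exist. Everything involved is a function of the single operator $B$, so all factors commute, and

$$\mu_1(I-\mu_2B)-\mu_2(I-\mu_1B)=(\mu_1-\mu_2)I.$$

Multiplying both sides by $(I-\mu_1B)^{-1}(I-\mu_2B)^{-1}$ turns the left-hand side into $\mu_1(I-\mu_1B)^{-1}-\mu_2(I-\mu_2B)^{-1}$, and dividing by $\mu_1-\mu_2$ gives the stated identity.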

To put it all together: If the $\mu_i$s exist and $\|\mu_i B\| < 1$ then $(I-\mu_iB)$ is invertible and we get from

$(1+a_1B+a_2B^2)X_t= (1-\mu_1B)(1-\mu_2B)X_t = \epsilon_t$ that

$$X_t = (I-\mu_1 B)^{-1}(I-\mu_2 B)^{-1}\epsilon_t = \frac{1}{\mu_1 - \mu_2}(\mu_1(1-\mu_1B)^{-1} - \mu_2(1-\mu_2B)^{-1})\epsilon_t = \frac{1}{\mu_1-\mu_2}[\sum_{s=0}^{\infty}(\mu_1^{s+1}-\mu_2^{s+1})B^s]\epsilon_t.$$
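A quick way to convince yourself of the final formula is to compare it numerically with the original recursion. The following Python sketch does that for illustrative values $\mu_1=0.5$, $\mu_2=-0.3$ and a fixed, made-up noise sequence. It assumes zero initial conditions ($X_t=\epsilon_t=0$ for $t<0$), so the infinite sum truncates at $s=t$. The function names are hypothetical.

```python
def ar2_recursion(a1, a2, eps):
    """X_t = eps_t - a1*X_{t-1} - a2*X_{t-2}, assuming X_{-1} = X_{-2} = 0."""
    x = []
    for t, e in enumerate(eps):
        x1 = x[t - 1] if t >= 1 else 0.0
        x2 = x[t - 2] if t >= 2 else 0.0
        x.append(e - a1 * x1 - a2 * x2)
    return x


def ma_weights(mu1, mu2, n):
    """Coefficients (mu1^(s+1) - mu2^(s+1)) / (mu1 - mu2) of B^s, s = 0..n-1."""
    return [(mu1 ** (s + 1) - mu2 ** (s + 1)) / (mu1 - mu2) for s in range(n)]


def ma_expansion(mu1, mu2, eps):
    """X_t = sum_{s=0}^{t} psi_s * eps_{t-s}, assuming eps_t = 0 for t < 0."""
    psi = ma_weights(mu1, mu2, len(eps))
    return [sum(psi[s] * eps[t - s] for s in range(t + 1)) for t in range(len(eps))]


# Illustrative values only; a1 and a2 follow from the factorization.
mu1, mu2 = 0.5, -0.3
a1, a2 = -(mu1 + mu2), mu1 * mu2

eps = [0.3, -1.2, 0.7, 2.0, -0.4, 0.1, 0.9, -1.5]

x_rec = ar2_recursion(a1, a2, eps)
x_ma = ma_expansion(mu1, mu2, eps)

# Largest difference between the recursion and the series representation.
max_diff = max(abs(u - v) for u, v in zip(x_rec, x_ma))
```

The two sequences agree up to floating-point rounding, which is what the operator identity predicts under these assumptions.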

Unfortunately, I cannot provide a proof of why $\|\mu_i B\| < 1$ (which, for $\mu_i \neq 0$, implies $\frac{1}{\mu_i}\in \rho(B)$, the resolvent set of $B$), since this depends on the choice/properties of your $a_i$s and is likely related to the stability mentioned in the other answer.

8
On

First, any quadratic polynomial has exactly two roots (counted with multiplicity, possibly complex), so we can factor $$1+a_1 B + a_2 B^2 = (1-\mu_1 B)(1-\mu_2 B).$$ Second, most likely, the author is considering a stable system, that is, one whose impulse response dies out. Note that the backward shift operation is related to the delay operation (if you know about the Z-transform, $z^{-1}$ denotes a delay of one sample). In the Z-plane, the system is stable if the poles (the roots of the denominator) lie within the unit circle. Consequently, $|\mu_1| < 1$ and $|\mu_2| < 1$. Hopefully, these points address some of your concerns.
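The stability condition is easy to test for given coefficients. Below is a minimal Python sketch, assuming the $\mu_i$ are taken as the roots of $z^2+a_1z+a_2=0$ (which matches the factorization above); the helper names `ar2_roots` and `is_stable` and the sample coefficients are made up for illustration.

```python
import cmath


def ar2_roots(a1, a2):
    """Roots mu1, mu2 of z**2 + a1*z + a2 = 0 (possibly complex)."""
    disc = cmath.sqrt(a1 * a1 - 4.0 * a2)
    return (-a1 + disc) / 2.0, (-a1 - disc) / 2.0


def is_stable(a1, a2):
    """True when both roots lie strictly inside the unit circle."""
    return all(abs(mu) < 1.0 for mu in ar2_roots(a1, a2))


stable = is_stable(-0.5, 0.06)       # real roots 0.3 and 0.2
unstable = is_stable(-2.5, 1.0)      # real roots 2.0 and 0.5
complex_case = is_stable(-1.0, 0.5)  # complex pair 0.5 +/- 0.5i
```

In the second case one root has modulus 2, so the geometric series in $\mu B$ would not converge and the impulse response would grow instead of dying out.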

0
On

I stumbled upon this question looking for an answer for a student in a course I'm TA-ing now. I ended up writing a note that you can find in this link. I hope it helps somehow.