I am working my way through lecture notes from an econometrics course taught at Ohio State University on time series analysis. The current set of notes I am on (https://www.asc.ohio-state.edu/de-jong.8/note2.pdf) discusses ARMA models.
On page 2 the notes introduce the lag operator ($Lx_t = x_{t-1}$; $L^{k}x_{t} = x_{t-k}$) and then give the following definitions:
let
$$\epsilon_t \sim WN(0,\sigma^{2})$$
$$ \phi(L) = 1 - \phi_{1}L - \phi_{2}L^{2} - \cdots - \phi_{p}L^{p} $$
$$ \theta(L) = 1 + \theta_{1}L + \theta_{2}L^{2} + \cdots + \theta_{q}L^{q} $$
AR : $\phi(L)x_{t} = \epsilon_{t}$
MA : $x_{t} = \theta(L)\epsilon_{t}$
ARMA : $\phi(L)x_{t} = \theta(L)\epsilon_{t}$
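For concreteness, writing the ARMA line out for $p = q = 1$ gives $x_t - \phi_1 x_{t-1} = \epsilon_t + \theta_1\epsilon_{t-1}$, i.e. the usual recursion $x_t = \phi_1 x_{t-1} + \epsilon_t + \theta_1\epsilon_{t-1}$. A quick numerical sanity check (a sketch with made-up coefficient values, using numpy):

```python
import numpy as np

rng = np.random.default_rng(0)
phi1, theta1 = 0.5, 0.3          # illustrative ARMA(1,1) coefficients
n = 200
eps = rng.normal(0.0, 1.0, n)    # white noise with sigma^2 = 1

# Simulate x_t = phi1 * x_{t-1} + eps_t + theta1 * eps_{t-1}
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi1 * x[t - 1] + eps[t] + theta1 * eps[t - 1]

# Left side:  phi(L) x_t  = x_t - phi1 * x_{t-1}
lhs = x[1:] - phi1 * x[:-1]
# Right side: theta(L) eps_t = eps_t + theta1 * eps_{t-1}
rhs = eps[1:] + theta1 * eps[:-1]

print(np.allclose(lhs, rhs))  # True: the two sides agree at every t >= 1
```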
I cannot see why the final expression follows, unless $x_{t} = 0$. An ARMA model is $x_{t} = AR(p) + MA(q)$. The terms in the AR definition above are negative, so adding the AR terms to both sides would give $x_{t} + AR(p) = MA(q)$, but that isn't what the notes say. Can anyone with a bigger brain than mine (plenty of you in here!) explain what I've missed?
There is already an answer to this question, but since I also had trouble understanding AR, MA, and ARMA models when I first learned about them, I want to share what helped me make sense of them:
Consider the equation $$\Psi(L)x_t = \Xi(L)\epsilon_t,$$ where $\Psi$ and $\Xi$ are functions of the linear operator $L : S^T\rightarrow S^T$ with $S^T$ denoting the set of all functions $T\rightarrow S$. Typical choices are $T = \mathbb N$ and $S = \mathbb R$, or $S = \mathbb R^d$.
Now $\Psi$ and $\Xi$ are functions of $L$. Let's consider some particular choices for $\Psi$ and $\Xi$:

- $\Xi(L) = 1$ and $\Psi$ a polynomial $\Psi(L) = 1 - \phi_1 L - \cdots - \phi_p L^p$: this is the AR model $\phi(L)x_t = \epsilon_t$.
- $\Psi(L) = 1$ and $\Xi$ a polynomial $\Xi(L) = 1 + \theta_1 L + \cdots + \theta_q L^q$: this is the MA model $x_t = \theta(L)\epsilon_t$.
- Both $\Psi$ and $\Xi$ polynomials in $L$: this is the ARMA model $\phi(L)x_t = \theta(L)\epsilon_t$.
There are, of course, other possible choices for $\Psi$ and $\Xi$, e.g. the operator exponential $\operatorname{Exp}$, an affine shift, a scaling, and so on.
However, polynomial functions of $L$ have some handy properties and are thus particularly interesting to study.
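To make the abstract framing concrete, here is a small numpy sketch (the helper names `lag` and `poly_in_L` are my own, and pre-sample values are set to zero) that treats polynomials in $L$ as composable operators on sample paths. One of the handy properties alluded to above is that polynomials in the same linear operator commute:

```python
import numpy as np

def lag(x, k):
    """Apply L^k to a finite sample path; pre-sample values are set to 0."""
    out = np.zeros_like(x)
    out[k:] = x[:len(x) - k]
    return out

def poly_in_L(coeffs):
    """Build the operator c_0 + c_1 L + ... + c_m L^m from (c_0, ..., c_m)."""
    def op(x):
        return sum((c * x if k == 0 else c * lag(x, k))
                   for k, c in enumerate(coeffs))
    return op

psi = poly_in_L([1.0, -0.5])   # e.g. Psi(L) = 1 - 0.5 L  (an AR(1) polynomial)
xi = poly_in_L([1.0, 0.3])     # e.g. Xi(L)  = 1 + 0.3 L  (an MA(1) polynomial)

rng = np.random.default_rng(1)
eps = rng.normal(size=100)

# Polynomials in the same linear operator commute: Psi(L)Xi(L) = Xi(L)Psi(L)
print(np.allclose(psi(xi(eps)), xi(psi(eps))))  # True
```

This commutativity is what lets one manipulate $\phi(L)$ and $\theta(L)$ like ordinary polynomials (factoring, inverting, multiplying) when studying stationarity and invertibility.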