In William Greene's Econometric Analysis book (7th edition), page 64 states (right above equation 4-20) assumption A5a: $$\tag{A5a} (\textbf{x}_i,\epsilon_i)\quad i=1,\dots,n \quad \text{ is a sequence of independent observations} $$ (note that $x_i$ is a vector of either stochastic or non-stochastic objects)
My question is the following: What exactly does this mean, both intuitively and in mathematical terms?
My understanding is that, intuitively, this means:
- each $\textbf{x}_i$ must be independent of each $\textbf{x}_j$ for $j \neq i$ (i.e. $P(\textbf{x}_i \in A,\, \textbf{x}_j \in B) = P(\textbf{x}_i \in A)\,P(\textbf{x}_j \in B)$ for all measurable sets $A, B$)
- each $\epsilon_i$ must be independent of each $\epsilon_j$ for $j \neq i$
In terms of math, if $f_i$ denotes the joint distribution of $(\textbf{x}_i,\epsilon_i)$, and $f$ denotes the joint distribution of all the pairs, then I think it requires $$ f\big((\textbf{x}_1,\epsilon_1),\dots,(\textbf{x}_n,\epsilon_n)\big) = \prod_{i=1}^n f_i(\textbf{x}_i,\epsilon_i) $$
This condition does not require anything about how $\textbf{x}_i$ and $\epsilon_i$ relate, though, does it?
Edit: Note, I am only asking about what assumption A5a implies (or requires), not about what might be needed for least squares or some other estimation technique (which likely requires additional assumptions).
Assuming the random variables are absolutely continuous, your characterization that the joint density factors into the product of the per-observation densities, i.e.
$$f(x_1,\epsilon_1,\dots,x_n,\epsilon_n)=\prod_{i=1}^n f_i(x_i,\epsilon_i),$$
is correct.
Note that if random vectors $X,Y$ are independent, then $g(X),h(Y)$ are also independent for any measurable functions $g,h$. Applying the coordinate projections that pick out each component of the pair in your setup, it follows that $x_i$ is independent of $x_j$, and also that $\epsilon_i$ is independent of $\epsilon_j$, for $j\neq i$. It also answers your final question: A5a constrains dependence *across* observations, not *within* a pair, so $x_i$ and $\epsilon_i$ may be arbitrarily dependent.
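To make that last point concrete, here is a small simulation sketch (the heteroskedastic construction $\epsilon_i = x_i z_i$ is my own illustrative choice, not from Greene). The pairs $(x_i,\epsilon_i)$ are drawn independently across $i$, yet within each pair $\epsilon_i$ depends on $x_i$; sample correlations are only suggestive of (in)dependence, but they illustrate the pattern:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Draw n independent pairs (x_i, eps_i). Within each pair, eps_i is
# deliberately dependent on x_i (heteroskedastic): eps_i = x_i * z_i.
x = rng.normal(size=n)
z = rng.normal(size=n)
eps = x * z  # eps_i depends on x_i, but pair i is independent of pair j

# Across observations: lag-1 correlations are near zero, consistent
# with independence of x_i from x_j and eps_i from eps_j (i != j).
print(np.corrcoef(x[:-1], x[1:])[0, 1])
print(np.corrcoef(eps[:-1], eps[1:])[0, 1])

# Within a pair: corr(x_i, eps_i) = 0 here, but eps_i^2 is clearly
# correlated with x_i^2, so (x_i, eps_i) are NOT independent.
# A5a does not rule this out.
print(np.corrcoef(x**2, eps**2)[0, 1])
```

This is also why mean-independence conditions such as $E[\epsilon_i \mid \textbf{x}_i] = 0$ appear as *separate* assumptions in Greene: they are not consequences of A5a.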