Full rank assumption and $\mathbb{E}(\mathbf x_i^T \mathbf x_i)$


This is a rudimentary question, I suppose, but I could not find it anywhere...

The question concerns the multiple regression model $$ y_i = \mathbf x_i' \boldsymbol\beta + \epsilon_i,$$ where the subscript $i$ indexes the $i$-th observation, $\mathbf x_i' = (x_{i1}, \dots, x_{ik})$, and $\boldsymbol\beta = (\beta_1, \dots, \beta_k)'$.

We know that in order to obtain the estimator $$\hat{\boldsymbol\beta} = \left(\sum_{i=1}^n \mathbf x_i \mathbf x_i' \right)^{-1} \left(\sum_{i=1}^n \mathbf x_i y_i \right),$$ we require that the matrix $\sum_{i=1}^n \mathbf x_i \mathbf x_i'$ be invertible, i.e. the full rank assumption.
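For concreteness, here is a minimal sketch of this formula on simulated data (the dimensions, coefficients, and noise model are illustrative assumptions, not taken from the question):

```python
import numpy as np

# Simulated data: n observations, k regressors (values are illustrative).
rng = np.random.default_rng(0)
n, k = 100, 3
X = rng.normal(size=(n, k))                # row i is x_i'
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + rng.normal(size=n)     # y_i = x_i' beta + eps_i

# Accumulate sum_i x_i x_i' and sum_i x_i y_i observation by observation.
S_xx = sum(np.outer(x, x) for x in X)      # k x k matrix
S_xy = sum(x * yi for x, yi in zip(X, y))  # k-vector

# The estimator exists only if S_xx is invertible (the full rank assumption).
assert np.linalg.matrix_rank(S_xx) == k
beta_hat = np.linalg.solve(S_xx, S_xy)
print(beta_hat)
```

Solving the linear system directly is equivalent to (and numerically preferable to) forming the inverse explicitly.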

In some of the literature, however, I see the full rank assumption replaced by the condition $0 < \mathbb{E}(\mathbf x_i^T \mathbf x_i) < \infty$.

My puzzle is this: I cannot see the connection between the two.

Are they interchangeable? If so, how does one imply the other?

Could somebody help? Thanks.


Accepted answer

It's not a replacement but the context is different. First, you need $\sum_ix_ix_i'$ to be invertible so that the estimator is well-defined. This is an algebraic requirement.

Now, sometimes people weaken the above and instead suppose that $E(x_ix_i')$ is invertible. There are then usually additional assumptions (e.g. i.i.d. sampling with finite second moments, so a law of large numbers applies) that let you infer that $\frac{1}{n}\sum_ix_ix_i'$ converges in probability to $E(x_ix_i')$. Convergence in probability implies there is a subsequence of $\{\frac{1}{n}\sum_ix_ix_i'\}$ that converges almost surely to $E(x_ix_i')$, which has determinant $>0$. Since the determinant is a continuous function, for $n$ sufficiently large $\frac{1}{n}\sum_ix_ix_i'$ has positive determinant with probability approaching $1$. In turn, for such $n$, $\sum_ix_ix_i'$ has positive determinant (i.e. is invertible) with the same probability.
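The convergence step can be illustrated numerically. A minimal sketch, assuming i.i.d. standard normal regressors so that $E(x_ix_i')$ is the identity matrix (an assumption made purely for illustration):

```python
import numpy as np

# Sketch of the law-of-large-numbers step: for i.i.d. standard normal
# regressors, E(x_i x_i') = I (chosen here purely for illustration).  The
# sample average (1/n) sum_i x_i x_i' approaches this invertible matrix,
# so its determinant is eventually bounded away from zero.
rng = np.random.default_rng(1)
k = 3
E_xx = np.eye(k)                     # population second-moment matrix

def sample_second_moment(n):
    X = rng.normal(size=(n, k))      # rows are x_i'
    return (X.T @ X) / n             # (1/n) sum_i x_i x_i'

for n in (10, 1_000, 100_000):
    M = sample_second_moment(n)
    print(n, np.linalg.norm(M - E_xx), np.linalg.det(M))
```

For typical draws, the distance to $E(x_ix_i')$ shrinks as $n$ grows while the determinant stays positive, matching the argument above.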

Do you see the connection now?

Another answer

The replacement makes no sense as stated. For one thing, in the traditional regression formulation there is no point in taking the expectation of expressions that do not involve $y$, since the regressors are treated as constants rather than random variables. Second, $\mathbf x_i^T \mathbf x_i$ is a scalar, so the condition simply says that each observation vector of $x$'s is not identically zero. That does not guarantee full rank; it only guarantees that the rank is at least one.
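A concrete instance of that last point (a hypothetical design, not from the question): two perfectly collinear regressors, so that no observation vector is zero and $\mathbb{E}(\mathbf x_i^T \mathbf x_i) > 0$ holds, yet $\sum_i \mathbf x_i \mathbf x_i'$ is singular:

```python
import numpy as np

# Hypothetical design with x_i = (z_i, z_i): no observation vector is zero,
# so the scalar x_i' x_i = 2 z_i^2 is positive, yet the two regressors are
# perfectly collinear and sum_i x_i x_i' is singular.
rng = np.random.default_rng(2)
z = rng.normal(size=1000)
s = float(z @ z)                 # sum_i z_i^2 > 0

# With x_i = (z_i, z_i), every entry of sum_i x_i x_i' equals s:
S = np.array([[s, s], [s, s]])
# det(S) = s*s - s*s = 0, so S has rank one and cannot be inverted.
print(np.linalg.det(S))
```

So the scalar condition rules out a degenerate design of all-zero regressors, but not rank deficiency from collinearity.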

You might want to look back at the source and see if it says what you think it says. Because you're right to think it makes no sense.