Does quadratic risk of MLE for multivariate linear regression go to zero with more and more data?


For the simple multivariate linear regression with Gaussian noise: $\mathbf{Y} = \mathbf{X} \boldsymbol{\beta} + \boldsymbol{\epsilon}$, where

  • $\mathbf{Y} \in \mathbb{R}^n$: the vector of dependent variables,
  • $\mathbf{X} \in \mathbb{R}^{n \times p}$: each row is a vector of covariates,
  • $\boldsymbol{\epsilon} \in \mathbb{R}^n$: Gaussian noise $\boldsymbol{\epsilon} \sim \mathcal{N}\big(0, \sigma^2 I_n\big)$ for some constant $\sigma > 0$,

the MLE of $\boldsymbol{\beta}$ coincides with the least-squares estimator $\hat{\boldsymbol{\beta}} = \big(\mathbf{X}^{T} \mathbf{X} \big)^{-1} \mathbf{X}^{T} \mathbf{Y}$ (assuming $\mathbf{X}$ has full column rank, so that $\mathbf{X}^{T} \mathbf{X}$ is invertible).

It is easy to compute the quadratic risk of the estimator: since $\hat{\boldsymbol{\beta}} - \boldsymbol{\beta} = \big(\mathbf{X}^{T} \mathbf{X} \big)^{-1} \mathbf{X}^{T} \boldsymbol{\epsilon}$, the covariance of $\hat{\boldsymbol{\beta}}$ is $\sigma^2 \big(\mathbf{X}^{T} \mathbf{X} \big)^{-1}$, and taking the trace gives $$\mathbb{E}\big[\|\hat{\boldsymbol{\beta}} - \boldsymbol{\beta}\|_2^2\big] = \sigma^2 \mathrm{tr}\Big(\big(\mathbf{X}^{T} \mathbf{X} \big)^{-1}\Big).$$
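As a sanity check, the identity above can be verified numerically. The sketch below (assuming NumPy; the design matrix, $\boldsymbol{\beta}$, and $\sigma$ are arbitrary illustrative choices) compares the Monte Carlo average of $\|\hat{\boldsymbol{\beta}} - \boldsymbol{\beta}\|_2^2$ against $\sigma^2 \mathrm{tr}\big((\mathbf{X}^{T}\mathbf{X})^{-1}\big)$ for a fixed design:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, sigma = 50, 3, 2.0
X = rng.standard_normal((n, p))        # fixed design matrix
beta = np.array([1.0, -2.0, 0.5])      # true coefficients (arbitrary choice)

XtX_inv = np.linalg.inv(X.T @ X)
theory = sigma**2 * np.trace(XtX_inv)  # sigma^2 tr((X^T X)^{-1})

trials = 20000
total = 0.0
for _ in range(trials):
    y = X @ beta + sigma * rng.standard_normal(n)   # fresh Gaussian noise
    beta_hat = XtX_inv @ X.T @ y                    # least-squares estimate
    total += np.sum((beta_hat - beta) ** 2)
mc = total / trials                                 # Monte Carlo risk estimate

print(mc, theory)  # the two values should agree up to Monte Carlo error
```

The Monte Carlo average and the closed-form trace expression should match to within a few percent at this number of trials.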

My question: does this expression imply that the risk goes to zero as $n$ goes to infinity (i.e., we have more and more data)?

This requires $\lim_{n \to \infty} \mathrm{tr}\Big(\big(\mathbf{X}^{T} \mathbf{X} \big)^{-1}\Big) = 0$, which seems "trivial" when $p = 1$: there $\mathrm{tr}\big((\mathbf{X}^{T}\mathbf{X})^{-1}\big) = 1/\sum_{i=1}^n x_i^2$, which tends to zero whenever $\sum_i x_i^2 \to \infty$.

On BEST ANSWER

Note that \begin{align*} \mathrm{tr}\Big(\big(\mathbf{X}^{T} \mathbf{X} \big)^{-1}\Big) &= \frac{1}{n}\mathrm{tr}\Big(\Big(\frac{1}{n}\mathbf{X}^{T} \mathbf{X} \Big)^{-1}\Big). \end{align*} Now, in regression analysis there is usually an assumption guaranteeing that $$\frac{1}{n}\mathbf{X}^{T} \mathbf{X} \to \Sigma \quad \text{as } n \to \infty,\tag{1}$$ where $\Sigma \in \mathbb R^{p\times p}$ is some symmetric positive definite matrix. If the rows $X_i$ of $\mathbf{X}$ are i.i.d. with $\mathbb E[X_i X_i^T] = \Sigma$, this follows from the law of large numbers; if the $X_i$ are deterministic, the assumption is typically imposed directly.

Now assume that $(1)$ holds. Since the trace and matrix inversion are continuous at every positive definite matrix, $\mathrm{tr}\Big(\big(\frac{1}{n}\mathbf{X}^{T} \mathbf{X} \big)^{-1}\Big) \to \mathrm{tr}(\Sigma^{-1})$ as $n\to \infty$, and therefore $$\frac{1}{n}\mathrm{tr}\Big(\Big(\frac{1}{n}\mathbf{X}^{T} \mathbf{X} \Big)^{-1}\Big) \to 0 \quad \text{as } n\to \infty,$$ so the quadratic risk indeed vanishes.
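The $1/n$ decay of the trace can be seen empirically. The sketch below (assuming NumPy) draws i.i.d. standard normal rows, so $\Sigma = I_p$ and $\mathrm{tr}(\Sigma^{-1}) = p$; the trace of $(\mathbf{X}^{T}\mathbf{X})^{-1}$ should track $p/n$ and shrink toward zero as $n$ grows:

```python
import numpy as np

rng = np.random.default_rng(1)
p = 4  # rows are standard normal, so Sigma = I_p and tr(Sigma^{-1}) = p

traces = {}
for n in [100, 1000, 10000]:
    X = rng.standard_normal((n, p))
    traces[n] = np.trace(np.linalg.inv(X.T @ X))
    print(n, traces[n], p / n)  # the trace tracks tr(Sigma^{-1})/n = p/n
```

As $n$ increases tenfold, the observed trace drops by roughly a factor of ten, matching the $\mathrm{tr}(\Sigma^{-1})/n$ asymptotics in the argument above.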