How to formalize Elements of Statistical Learning's computation for the EPE of linear regression


Let $X\in \mathbb R^d,Y\in \mathbb R$ be random variables. The expected prediction error of a predictor $f$ is

$$ \mathbb E[(f(X) - Y)^2]. $$

Suppose the true distribution of data satisfies

  • There are $\beta\in\mathbb R^d$ and $\sigma^2\in\mathbb R_{>0}$ such that $Y\mid X\sim \mathcal N(\beta^TX, \sigma^2)$.
  • $\mathbb E[X] = 0$.

Suppose we draw a sample of $N$ points from this distribution, and let $f$ be the least-squares linear predictor fit to this sample.

Equation 2.28 of Elements of Statistical Learning computes the expected prediction error of $f$ as approximately being

$$ EPE(f)\approx \sigma^2(d/N) + \sigma^2. $$
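Before trying to formalize it, the approximation can at least be checked numerically. Below is a minimal Monte Carlo sketch (my own setup, not from ESL) that assumes the coordinates of $X$ are i.i.d. standard normal, so $\mathbb E[X]=0$ and $\operatorname{Cov}(X)=I_d$:

```python
import numpy as np

rng = np.random.default_rng(0)
d, N, sigma = 5, 200, 1.0
beta = rng.standard_normal(d)
trials = 2000

errs = np.empty(trials)
for t in range(trials):
    # Training sample: rows of X are i.i.d. N(0, I_d), y = X beta + noise
    X = rng.standard_normal((N, d))
    y = X @ beta + sigma * rng.standard_normal(N)
    beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
    # Fresh test point (x0, y0) from the same distribution, independent of X
    x0 = rng.standard_normal(d)
    y0 = x0 @ beta + sigma * rng.standard_normal()
    errs[t] = (x0 @ beta_hat - y0) ** 2

print("Monte Carlo EPE:", errs.mean())
print("ESL approximation:", sigma**2 * (d / N + 1))
```

With these values the two numbers agree to within Monte Carlo error, which supports the approximation but of course does not make it rigorous.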

Is there a way to make this more formal?

It is only approximate because it uses the large-sample limit $\mathbf{X}^T\mathbf{X}/N\to \operatorname{Cov}(X)$ as $N\to\infty$, where $\mathbf{X}$ is the $N\times d$ data matrix, to conclude that

$$ \mathbb E [x_0^T (\mathbf{X}^T\mathbf{X})^{-1} x_0] \approx \mathbb E [x_0^T \operatorname{Cov}(X)^{-1} x_0]/N, $$

where $x_0$ is another observation of $X$ independent of $\mathbf{X}$.
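For completeness, the right-hand side evaluates to $d/N$ by the standard trace trick (assuming $\operatorname{Cov}(X)$ is invertible, and using $\mathbb E[X]=0$ so that $\mathbb E[x_0 x_0^T]=\operatorname{Cov}(X)$):

$$ \mathbb E [x_0^T \operatorname{Cov}(X)^{-1} x_0] = \mathbb E \big[\operatorname{tr}\!\big(\operatorname{Cov}(X)^{-1} x_0 x_0^T\big)\big] = \operatorname{tr}\!\big(\operatorname{Cov}(X)^{-1}\operatorname{Cov}(X)\big) = \operatorname{tr}(I_d) = d, $$

which, multiplied by $\sigma^2$, gives the $\sigma^2(d/N)$ term in Equation 2.28.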