Statistical error in the approximation-estimation tradeoff

Question

74 Views Asked by Bumbble Comm At 25 Mar 2026 - 11:35

Show that $$E(g_\tau ^G(X)-g^* (X))^2 = E(X^T \hat{\beta}-X^T\beta^G)^2+E(X^T\beta^G-g^*(X))^2$$

where $g_\tau ^G(X) = X^T \hat{\beta}$ and $g^G(X) = X^T \beta^G$. Here $G$ is a class of linear functions and $\beta$ is a parameter vector.

What I've done:

$$\begin{aligned}
E(g_\tau ^G(X)-g^* (X))^2 &= E(g_\tau ^G(X) - g^G(X) + g^G(X) - g^* (X))^2 \\
&= E(X^T \hat{\beta} - X^T\beta^G + X^T\beta^G - g^* (X))^2 \\
&= E(X^T \hat{\beta}-X^T\beta^G)^2+E(X^T\beta^G-g^*(X))^2 \\
&\quad +2E[(X^T\hat{\beta} - X^T\beta^G)(X^T\beta ^G - g^*(X))]
\end{aligned}$$

What's left to show is that the cross term vanishes: $$2E[(X^T\hat{\beta} - X^T\beta^G)(X^T\beta ^G - g^*(X))]=0$$ Proving this would solve the problem.

My attempt, which is where I got stuck: $$2E[(X^T\hat{\beta} - X^T\beta^G)(X^T\beta ^G - g^*(X))]=2E[X^T\hat{\beta}X^T\beta ^G-X^T\hat{\beta}g^*(X) - X^T\beta^GX^T \beta^G+X^T\beta^Gg^*(X)]$$

From here I'm not sure how to go on. This might be the wrong way to solve the question.

Please comment if something is unclear. I'm just trying to learn!


**Bumbble Comm** · Answer 1 · 2020-06-28 13:32:26

I’ve been wondering about this one, too (I assume we’re reading the same book). After doing some searching online, I think the way to think about this is to consider the expectation as over two variables: a training set $\tau$ and a test point $(X,Y)$. So, $\mathbb{E}_{(X,Y),\tau}[ \cdots ] = \mathbb{E}_{(X,Y)}[\mathbb{E}_{\tau}[ \cdots ]]$. You can do this because the test point is independent of the training set, and so you can swap the inner and outer integrals. Then, you want to be able to say something like $\mathbb{E}_{\tau}[\hat{\beta}] = \beta^{\mathcal{G}}$, which would kill the inner integral, since everything in it but the $\hat{\beta}$ is fixed.

Now I just need to justify that to myself, which, you’d think, would have to do with the fact that we’re dealing with linear functions, since we haven’t really used that anywhere else.

Update:

Ok, I think I know what to do, based on more online reading. Note that $\hat{\beta}$ and $\beta^{\mathcal{G}}$ do not depend on the test point $(X,Y)$. So you can basically write $\mathbb{E}[(X^T \hat{\beta} - X^T \beta^{\mathcal{G}})(X^T \beta^{\mathcal{G}} - g^{*}(X))] = \mathbb{E}[(X^T \hat{\beta} - X^T \beta^{\mathcal{G}})(X^T \beta^{\mathcal{G}} - Y)]$. You can do this (I think) because $g^{*}(X) = \mathbb{E}[Y|X]$, so we basically have $\mathbb{E}[c \cdot r(X)\cdot \mathbb{E}[Y|X]]$, where $c$ represents quantities that are constant with respect to the test point and $r(X)$ represents stuff depending only on $X$. This is then $\mathbb{E}[\mathbb{E}[c \cdot r(X) \cdot Y| X]] = \mathbb{E}[c\cdot r(X)\cdot Y]$.

Anyway, we then have $\mathbb{E}[(\hat{\beta}-\beta^{\mathcal{G}})^T X (X^T \beta^{\mathcal{G}}-Y)]$. The normal equations that define $\beta^{\mathcal{G}}$ give $\mathbb{E}_{(X,Y)}[X(X^T\beta^{\mathcal{G}}-Y)] = 0$. So now use independence to write $\mathbb{E}_{\tau,(X,Y)}[\cdots] = \mathbb{E}_\tau[(\hat{\beta}-\beta^{\mathcal{G}})^T \, \mathbb{E}_{(X,Y)}[X(X^T\beta^{\mathcal{G}}-Y)]]$, and the inner expectation is 0.
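
For anyone who wants a numerical sanity check of the argument, here is a minimal sketch in Python with NumPy. Everything in it is an assumption made for illustration and not taken from the book: the test point is uniform on a finite grid (so expectations over the test point are exact averages), the regression function `g_star` is a made-up nonlinear function, and `fit_ols`, `decomposition`, the noise level and the training set size are hypothetical choices. It solves the population normal equations for $\beta^{\mathcal{G}}$, fits $\hat{\beta}$ on many random training sets, and checks that the cross term is zero up to floating-point error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed finite "population" of test points: X = (1, x), x on a uniform grid.
x = np.linspace(-1.0, 1.0, 201)
X = np.column_stack([np.ones_like(x), x])      # shape (201, 2)
g_star = np.sin(3.0 * x) + x**2                # hypothetical g*(x) = E[Y | X = x]

# beta^G solves the population normal equations E[X X^T] beta = E[X g*(X)].
A = X.T @ X / len(x)
b = X.T @ g_star / len(x)
beta_G = np.linalg.solve(A, b)

def decomposition(beta_hat):
    """Exact test-point expectations for one fixed beta_hat."""
    lhs = np.mean((X @ beta_hat - g_star) ** 2)
    est = np.mean((X @ beta_hat - X @ beta_G) ** 2)
    approx = np.mean((X @ beta_G - g_star) ** 2)
    cross = np.mean((X @ beta_hat - X @ beta_G) * (X @ beta_G - g_star))
    return lhs, est, approx, cross

def fit_ols(n_train, noise_sd=0.5):
    """Draw a training set tau from the population and fit least squares."""
    idx = rng.integers(0, len(x), size=n_train)
    y = g_star[idx] + noise_sd * rng.normal(size=n_train)
    beta_hat, *_ = np.linalg.lstsq(X[idx], y, rcond=None)
    return beta_hat

# Average over many training sets (the outer expectation over tau).
results = np.array([decomposition(fit_ols(30)) for _ in range(200)])
lhs, est, approx, cross = results.mean(axis=0)
print("total error:             ", lhs)
print("estimation + approximation:", est + approx)
print("cross term:              ", cross)
```

Note that in this sketch the cross term vanishes for every single training set, not just on average, because the test-point expectation factors as $(\hat{\beta}-\beta^{\mathcal{G}})^T \mathbb{E}[X(X^T\beta^{\mathcal{G}} - g^{*}(X))]$ and the second factor is zero by the normal equations. That matches the last step above, where only the independence of the test point from $\tau$ is needed.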
