General Prediction Theory


Suppose that we have random variables $Y, X_1, \ldots, X_{p-1}$. Let $X = (X_1, X_2, \ldots, X_{p-1})^\top$. To predict $Y$ from the values of $X$, we use a predictor $f(X)$ of $Y$ chosen so that the mean squared error is minimized, i.e., $$\min_f E [Y - f(X)]^2.$$

Theorem: Let $\varrho (X) = E[Y|X].$ Then for any other predictor $f(X)$, $$E[Y - \varrho (X)]^2 \leq E[Y - f(X)]^2,$$ so $\varrho(X)$ is the best predictor of $Y.$
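The theorem can be checked numerically. Below is a minimal Monte Carlo sketch (the model $Y = X^2 + \varepsilon$ and the competing predictor $f(X) = 2X$ are illustrative assumptions, not part of the question): since $E[Y|X] = X^2$ here, its empirical MSE should be smaller than that of any other predictor.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x = rng.integers(0, 3, size=n)           # X uniform on {0, 1, 2}
y = x**2 + rng.normal(0.0, 1.0, size=n)  # Y = X^2 + noise, so E[Y|X] = X^2

mse_best = np.mean((y - x**2) ** 2)      # rho(X) = E[Y|X]
mse_other = np.mean((y - 2 * x) ** 2)    # an arbitrary competitor f(X) = 2X

# rho attains the noise variance (here 1); the competitor pays extra.
assert mse_best < mse_other
```

The empirical MSE of $\varrho(X)$ concentrates near the noise variance $1$, while the competitor's MSE is strictly larger, matching the inequality in the theorem.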

Now suppose we want to predict the vector $Y = (Y_1, \ldots, Y_q)^\top$ from $(X_1, \ldots, X_{p-1})$. The predictor $f(X)$ is best if it minimizes the scalar $$\min_f E\{[Y - f(X)]^\top [Y - f(X)]\}.$$ With a vector $Y$, what modifications do I have to make for the above theorem to hold? Any suggestions would be appreciated.


There is 1 best solution below


If $Y$ is a vector with $q$ components, then the predictor $f(X)$ must be a vector of $q$ component predictors, which you can write as $$ f(X):=(f_1(X),f_2(X),\ldots, f_q(X)).\tag1 $$ Now use the fact that $v^\top v=\sum_{i=1}^q v_i^2$ for any vector $v:=(v_1,\ldots,v_q)^\top$ to write $$ [Y-f(X)]^\top[Y-f(X)] = \sum_{i=1}^q [Y_i-f_i(X)]^2,\tag2 $$ so the expectation of (2) is $$ E\left([Y-f(X)]^\top[Y-f(X)]\right) = \sum_{i=1}^q E\left([Y_i-f_i(X)]^2\right).\tag3 $$ Your goal is to minimize (3) over all $(f_1,\ldots,f_q)$. Since (3) is a sum of $q$ terms, where term $i$ involves only $Y_i$, $(X_1,\ldots,X_{p-1})$, and $f_i$, you can minimize each term separately using the Theorem: the minimizer is $f_i(X)=E[Y_i\mid X]$ for each $i$.
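The decomposition (2)-(3) is easy to verify numerically. The sketch below (the two-component model with conditional means $X^2$ and $3X$ is an illustrative assumption) checks that the vector criterion $E\{[Y-f(X)]^\top[Y-f(X)]\}$ equals the sum of the componentwise mean squared errors:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
x = rng.integers(0, 3, size=n)

# Vector Y with q = 2 components; conditional means are x**2 and 3*x.
y = np.column_stack([x**2 + rng.normal(0.0, 1.0, n),
                     3 * x + rng.normal(0.0, 1.0, n)])

# Componentwise best predictors, stacked as in (1).
f = np.column_stack([x**2, 3 * x])

# Identity (2)-(3): the vector MSE is the sum of the componentwise MSEs.
total = np.mean(np.sum((y - f) ** 2, axis=1))
per_component = np.mean((y - f) ** 2, axis=0)
assert np.isclose(total, per_component.sum())
```

Because the total splits exactly into the $q$ componentwise MSEs, minimizing each component with $f_i(X)=E[Y_i\mid X]$ minimizes the whole sum, which is the modification the question asks about.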