$y$ is a random variable, and $x$ is a $(k+1)\times 1$ random vector: $$ x=(1,x_1,\ldots,x_k)'. $$ I recall that a nonrandom $1\times(k+1)$ vector $\beta$ minimizing $E[(y-\beta x)^2]$ is given by $$ \hat\beta=E(yx')E(xx')^{-1}.\tag{$*$} $$ I test ($*$) on the case $y=\alpha$ for some nonrandom $\alpha$. Clearly the optimal $\beta$ should be $(\alpha,0,\ldots,0)$, but that does not seem to be what ($*$) gives me. What am I missing?
Clarification: At first I suspected that ($*$) itself was wrong somehow, but for all $\beta$ \begin{align} E[(y-\beta x)^2]&=E[((\hat\beta-\beta)x+(y-\hat\beta x))^2]\\ &=E[((\hat\beta-\beta)x)^2]+E[(y-\hat\beta x)^2]+2(\hat\beta-\beta)E[x(y-\hat\beta x)]. \end{align} Since $$ E[(y-\hat\beta x)x']=E(yx')-E(yx')E(xx')^{-1}E(xx')=0\implies E[x(y-\hat\beta x)]=0, $$ we get, $\forall\beta$, $$ E[(y-\beta x)^2]= E[((\hat\beta-\beta)x)^2]+E[(y-\hat\beta x)^2]\geq E[(y-\hat\beta x)^2], $$ so ($*$) actually seems fine. Frustratingly, these are supposed to be standard results, so why doesn't the simple example above work?
Answer: the toy example does not fail!
Let $z=\begin{pmatrix}z_1 & \cdots & z_k\end{pmatrix}'$ so that $x=\begin{pmatrix}1 \\ z\end{pmatrix}$. Then \begin{align*} E(\alpha x')E(x x')^{-1}&=\alpha\begin{pmatrix}1 & E(z')\end{pmatrix}\begin{pmatrix}1 & E(z')\\ E(z) & E(zz')\end{pmatrix}^{-1}\\ &=\alpha\begin{pmatrix}1 & E(z')\end{pmatrix}\begin{pmatrix}1+E(z')[\text{Var}(z)]^{-1}E(z)&-E(z')[\text{Var}(z)]^{-1}\\ -[\text{Var}(z)]^{-1}E(z) & [\text{Var}(z)]^{-1}\end{pmatrix}\\ &=\alpha\begin{pmatrix}1 & 0_{1\times k}\end{pmatrix}, \end{align*} where the second line applies the block-inversion formula with Schur complement $\text{Var}(z)=E(zz')-E(z)E(z')$ (assumed invertible). This is exactly the $(\alpha,0,\ldots,0)$ you expected.
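If you want to see this numerically, here is a small sketch (my own, not part of the derivation above): it draws an arbitrary nondegenerate $z$, sets $y=\alpha$ identically, and evaluates $E(yx')E(xx')^{-1}$ using sample moments. The distribution of $z$ and all the constants are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, k, n = 3.0, 2, 200_000

z = rng.normal(loc=1.5, scale=2.0, size=(n, k))  # arbitrary nondegenerate z
x = np.hstack([np.ones((n, 1)), z])              # rows are x' = (1, z')
y = np.full(n, alpha)                            # constant y = alpha

E_yx = (y[:, None] * x).mean(axis=0)             # sample analogue of E(y x')
E_xx = x.T @ x / n                               # sample analogue of E(x x')
beta_hat = E_yx @ np.linalg.inv(E_xx)            # E(y x') E(x x')^{-1}

print(beta_hat)                                  # ≈ [alpha, 0, ..., 0]
```

Note that with $y$ exactly constant the result is exact up to floating-point error, not just approximate in large samples: the block-inversion algebra above goes through verbatim with sample moments in place of population moments.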