Suppose we have a matrix $x \in \mathbb{R}^{n \times k}$, a vector $y \in \mathbb{R}^n$, and the vector $\beta^* = \arg\min_\phi\sum_i (y_i - x_i\phi)^2$, i.e. the classic regression problem where $\beta^*$ is the $OLS(y, x)$ solution. The fitted model is $y = x\beta^* + \varepsilon$, where $\varepsilon$ is the vector of residuals, whose parameters and distribution are unknown and are what we want to estimate.
Using the bootstrap, we construct a vector $\varepsilon^b$ by resampling from $\varepsilon$, form $y^b = x\beta^* + \varepsilon^b$, and then estimate $\beta^b = OLS(y^b, x)$. Repeating this infinitely many times yields the exact bootstrap distribution of the random vector $\beta^b$.
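One replication of this resampling scheme can be sketched in numpy (the data, coefficients, and variable names below are all hypothetical, just to make the sketch runnable):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: intercept column plus k regressors (illustrative values)
n, k = 200, 2
x = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = x @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

# OLS fit and the residual vector epsilon
beta_star, *_ = np.linalg.lstsq(x, y, rcond=None)
resid = y - x @ beta_star

# One bootstrap replication: draw epsilon^b i.i.d. uniformly from the
# residuals, rebuild y^b on the same design matrix, re-estimate beta^b
eps_b = rng.choice(resid, size=n, replace=True)
y_b = x @ beta_star + eps_b
beta_b, *_ = np.linalg.lstsq(x, y_b, rcond=None)
```

Note that the design matrix $x$ is held fixed across replications; only the residuals are resampled.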
Now, given that we included an intercept term (i.e. a column of ones in the $x$ matrix, making it $x \in \mathbb{R}^{n \times (k+1)}$), we need to prove that $\mathbb{E}\beta^b = \beta^*$ and that the covariance matrix is $\operatorname{cov}(\beta^b) = \frac{\sum_i (y_i - x_i\beta^*)^2}{n}(x^Tx)^{-1}$.
My approach: since $x$ contains a column of ones and the residual vector $\varepsilon$ is orthogonal to every column of $x$, we get $\varepsilon \cdot 1 = 0 \Rightarrow \bar{\varepsilon} = 0$. Because each $\varepsilon_i^b$ is drawn uniformly from $\{\varepsilon_j\}$, the LLN over infinitely many replications gives $\mathbb{E}\varepsilon^b = \bar{\varepsilon} = 0$. From the normal equations, $\beta^b = (x^Tx)^{-1}x^Ty^b = \beta^* + (x^Tx)^{-1}x^T\varepsilon^b$, so $\mathbb{E}\beta^b = \beta^* + (x^Tx)^{-1}x^T\mathbb{E}\varepsilon^b = \beta^*$. But I have trouble with the second fact.
Question: how can we prove the second claim? How do we show that each $\varepsilon_i^b$ has variance $\frac{\sum_i (y_i - x_i\beta^*)^2}{n}$?
First, $$ \beta^b=\beta^*+(x^{\top}x)^{-1}x^{\top}\varepsilon^b. $$

Now, writing $\varepsilon_j^* = y_j - x_j\beta^*$ for the fitted residuals, for $i=1,\ldots,n$, $$ \mathsf{E}[\varepsilon_i^b\mid y, x]=\frac{1}{n}\sum_{j=1}^n \varepsilon_j^*=0, $$ $\operatorname{Cov}(\varepsilon_i^b,\varepsilon_j^b\mid y, x)=0$ for $i\ne j$ (the draws are independent), and $$ \operatorname{Var}(\varepsilon_i^b\mid y, x)=\frac{1}{n}\sum_{j=1}^n (\varepsilon_j^{*})^2=:\hat\sigma^2. $$

Thus, $$ \mathsf{E}[\beta^b\mid y, x]=\beta^*, $$ and, since $\operatorname{Var}(\varepsilon^b\mid y, x)=\hat\sigma^2 I_n$, $$ \operatorname{Var}(\beta^b\mid y, x)=(x^{\top}x)^{-1}x^{\top}\operatorname{Var}(\varepsilon^b\mid y, x)\,x(x^{\top}x)^{-1}=\hat\sigma^2(x^{\top}x)^{-1}=(x^{\top}x)^{-1}\frac{1}{n}\sum_{j=1}^n (\varepsilon_j^{*})^2. $$
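Both moments can be sanity-checked by Monte Carlo (a numpy sketch on hypothetical data; with a finite number of replications $B$ the agreement is only approximate):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data with an intercept column
n, k = 300, 2
x = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = x @ np.array([0.5, 1.0, -2.0]) + rng.normal(size=n)

beta_star, *_ = np.linalg.lstsq(x, y, rcond=None)
resid = y - x @ beta_star
sigma2_hat = np.mean(resid**2)                    # (1/n) sum_j (eps_j*)^2
target_cov = sigma2_hat * np.linalg.inv(x.T @ x)  # claimed Cov(beta^b | y, x)

# beta^b = beta^* + (x'x)^{-1} x' eps^b, vectorized over B replications
proj = np.linalg.solve(x.T @ x, x.T)              # (x'x)^{-1} x'
B = 20000
eps_B = rng.choice(resid, size=(B, n), replace=True)
betas = beta_star + eps_B @ proj.T

mean_err = np.abs(betas.mean(axis=0) - beta_star).max()          # near 0
cov_err = np.abs(np.cov(betas, rowvar=False) - target_cov).max() # near 0
```

The vectorized line `beta_star + eps_B @ proj.T` is exactly the identity $\beta^b=\beta^*+(x^{\top}x)^{-1}x^{\top}\varepsilon^b$ applied to each resampled residual vector at once.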