Variance of the least-squares estimator after duplicating the dataset


This question concerns the variance of the least-squares estimator $$ \hat{\beta} = (X^TX)^{-1}X^TY $$

Note that $X \in \mathbb{R}^{n \times p}$ is fixed with full column rank and $Y \in \mathbb{R}^{n \times 1}$ is random. Consider the case where $\text{var}(Y) = \sigma^2 I_{n \times n}$.

$$ \text{var}(\hat{\beta}) = \text{var}((X^TX)^{-1}X^TY) \\ = (X^TX)^{-1}X^T\,\text{var}(Y)\,X(X^TX)^{-1} \\ = (X^TX)^{-1}X^T\,\sigma^2 I\,X(X^TX)^{-1} \\ = \sigma^2(X^TX)^{-1}X^TX(X^TX)^{-1} \\ = \sigma^2(X^TX)^{-1} $$

So we derived this generically for some $X$ and $Y$ under the assumption that $\text{var}(Y) = \sigma^2 I_{n \times n}$.
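The generic result $\text{var}(\hat{\beta}) = \sigma^2(X^TX)^{-1}$ can be sanity-checked by simulation. The following is only a sketch: the sizes `n` and `p`, the noise level `sigma`, the coefficients `beta_true`, and the Gaussian noise are all assumptions made for illustration and are not part of the question.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is reproducible

n, p = 50, 3                             # hypothetical sizes
sigma = 2.0                              # hypothetical noise standard deviation
X = rng.normal(size=(n, p))              # drawn once, then treated as fixed
beta_true = np.array([1.0, -2.0, 0.5])   # hypothetical true coefficients

XtX_inv = np.linalg.inv(X.T @ X)
theoretical_cov = sigma**2 * XtX_inv     # sigma^2 (X^T X)^{-1}

# Draw many independent Y vectors with var(Y) = sigma^2 I.
# Each column of Y is one realisation.
n_sims = 20000
Y = X @ beta_true[:, None] + sigma * rng.normal(size=(n, n_sims))

# Least-squares estimate for every realisation at once: shape (p, n_sims).
beta_hats = XtX_inv @ X.T @ Y

# np.cov treats rows as variables and columns as observations.
empirical_cov = np.cov(beta_hats)

print("largest absolute deviation:",
      np.max(np.abs(empirical_cov - theoretical_cov)))
```

With this many draws, the empirical covariance should agree with the formula up to Monte Carlo error.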

Now consider the particular case $X = X_s$, $Y = Y_s$, $\text{var}(Y_s) = \sigma_s^2 I$. Then $$ \hat{\beta}_s =(X_s^TX_s)^{-1}X_s^TY_s \\ \text{var}(\hat{\beta}_s)= \sigma_s^2(X_s^TX_s)^{-1} $$

Now suppose we duplicate the above dataset by stacking it on top of itself, i.e., we have

$$ X_d = [X_s^T \ \ X_s^T]^T \\ Y_d = [Y_s^T \ \ Y_s^T]^T $$

Then, since $X_d^TX_d = 2X_s^TX_s$ and $X_d^TY_d = 2X_s^TY_s$, we have $$ \hat{\beta}_d = (X_d^TX_d)^{-1}X_d^TY_d \\ = (2X_s^TX_s)^{-1}2X_s^TY_s = \hat{\beta}_s $$
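The identity $\hat{\beta}_d = \hat{\beta}_s$ is easy to confirm numerically. A minimal sketch, assuming arbitrary random data; the helper name `ols` and the sizes are hypothetical and only serve the illustration:

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed so the run is reproducible

n, p = 20, 3                    # hypothetical sizes
X_s = rng.normal(size=(n, p))
Y_s = rng.normal(size=n)

# Duplicate the dataset by stacking the rows on top of themselves.
X_d = np.vstack([X_s, X_s])        # shape (2n, p)
Y_d = np.concatenate([Y_s, Y_s])   # shape (2n,)


def ols(X, Y):
    """Solve the normal equations (X^T X) beta = X^T Y."""
    return np.linalg.solve(X.T @ X, X.T @ Y)


beta_s = ols(X_s, Y_s)
beta_d = ols(X_d, Y_d)

print("X_d^T X_d == 2 X_s^T X_s:", np.allclose(X_d.T @ X_d, 2 * X_s.T @ X_s))
print("beta_d == beta_s:", np.allclose(beta_d, beta_s))
```

Both comparisons should hold to floating-point precision, since the factor of 2 cancels in the normal equations.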

So the two estimates coincide. Now for the variance, applying the generic formula under the assumption $\text{var}(Y_d) = \sigma_d^2 I$, we have

$$ \text{var}(\hat{\beta}_d) = \sigma_d^2 (X_d^TX_d)^{-1} \\ = \sigma_d^2 (2X_s^TX_s)^{-1} = \sigma_d^2 \frac{1}{2}(X_s^TX_s)^{-1} $$

My question is, how is $\sigma_d^2$ related to $\sigma_s^2$? Are they identical?
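A simulation may help probe this. The sketch below is an assumption-laden illustration rather than a definitive answer: the sizes, `sigma_s`, `beta_true`, and the Gaussian noise are hypothetical choices. It redraws $Y_s$ many times, duplicates each draw, and compares the empirical covariance of $\hat{\beta}_d$ with the two candidate expressions $\sigma_s^2(X_s^TX_s)^{-1}$ and $\frac{1}{2}\sigma_s^2(X_s^TX_s)^{-1}$. It also estimates the covariance between a response and its copy, which is what decides whether $\text{var}(Y_d)$ can be a multiple of the identity.

```python
import numpy as np

rng = np.random.default_rng(2)  # fixed seed so the run is reproducible

n, p = 30, 2                         # hypothetical sizes
sigma_s = 1.5                        # hypothetical noise standard deviation
X_s = rng.normal(size=(n, p))        # fixed design
beta_true = np.array([0.5, -1.0])    # hypothetical true coefficients
X_d = np.vstack([X_s, X_s])          # duplicated design, shape (2n, p)

# Each column is one realisation of Y_s with var(Y_s) = sigma_s^2 I.
n_sims = 20000
Y_s = X_s @ beta_true[:, None] + sigma_s * rng.normal(size=(n, n_sims))
Y_d = np.vstack([Y_s, Y_s])          # duplicated responses, shape (2n, n_sims)

# Least squares on the duplicated data for every realisation: shape (p, n_sims).
beta_d = np.linalg.solve(X_d.T @ X_d, X_d.T @ Y_d)
empirical_cov_d = np.cov(beta_d)

cov_single = sigma_s**2 * np.linalg.inv(X_s.T @ X_s)  # sigma_s^2 (X_s^T X_s)^{-1}
cov_halved = 0.5 * cov_single                         # with the extra factor 1/2

print("deviation from sigma_s^2 (X_s^T X_s)^{-1}:    ",
      np.max(np.abs(empirical_cov_d - cov_single)))
print("deviation from 0.5 sigma_s^2 (X_s^T X_s)^{-1}:",
      np.max(np.abs(empirical_cov_d - cov_halved)))

# Covariance between the first response and its duplicate (row n of Y_d).
cov_Yd = np.cov(Y_d)                 # shape (2n, 2n)
print("cov(Y_d[0], Y_d[n]):", cov_Yd[0, n], "vs sigma_s^2 =", sigma_s**2)
```

Because $\hat{\beta}_d = \hat{\beta}_s$ for every draw, the empirical covariance should track $\sigma_s^2(X_s^TX_s)^{-1}$ rather than the halved expression. The nonzero covariance between each response and its copy indicates that $\text{var}(Y_d)$ is not of the form $\sigma_d^2 I$ under exact duplication, so that assumption is the step worth checking.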