Proving $a$ (in $Y = aX + b + e$) satisfies $a = \operatorname{Cov}(X, Y)/\operatorname{Var}(X)$


In a linear regression model, we postulate that random variables $X$ and $Y$ are related by

$$Y = aX + b + e$$

where $a$ and $b$ are constants (called the regression coefficients) and $e$ (representing random error) is a random variable independent of $X$ such that $E(e) = 0$.

Show that the coefficient $a$ satisfies $a = \operatorname{Cov}(X, Y)/\operatorname{Var}(X)$.


There is 1 solution below


1) Assume that $(X,Y)$ follows a bivariate normal distribution. Then $$ \mathbb E[Y \mid X=x] = \mu_Y + \rho \frac{\sigma_Y}{\sigma_X}(x-\mu_X) = \mu_Y - \mu_X\rho \frac{\sigma_Y}{\sigma_X} + \frac{\rho\sigma_Y\sigma_X}{\sigma_X^2}x = \beta_0 + \beta_1 x. $$ Since $\operatorname{cov}(X,Y)=\rho \sigma_X \sigma_Y$, it follows that $\beta_1 = \operatorname{cov}(X,Y)/\operatorname{var}(X)$, where the slope $\beta_1$ plays the role of $a$ in the question's notation.

2) Assume that $Y_i = a + b x_i + \epsilon_i$ for $i = 1, \dots, n$, with $\mathbb E\epsilon_i = 0$ and $\mathbb E\epsilon_i^2 = \sigma^2$ (here $b$ is the slope, i.e. the question's $a$). One can show that the least squares estimator of $b$ is $$ \hat b_n = \frac{\sum_{i=1}^n (Y_i-\bar{Y}_n)(x_i - \bar{x})}{\sum_{i=1}^n (x_i - \bar{x})^2}, $$ which can be viewed as $\frac{\sum_{i=1}^n (Y_i-\bar{Y}_n)(x_i - \bar{x})/n}{\sum_{i=1}^n (x_i - \bar{x})^2/n}=\frac{\widehat{\operatorname{cov}}(X,Y)}{\widehat{\operatorname{var}}(X)}$. This is a consistent estimator of $\operatorname{cov}(X,Y)/\operatorname{var}(X)$, i.e., for every $\delta > 0$, $$ \lim_{n\to \infty}P\left(\left|\hat b_n - \frac{\operatorname{cov}(X,Y)}{\operatorname{var}(X)}\right|>\delta\right)=0. $$
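
For the model exactly as stated in the question, $Y = aX + b + e$, the identity can also be obtained directly from bilinearity of covariance, without assuming normality. This is a short sketch, assuming $X$ and $e$ have finite second moments and $\operatorname{Var}(X) > 0$ (neither is stated in the question). Since $b$ is a constant and $e$ is independent of $X$, $$ \operatorname{Cov}(X, Y) = \operatorname{Cov}(X, aX + b + e) = a\operatorname{Var}(X) + \operatorname{Cov}(X, b) + \operatorname{Cov}(X, e) = a\operatorname{Var}(X), $$ and dividing by $\operatorname{Var}(X)$ gives $a = \operatorname{Cov}(X, Y)/\operatorname{Var}(X)$.

A small simulation can serve as a sanity check. The distributions, the parameter values, and the helper names `simulate` and `cov_over_var` below are illustrative assumptions, not part of the question.

```python
import random

def simulate(a, b, n, seed=0):
    """Draw n pairs (x, y) from Y = a*X + b + e, with e independent of X and E(e) = 0."""
    rng = random.Random(seed)
    xs = [rng.uniform(-2.0, 5.0) for _ in range(n)]  # X is deliberately not normal
    es = [rng.gauss(0.0, 1.5) for _ in range(n)]     # mean-zero error, drawn independently of X
    ys = [a * x + b + e for x, e in zip(xs, es)]
    return xs, ys

def cov_over_var(xs, ys):
    """Sample Cov(X, Y) / Var(X); the 1/n factors cancel, so this equals the least squares slope."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

xs, ys = simulate(a=2.0, b=-1.0, n=200_000)
slope = cov_over_var(xs, ys)
print(slope)  # should be near the true a = 2.0 for large n
```

With noise-free data the ratio recovers the slope exactly, for example `cov_over_var([0, 1, 2], [1, 3, 5])` returns `2.0`.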