The linear regression problem can be stated in general by considering random variables $X \in \mathbb{R}^p$ (a row vector) and $Y \in \mathbb{R}$. The linear regression problem is then to find:
$$ \arg\min_{\beta \in \mathbb{R}^p} \mathbb{E}\left[(Y-X\beta)^2\right] $$
which has a closed form solution given by:
$$ \widehat{\beta} = \mathbb{E}[X^TX]^{-1}\mathbb{E}[X^TY] $$
I read in a talk that if we replace the expectation $\mathbb{E}$, taken with respect to the distribution of $(X,Y)$, by the empirical (joint) distribution $\frac{1}{n}\sum_{i=1}^{n}\delta_{(X_i,Y_i)}$ corresponding to our dataset, then we recover the Ordinary Least Squares regression problem.
Can someone tell me why this is the case?
Replacing $\mathbb{E}$ by the empirical distribution turns the population objective $\mathbb{E}[(Y-X\beta)^2]$ into the sample average $\frac{1}{n}\sum_{i=1}^n (y_i - x_i\beta)^2$, i.e. the empirical MSE: $$ \frac{1}{n} \sum_{i=1}^n \big( y_i - \widehat{\mathbb{E}[y_i]} \big)^2 = \frac{1}{n} \sum_{i=1}^n ( y_i - \hat{y}_i )^2, $$ where $\hat{y}_i = \hat{\beta}_0 + \sum_{j=1}^p\hat{\beta}_j x_{ij}$. The problem is therefore to find the $\hat{\beta}$s that minimize the empirical MSE. Now, note that you can multiply the objective by $n$: this is a monotone transformation, so it does not change the minimizer. Hence you get the problem $$ \arg \min_{\beta \in \mathbb{R}^{p+1}} \sum_{i=1}^n \Big(y_i-\beta_0 - \sum_{j=1}^p\beta_j x_{ij}\Big)^2, $$ which is exactly the OLS problem. (Here an intercept $\beta_0$ is written out explicitly, so $\beta \in \mathbb{R}^{p+1}$; absorbing it into $X$ via a constant column recovers the $\mathbb{R}^p$ formulation in the question.)
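A minimal numerical sketch of this equivalence, using numpy on synthetic data (the variable names and the simulated model are illustrative assumptions): forming the plug-in estimator by replacing $\mathbb{E}[X^TX]$ and $\mathbb{E}[X^TY]$ with their sample averages gives the same coefficients as solving the OLS least-squares problem directly, since the $\frac{1}{n}$ factors cancel.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 3

# Design matrix with a constant column, so the intercept is absorbed into X.
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
beta_true = np.array([1.0, 2.0, -0.5, 0.3])
y = X @ beta_true + rng.normal(scale=0.1, size=n)

# Plug-in estimator: replace E[X^T X] and E[X^T Y] by empirical averages.
Exx = X.T @ X / n   # empirical analogue of E[X^T X]
Exy = X.T @ y / n   # empirical analogue of E[X^T Y]
beta_plugin = np.linalg.solve(Exx, Exy)

# Direct OLS fit: minimize the sum of squared residuals.
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]

# The two coincide: the 1/n factors cancel in the normal equations.
print(np.allclose(beta_plugin, beta_ols))
```

The `np.linalg.solve` call implements $\widehat{\beta} = \mathbb{E}_n[X^TX]^{-1}\mathbb{E}_n[X^TY]$ with the empirical expectations, which is exactly the closed-form solution from the question evaluated under the empirical distribution.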