How does the probabilistic interpretation of least squares for linear regression work?


Let us assume that the target variables and the inputs are related via the equation:

$y^{(i)}=w^Tx^{(i)} + e^{(i)}$

where $e^{(i)}$ is an error term that captures either unmodeled effects or random noise. Let us further assume that the $e^{(i)}$ are distributed IID according to a Gaussian distribution with mean zero and some variance $\sigma^2$. Thus:

$p(e^{(i)})=\frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\frac{(e^{(i)})^2}{2\sigma^2}\right)$

My question is: how does this imply that the conditional density of $y^{(i)}$ given $x^{(i)}$, parameterized by $w$, is the following:

$p(y^{(i)}|x^{(i)}, w)=\frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\frac{(y^{(i)} - w^Tx^{(i)})^2}{2\sigma^2}\right)$
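For concreteness, here is a minimal simulation of the assumed model (the values of $w$, $\sigma$, and the sample size are just illustrative placeholders): the residual $y^{(i)} - w^Tx^{(i)}$ is exactly the noise term $e^{(i)}$, so its empirical distribution should look like $\mathcal{N}(0, \sigma^2)$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (assumed) parameters: true weights and noise standard deviation.
w_true = np.array([2.0, -1.0])
sigma = 0.5
n = 10_000

# Inputs x^(i) and IID Gaussian errors e^(i) ~ N(0, sigma^2).
X = rng.normal(size=(n, 2))
e = rng.normal(loc=0.0, scale=sigma, size=n)

# Targets generated exactly as in the model: y^(i) = w^T x^(i) + e^(i).
y = X @ w_true + e

# The residual y^(i) - w^T x^(i) is just e^(i), so it should look N(0, sigma^2).
residuals = y - X @ w_true
print(residuals.mean(), residuals.std())  # approximately 0 and sigma
```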


You are viewing the $w_j$, $j=1,\dots,p$, as unknown constants and $\mathrm{x}_i$ as given, so denote $(X_1 = x_1,\dots,X_p = x_p) = X$. Then
$$ \mathbb{E}[y_i|X] = \mathbb{E}[w^T\mathrm{x}_i + e_i|X]= w^T\mathrm{x}_i + \mathbb{E}[e_i|X] = w^T\mathrm{x}_i. $$
The same goes for the variance, i.e.,
$$ \operatorname{Var}[y_i|X] = \operatorname{Var}[w^T\mathrm{x}_i + e_i|X]= \operatorname{Var}[e_i|X] = \sigma^2. $$
Moreover, given $X$, $y_i$ is the constant $w^T\mathrm{x}_i$ plus the normal r.v. $e_i$, and a shift of a normal r.v. is still normal, thus
$$ y_i|X \sim \mathcal{N}(w^T\mathrm{x}_i, \sigma^2). $$
Writing out the density of this normal distribution gives exactly
$$ p(y_i|\mathrm{x}_i, w) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\frac{(y_i - w^T\mathrm{x}_i)^2}{2\sigma^2}\right). $$
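As a sanity check, here is a small sketch (again with illustrative, assumed values for $w$ and $\sigma$) showing that maximizing this Gaussian likelihood over $w$ gives essentially the same estimate as ordinary least squares, which is the point of the probabilistic interpretation.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)

# Assumed generative model: y_i = w^T x_i + e_i with e_i ~ N(0, sigma^2).
w_true = np.array([2.0, -1.0])
sigma = 0.5
n = 500
X = rng.normal(size=(n, 2))
y = X @ w_true + rng.normal(scale=sigma, size=n)

def neg_log_likelihood(w):
    # Negative log of prod_i N(y_i; w^T x_i, sigma^2).
    r = y - X @ w
    return 0.5 * np.sum(r**2) / sigma**2 + n * np.log(np.sqrt(2 * np.pi) * sigma)

w_mle = minimize(neg_log_likelihood, x0=np.zeros(2)).x
w_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

print(w_mle)  # maximum-likelihood estimate
print(w_ols)  # least-squares estimate, essentially identical
```

Note that $\sigma$ only rescales and shifts the negative log-likelihood, so the maximizer in $w$ does not depend on it; minimizing the negative log-likelihood in $w$ is the same as minimizing $\sum_i (y_i - w^T\mathrm{x}_i)^2$.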