Regression: how does the assumption of independent noise relate to the real noise?


Let us first consider a ground-truth: $$y_i= \theta_0 x_i + e_i,$$ where $\theta_0 \not= 0$ is a constant and $e_i$ is an independent random noise over $i$. Suppose that I can obtain measurements of $y_i$ and $x_i$ over $i$.

Then to estimate the unknown parameter $\theta_0$, we typically consider a linear regression model of the above ground-truth: \begin{equation} y_i= \theta x_i + \epsilon_i \end{equation} where $\theta$ is a to-be-determined parameter and $\epsilon_i$ is assumed to be independent over $i$. Let us say that $\theta \in \Theta \subseteq \mathbb{R}$, which is the range where we search for an estimate $\hat{\theta}$ of $\theta_0$.
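As a concrete illustration of this setup (a minimal numerical sketch with simulated data; the true value $\theta_0 = 2$ and the noise level are arbitrary choices, not given in the question), the least-squares estimate for the no-intercept model recovers $\theta_0$ from measurements of $x_i$ and $y_i$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated ground truth: y_i = theta_0 * x_i + e_i with i.i.d. noise e_i.
theta_0 = 2.0  # hypothetical true parameter
n = 1000
x = rng.uniform(1.0, 3.0, size=n)
e = rng.normal(0.0, 0.5, size=n)
y = theta_0 * x + e

# Least-squares estimate for the no-intercept model y = theta * x:
# theta_hat = sum(x_i * y_i) / sum(x_i^2)
theta_hat = np.sum(x * y) / np.sum(x * x)
print(theta_hat)  # close to theta_0 = 2.0
```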

Now I am a bit confused by the above assumption of independence on $\epsilon_i$ in the regression model.

By this assumption, are we assuming that $\epsilon_i$ is independent over $i$ for any $\theta \in \Theta$?

Or are we assuming only that, if $\theta_0 \in \Theta$, then the choice $\theta = \theta_0$ makes $\epsilon_i = e_i$, which is an independent variable over $i$?

Then what if the model I choose does not include the ground truth (suppose I do not know the right model structure)? For example, consider the model $$y_i = \theta + \epsilon_i,$$ where $\theta$ is a to-be-estimated parameter. Can I still assume that $\epsilon_i$ is independent over $i$ in this model? (This is probably a bad example, but we can well have cases where the ground truth is not in the model set.)
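To make the misspecification concrete (again a sketch with simulated data and a hypothetical $\theta_0 = 2$): if the data come from $y_i = \theta_0 x_i + e_i$ but we fit the intercept-only model $y_i = \theta + \epsilon_i$, the fitted "errors" absorb the omitted $\theta_0 x_i$ term and are visibly not the true noise $e_i$:

```python
import numpy as np

rng = np.random.default_rng(1)

# Data generated from y_i = theta_0 * x_i + e_i, but fitted with the
# misspecified intercept-only model y_i = theta + eps_i.
theta_0 = 2.0  # hypothetical true parameter
n = 1000
x = rng.uniform(1.0, 3.0, size=n)
e = rng.normal(0.0, 0.5, size=n)
y = theta_0 * x + e

# Least squares for y = theta + eps gives theta_hat = mean(y).
theta_hat = y.mean()
residuals = y - theta_hat

# The residuals eps_i = theta_0 * x_i + e_i - theta_hat still contain the
# omitted theta_0 * x_i term, so they are strongly correlated with x:
# the model's "noise" is not the true noise e_i.
corr = np.corrcoef(x, residuals)[0, 1]
print(corr)  # large positive correlation driven by the omitted x term
```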

1 Answer:
I'm not sure what kind of notation you're using, but it differs from the standard one nonetheless.

Typical theory books construct the model as such: $y=\beta_0+\beta_1x+\epsilon$

However, the regression model assumes that the 'noise', or random error $\epsilon$, is independently and normally distributed. This means that the errors of any two different observations are independent; that is, the error associated with $y_i$ is uncorrelated with the error associated with $y_k$ for $i \ne k$.

The value of $\theta$ has no influence on the independence of the errors, and vice versa: there are no values of $\theta$ that make the $\epsilon_i$ dependent.

In practice, it is hard to tell whether the errors are independent; dependence usually arises in time-series data. You can, however, check normality by constructing a stem-and-leaf display or a histogram of the residuals. If the errors do not appear normally distributed, you can apply a transformation to make them closer to normal.
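Alongside a histogram of the residuals, one simple numerical diagnostic (my own suggestion here, not part of the answer above) is the lag-1 sample autocorrelation of the residual sequence, which should be near zero when the errors are independent; again with simulated data and a hypothetical $\theta_0 = 2$:

```python
import numpy as np

rng = np.random.default_rng(2)

# Residuals from a correctly specified fit should look like i.i.d. noise.
theta_0 = 2.0  # hypothetical true parameter
n = 2000
x = rng.uniform(1.0, 3.0, size=n)
e = rng.normal(0.0, 0.5, size=n)
y = theta_0 * x + e

# Fit the no-intercept model and form residuals.
theta_hat = np.sum(x * y) / np.sum(x * x)
residuals = y - theta_hat * x

# Lag-1 sample autocorrelation: near 0 for independent errors, far from 0
# when consecutive errors are dependent (as in many time series).
r = residuals - residuals.mean()
lag1 = np.sum(r[:-1] * r[1:]) / np.sum(r * r)
print(lag1)  # near 0 for independent errors
```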

Lastly, in the model you gave, $y = \theta + \epsilon$, I do not see much purpose in performing regression when no $x$ is given. In fact, there are infinitely many possibilities for $\theta$ if there is no $x$, since any combination with $\theta + \epsilon = y$ fits.