I have a problem: I do not know when it is crucial, and when it is not crucial, to assume a normal distribution in linear regression, for the estimates, $t$-tests, $F$-tests, confidence intervals, and prediction intervals.
Say we have $$ Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i $$
I know that we assume the errors are normally distributed and that this is crucial; otherwise, confidence and prediction intervals will be more or less wrong. For the estimates (the $\beta_i$'s) it should be the same, as the $\beta_i$'s are linear combinations of the $Y_i$'s, which are normally distributed, and therefore the $\beta_i$'s are normally distributed too. So normality is crucial for the tests as well as for the estimates.
Thank you in advance for any kind of help!
For something like a $t$-test, it is not always crucial that the data distributions are normal, because the real assumption is that the estimators of the two sample means are normally distributed, which will be approximately true if you have enough data points and the true data distributions aren't too wild (this is the central limit theorem). Similarly, for a linear regression the predicted output values are linear combinations of the inputs, so as long as your error model has mean 0 and isn't too wild, you have many input variables, and there are many coefficients at the largest order of magnitude (so no single term dominates the sum), the central limit theorem again says that the error distribution of the output variable will be fairly normal.
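To see this CLT argument at work, here is a small simulation sketch (all numbers here are hypothetical, not from the question): even with markedly skewed errors, the sampling distribution of the OLS slope estimate is nearly unbiased and nearly symmetric.

```python
import numpy as np

# Hypothetical setup: y = 1 + 2*x + eps, with shifted exponential
# errors that have mean 0 but are heavily skewed (far from normal).
rng = np.random.default_rng(0)
n, reps = 200, 5000
x = rng.uniform(0, 10, size=n)           # fixed design across replications

slopes = np.empty(reps)
for r in range(reps):
    eps = rng.exponential(1.0, size=n) - 1.0   # mean 0, skewness 2
    y = 1.0 + 2.0 * x + eps
    # OLS slope estimate: cov(x, y) / var(x)
    slopes[r] = np.cov(x, y, bias=True)[0, 1] / np.var(x)

# The replicated slope estimates cluster around the true slope and
# are nearly symmetric, even though each error term is very skewed.
z = (slopes - slopes.mean()) / slopes.std()
print(slopes.mean())        # close to the true slope 2
print(np.mean(z ** 3))      # empirical skewness, close to 0
```

The per-observation errors have skewness 2, but averaging over $n = 200$ observations washes most of that out, which is exactly the CLT effect described above.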
There are cases, however, where normality is crucial and cannot even approximately be ensured by the central limit theorem. One example is estimating confidence intervals for a variable given only its mean and standard deviation. Technically, the only guaranteed confidence intervals you can get in that setting come from Chebyshev's inequality, and these are much wider than the intervals you get by assuming a normal distribution. Depending on the variable's distribution, though, the Chebyshev intervals may be the best you can do (i.e., Chebyshev's inequality is tight in the worst case).
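To make the gap concrete, here is a sketch comparing the half-width multiplier $k$ in an interval of the form $\mu \pm k\sigma$: Chebyshev guarantees two-sided coverage $1-\alpha$ with $k = \sqrt{1/\alpha}$, while the normal assumption only needs the quantile $z_{1-\alpha/2}$.

```python
import math
from statistics import NormalDist

# Half-width multiplier k for an interval mu +/- k*sigma at a given
# two-sided coverage level 1 - alpha:
#   Chebyshev:  P(|X - mu| >= k*sigma) <= 1/k^2  =>  k = sqrt(1/alpha)
#   Normal:     k = z_{1 - alpha/2}  (standard normal quantile)
for coverage in (0.90, 0.95, 0.99):
    alpha = 1.0 - coverage
    k_cheb = math.sqrt(1.0 / alpha)
    k_norm = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    print(f"{coverage:.0%}: Chebyshev k = {k_cheb:.2f}, normal k = {k_norm:.2f}")
```

At 95% coverage, for instance, Chebyshev requires about 4.47 standard deviations versus 1.96 under normality, so the guaranteed interval is more than twice as wide.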