Bayes interpretation of regularization in linear regression

138 Views Asked by Bumbble Comm At 11 May 2026 - 2:42

I am deriving L2 regularization from Bayes' theorem. In doing so, I came across an article stating that the parameter $\theta$ is assumed to follow a normal distribution with mean 0. Why is this assumption made when a uniform distribution seems more natural?

Bumbble Comm On 14 Jul 2020 - 12:58 BEST ANSWER

A uniform prior can seem more natural because it treats all possible values of a parameter as equally probable. However, that is exactly what it means to have no "prior" information about the parameter you want to estimate: "not knowing anything about something" is modeled as "all outcomes are equally probable". Moreover, with a uniform prior the term $\log P(\theta)$ is constant in $\theta$, so it has no effect on the optimization and drops out. In that case the maximum a posteriori (MAP) estimate is equal to the maximum likelihood estimate (MLE), and nothing new is actually done.
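To spell that step out, here is a short sketch of the MAP objective. The symbol $D$ for the observed data and the constant $c$ are my notation, not the question's:

$$\hat{\theta}_{\text{MAP}} = \arg\max_{\theta} P(\theta \mid D) = \arg\max_{\theta} \left[ \log P(D \mid \theta) + \log P(\theta) \right]$$

If the prior is uniform, $P(\theta) = c$ on the allowed range, then $\log P(\theta) = \log c$ does not depend on $\theta$ and

$$\hat{\theta}_{\text{MAP}} = \arg\max_{\theta} \log P(D \mid \theta) = \hat{\theta}_{\text{MLE}}.$$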

We use a prior other than the uniform when we have some information about the parameters, or when we simply want to "impose" some distribution on them. In the case of L2 regularization, we want the parameters to be "small", so we choose a normal distribution centered at zero. This makes values close to zero the most probable (the mean of a normal distribution is also its mode), and thus we obtain smaller parameter values after estimation.
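As a rough numerical illustration, the sketch below assumes a linear model with Gaussian noise of known standard deviation `sigma` and an independent zero-mean normal prior with standard deviation `tau` on each parameter. Under those assumptions the negative log posterior is, up to a constant, $\frac{1}{2\sigma^2}\|y - X\theta\|^2 + \frac{1}{2\tau^2}\|\theta\|^2$, which is the L2-regularized least-squares objective with $\lambda = \sigma^2/\tau^2$. The data, the variable names, and the values of `sigma` and `tau` are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (all values here are illustrative assumptions).
n, d = 50, 3
sigma = 1.0  # noise standard deviation, assumed known
tau = 0.5    # standard deviation of the zero-mean normal prior on each parameter

X = rng.normal(size=(n, d))
theta_true = np.array([2.0, -1.0, 0.5])
y = X @ theta_true + sigma * rng.normal(size=n)


def neg_log_posterior_grad(theta):
    """Gradient of ||y - X theta||^2 / (2 sigma^2) + ||theta||^2 / (2 tau^2)."""
    return -(X.T @ (y - X @ theta)) / sigma**2 + theta / tau**2


# MLE: ordinary least squares (equivalently, MAP under a flat prior).
theta_mle = np.linalg.solve(X.T @ X, X.T @ y)

# MAP under the normal prior: ridge solution with lambda = sigma^2 / tau^2.
lam = sigma**2 / tau**2
theta_map = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# The ridge solution is a stationary point of the negative log posterior.
grad_norm = np.linalg.norm(neg_log_posterior_grad(theta_map))

print("MLE :", theta_mle, "norm:", np.linalg.norm(theta_mle))
print("MAP :", theta_map, "norm:", np.linalg.norm(theta_map))
print("gradient norm at MAP:", grad_norm)
```

The MAP estimate has a smaller norm than the MLE, and making `tau` smaller (a tighter prior around zero) increases `lam` and shrinks the parameters further.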
