Regularization (Bayesian approach with MAP estimate)


As you know, the regularization problem is as follows:

minimize $E_{in}$ (the sample error) plus $\frac{\lambda}{n}\lVert\theta\rVert$. Thus, when $\lambda \to \infty$, $\lVert\theta\rVert$ approaches zero. However, given that the prior distribution of the parameters $p(\theta) \sim N(0, b^{2} I)$ acts as a regularizer/bias, the MAP estimate tells us the opposite, as I understand it. As $b^{2}$ or $\lambda$ approaches infinity, $\theta$ becomes uniformly distributed, which does not restrict $\theta$ to be around the zero mean (an unbiased choice of $b^{2}$); the regularization term in the MAP estimate diminishes, and this leaves us with the MLE estimate, which we know causes overfitting. However, in our case (where $b^{2}$ approaches infinity), we expect $\lVert\theta\rVert = 0$. That means we encounter underfitting, which contradicts the fact that MLE causes overfitting! What am I missing here?


There is 1 answer below.


For L2 regularisation, the prior distribution of the parameters is $P(\theta) \sim N(0,\tau^{2}I)$, where $\tau=b$ from your question. As you can see in [1], this leads to $\lambda=(\frac{\sigma}{\tau})^{2}=(\frac{\sigma}{b})^{2}$ (where $\lambda$ is as defined by your formulation of the cost).
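To make this concrete, here is a sketch of the derivation, under the same assumptions as [1]: a Gaussian likelihood $y_i \sim N(\theta^{\top}x_i, \sigma^{2})$ and the prior $\theta \sim N(0, \tau^{2}I)$:

$$\hat{\theta}_{\mathrm{MAP}} = \arg\max_{\theta}\ \log P(y \mid X, \theta) + \log P(\theta) = \arg\min_{\theta}\ \frac{1}{2\sigma^{2}}\sum_{i=1}^{n}(y_i - \theta^{\top}x_i)^{2} + \frac{1}{2\tau^{2}}\lVert\theta\rVert^{2}.$$

Multiplying the objective by $2\sigma^{2}$ (which does not change the minimizer) turns the coefficient on $\lVert\theta\rVert^{2}$ into $(\frac{\sigma}{\tau})^{2}$, which is the $\lambda$ above.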

The inverse relationship between $\lambda$ and $b^{2}$ now makes your inferences about MLE, MAP, and over/underfitting consistent. $b^{2}$ is the variance of the prior, not $\lambda$ itself: as $b^{2}$ approaches infinity, $\lambda \to 0$ and the prior over $\theta$ becomes flat (uniform), leaving the MLE estimate; conversely, as $b^{2} \to 0$, $\lambda \to \infty$, the prior concentrates at zero, and $\lVert\theta\rVert$ shrinks toward zero.
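A small numerical sketch of this inverse relationship, using the closed-form ridge (MAP) estimate $\hat{\theta} = (X^{\top}X + \lambda I)^{-1}X^{\top}y$. All names and the simulated data here are illustrative, not from the question:

```python
import numpy as np

# Simulate a linear model y = X @ theta_true + noise.
rng = np.random.default_rng(0)
n, d = 100, 3
X = rng.normal(size=(n, d))
theta_true = np.array([2.0, -1.0, 0.5])
sigma = 0.1
y = X @ theta_true + sigma * rng.normal(size=n)

def ridge(X, y, lam):
    """MAP estimate under a N(0, b^2 I) prior, with lam = (sigma / b)^2."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# b^2 -> infinity means lam -> 0: the prior is flat and we recover the MLE (OLS).
theta_mle = ridge(X, y, 0.0)

# b^2 -> 0 means lam -> infinity: the prior dominates and theta shrinks to zero.
theta_shrunk = ridge(X, y, 1e6)

print(np.linalg.norm(theta_mle))     # close to the norm of theta_true
print(np.linalg.norm(theta_shrunk))  # near zero
```

So a large prior variance $b^{2}$ corresponds to weak regularization (MLE, risk of overfitting), while a small $b^{2}$ corresponds to strong regularization (shrinkage toward zero, risk of underfitting), which resolves the apparent contradiction in the question.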

[1] https://www.cs.cornell.edu/courses/cs4780/2018sp/lectures/lecturenote08.html