I am using regularized least squares, more specifically generalized Tikhonov regularization (ridge regression), on a real dataset where rows << cols:
$$X = (A^T A + \lambda I)^{-1} A^T b$$
I am implementing this in C by invoking LAPACK routines. To factor and solve the system, I use LU decomposition with partial pivoting via DGESV.
I am trying different values of the regularization coefficient, and for each one I compute the mean squared error (MSE) on the training set and on the test set.
Conceptually, as the regularization coefficient $\lambda$ gets smaller ($\lambda \to 0$), the training MSE should become small and approach zero, meaning the solution $X$ overfits the dataset.
I don't observe this behavior. For example, the MSE for $\lambda = 0.0001$ and $\lambda = 0.0$ is the same, and it is large ($MSE = 0.05$ on the training set and $MSE = 0.07$ on the test set).
Could anyone explain why I get the same MSE for different regularization coefficients $\lambda$? Could this be because of nonlinearity in the dataset?
Well, first I would note that, although I don't know the magnitude of the values in your dataset, the two $\lambda$s are very close indeed. If you look at the closed-form expression of the ridge estimator, it is not surprising that they induce almost identical MSEs: any difference likely only shows up several decimal places out.
Additionally, note that it is not necessarily true that the smaller $\lambda$ is, the smaller the test MSE. The choice of the optimal $\lambda$ rests on the bias-variance trade-off: the higher $\lambda$, the larger the bias and the smaller the variance. It is therefore entirely possible that, up to a certain point, increasing $\lambda$ induces a reduction in variance (via the shrinkage of the ridge coefficients) that outweighs the increase in bias, thereby reducing the test MSE, which, keep in mind, is an increasing function of both the variance and the squared bias.