Given a vector $y \in \mathbb R^n$ and real constants $x_{ij}$ ($i=1,\dots,n$, $j=1,\dots,p$), we consider a vector $\beta = (\beta_0,\dots,\beta_p)$ that minimizes $$\sum_{i=1}^n\Bigl(y_i-\beta_0-\sum_{j=1}^p \beta_jx_{ij} \Bigr)^2 + \lambda\sum_{j=1}^p\beta_j^2.$$ Here $\lambda$ is a nonnegative parameter. (Thus $\beta$ depends on $\lambda$.)
It seems to be known that finding $\beta$ is equivalent to the following problem: $$\text{minimize}\ \ \sum_{i=1}^n\Bigl(y_i-\beta_0-\sum_{j=1}^p \beta_jx_{ij} \Bigr)^2 \quad\text{subject to}\quad \sum_{j=1}^p\beta_j^2 \leq s.$$
How can we prove the equivalence?
Notes
- I am NOT asking how to find $\beta$.
- As the notation suggests, I found this statement in the context of ridge regression.
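The claimed equivalence can at least be sanity-checked numerically (this is not a proof): solve the penalized problem for some $\lambda$, set $s = \sum_j \beta_j^2$ at that solution, and verify that no other feasible coefficient vector attains a smaller residual sum of squares. A minimal sketch with synthetic data (the data and $\lambda$ are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 3
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=n)

lam = 2.0
# Ridge with an unpenalized intercept: augment X with a column of ones
# and put a zero in the intercept slot of the penalty matrix.
Xa = np.column_stack([np.ones(n), X])
D = np.diag([0.0] + [1.0] * p)
beta = np.linalg.solve(Xa.T @ Xa + lam * D, Xa.T @ y)

s = np.sum(beta[1:] ** 2)                # constraint radius implied by lambda
rss_star = np.sum((y - Xa @ beta) ** 2)  # RSS at the ridge solution

# Sample feasible coefficient vectors (||b||^2 <= s) and check that none
# beats the ridge solution, even with its own best intercept.
for _ in range(10000):
    b = rng.normal(size=p)
    b *= rng.uniform() ** (1 / p) * np.sqrt(s) / np.linalg.norm(b)
    b0 = np.mean(y - X @ b)              # optimal intercept for this b
    rss = np.sum((y - b0 - X @ b) ** 2)
    assert rss >= rss_star - 1e-9
```

If the equivalence holds, the ridge solution for $\lambda$ is exactly the constrained minimizer for this particular $s$, so the assertion should never fire.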