How to prove the equivalence of these two optimization problems?

How can I prove that, for $X\in\mathbb{R}^{n\times p}$ and $y \in \mathbb{R}^n$, the problem \begin{align} \min_{\beta\in\mathbb{R}^p}\|y-X\beta\|_2^2 \end{align} is equivalent to the problem \begin{align*} \min_{v\in\mathbb{R}^n} \|y-v\|_2^2 \text{ subject to } X^Tv = 0 \end{align*}?

Here is what I have attempted so far. Letting $z = X\beta$, the original problem becomes

\begin{align*} \min_{\beta\in\mathbb{R}^p,\ z\in\mathbb{R}^n}\|y-z\|_2^2 \text{ subject to } z=X\beta \end{align*}

so the Lagrangian of this problem, with multiplier $v\in\mathbb{R}^n$, is $L(\beta,z,v)=\|y-z\|_2^2+v^T(X\beta-z)$.

By the KKT stationarity conditions,

$$\begin{aligned} \nabla_\beta L(\beta, z,v)&=X^Tv=0,\\ \nabla_z L(\beta, z,v)&=-2(y-z)-v=0,\\ \text{so}\quad z&=y+\frac{v}{2}. \end{aligned}$$

I then obtain $$\min_{v}v^T\left(\frac{v}{4}+y\right) \text{ subject to } X^Tv=0.$$

Where does my derivation go wrong?

There is 1 best solution below

In general, these two problems are not equivalent. Take, for instance, $n=p$ and $X$ invertible. Then \begin{align} \min_{\beta\in\mathbb{R}^p}\|y-X\beta\|_2^2=0, \end{align} since $\beta=X^{-1}y$ gives a zero residual, while $X^Tv=0$ forces $v=0$, so \begin{align*} \min\{ \|y-v\|_2^2: v\in\mathbb{R}^n,\ X^Tv = 0\}=\|y\|^2_2, \end{align*} which is nonzero whenever $y\neq 0$.
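
As a quick numerical sanity check of this counterexample, here is a minimal NumPy sketch. The matrix `X`, the vector `y`, and the helper names `least_squares_value` and `constrained_value` are illustrative choices of mine, not part of the question. The constrained problem is solved by projecting $y$ onto the null space of $X^T$, which is the orthogonal complement of the column space of $X$.

```python
import numpy as np


def least_squares_value(X, y):
    """Optimal value of min_beta ||y - X beta||_2^2."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.sum((y - X @ beta) ** 2))


def constrained_value(X, y):
    """Optimal value of min_v ||y - v||_2^2 subject to X^T v = 0."""
    # The feasible set {v : X^T v = 0} is the orthogonal complement of col(X),
    # so the closest feasible point to y is y minus its projection onto col(X).
    P = X @ np.linalg.pinv(X)  # orthogonal projector onto col(X)
    v = y - P @ y
    return float(np.sum((y - v) ** 2))


# Illustrative data (an assumption): a 3x3 invertible X (det = 7) and some y.
X = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 3.0]])
y = np.array([1.0, -2.0, 3.0])

p_square = least_squares_value(X, y)   # close to 0
c_square = constrained_value(X, y)     # close to ||y||^2
print(p_square, c_square, float(y @ y))

# A tall X (first two columns only): the two optimal values still differ.
X_tall = X[:, :2]
p_tall = least_squares_value(X_tall, y)
c_tall = constrained_value(X_tall, y)
print(p_tall, c_tall, p_tall + c_tall)
```

In the tall case the two optimal values add up to $\|y\|_2^2$, because $y$ splits orthogonally into its projection onto the column space of $X$ and the residual. So the two problems are related, but their optimal values coincide only in special cases.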