Machine Learning: Solving a Minimization Problem


In a Machine Learning course, we had the following theorem:

Theorem 3.6: Let $A\in\mathbb R^{m\times n}$, $b\in \mathbb R^{m}$, $C\in \mathbb R^{p\times n}$ have independent rows and $d\in \text{im}\left(C\right)\subset\mathbb R^{p}$. Consider the minimization problem $$\min_{x\in\mathbb R^{n}}\left\vert\left\vert Ax-b\right\vert\right\vert_{2}^{2}\qquad \text{s.t.} \quad Cx = d. \qquad\qquad (1)$$

A vector $\hat{x}\in\mathbb R^{n}$ solves $(1)$ if and only if there exists $z\in\mathbb R^{p}$ s.t.

$$\begin{pmatrix} A^{T}A & C^{T} \\ C & 0 \end{pmatrix}\begin{pmatrix} \hat{x} \\ z \end{pmatrix} = \begin{pmatrix} A^{T}b \\ d \end{pmatrix}. \qquad\qquad (2)$$
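As a sanity check (not part of the proof), here is a small NumPy sketch that assembles system $(2)$ for random data and verifies that its solution is feasible and does at least as well as other feasible points; the dimensions and data are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, p = 6, 4, 2
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
C = rng.standard_normal((p, n))  # rows are independent with probability 1
d = rng.standard_normal(p)       # C is then surjective, so d lies in im(C)

# Assemble and solve the block system (2)
K = np.block([[A.T @ A, C.T],
              [C, np.zeros((p, p))]])
rhs = np.concatenate([A.T @ b, d])
sol = np.linalg.solve(K, rhs)
x_hat, z = sol[:n], sol[n:]

# Any other feasible point has the form x_hat + v with C v = 0;
# the last n - p right singular vectors of C span ker(C)
V = np.linalg.svd(C)[2][p:].T
v = V @ rng.standard_normal(n - p)

obj = lambda x: np.linalg.norm(A @ x - b) ** 2
```

Checking `np.allclose(C @ x_hat, d)` and `obj(x_hat) <= obj(x_hat + v)` then confirms numerically that the solution of $(2)$ is a feasible minimizer.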


As a homework, we're supposed to prove it. I already tried it myself, and I think I got "$\Rightarrow$" with the help of Lagrange multipliers.
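For context, the Lagrangian computation I have in mind (with multiplier $z\in\mathbb R^{p}$; the scaling of $z$ is a matter of convention) is

$$\mathcal{L}\left(x,z\right)=\left\vert\left\vert Ax-b\right\vert\right\vert_{2}^{2}+z^{T}\left(Cx-d\right),$$

whose stationarity condition $\nabla_{x}\mathcal{L}=2A^{T}Ax-2A^{T}b+C^{T}z=0$ gives, after rescaling $z$, the first block row of $(2)$, while feasibility $Cx=d$ gives the second.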

Concerning the other direction "$\Leftarrow$": I am not really sure how to get started; could anybody please provide a hint?