In an SVM, when no separating hyperplane exists, we introduce the soft-margin hinge loss $$ \min_{w,b} \|w\|_2^2 +c\sum_{i=1}^{n}\max\{0,1-y_i(w^Tx_i +b)\} $$ This is supposed to be equivalent to the optimization problem $$ \begin{aligned} \min_{w,b,\xi}\quad & \|w\|_2^2 +c\|\xi\|_2^2 \\ \text{s.t.}\quad & \xi_i\geq 0, \\ & \xi_i\geq 1-y_i(w^Tx_i +b), \quad i=1,\dots,n \end{aligned} $$
I'm trying to understand why these problems are equivalent. Let $\xi_i^2=\max\{0,1-y_i(w^Tx_i +b)\}$. Then, if $$ 1-y_i(w^Tx_i+b)\leq 0 \Rightarrow \xi_i^2=0 \Rightarrow \xi_i=0, $$ which gives $$ 1-y_i(w^Tx_i+b)\leq\xi_i $$
But how do we get the constraint $\xi_i\geq 0$? If we consider the case $$1-y_i(w^Tx_i+b) >0 \Rightarrow \xi_i^2 = 1-y_i(w^Tx_i+b)\Rightarrow \xi_i^2>0.$$ But that doesn't tell us anything about the sign of $\xi_i$. Any ideas?
Write $q_i = 1-y_i(w^Tx_i+b)$. Then $\xi_i\geq\max(0,q_i)$ is equivalent to the pair of constraints $\xi_i\geq 0$ and $\xi_i\geq q_i$. This is the standard way to model the maximum of two quantities. Because $\xi_i$ appears in a minimization objective that increases with it, it will be driven down to $q_i$ (when $q_i\geq 0$) or to $0$ (when $q_i\leq 0$), so at the optimum $\xi_i=\max(0,q_i)$, which is exactly what you need.
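To see this numerically, here is a minimal sketch (assuming NumPy, random illustrative data, and hypothetical helper names `margin_violation` and `optimal_slack` that are not part of the question). For a fixed $(w,b)$ it computes the smallest slack vector satisfying both constraints and compares the slack penalties with the corresponding hinge terms.

```python
import numpy as np

def margin_violation(w, b, X, y):
    """Return q_i = 1 - y_i (w^T x_i + b) for every sample."""
    return 1.0 - y * (X @ w + b)

def optimal_slack(q):
    """Smallest xi with xi >= 0 and xi >= q, taken elementwise."""
    return np.maximum(0.0, q)

# Illustrative random data (an assumption, not from the question).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = rng.choice([-1.0, 1.0], size=20)
w = rng.normal(size=3)
b = 0.5

q = margin_violation(w, b, X, y)
xi = optimal_slack(q)

# Hinge loss matches the 1-norm of the optimal slack ...
hinge = np.sum(np.maximum(0.0, q))
# ... and the squared hinge loss matches its squared 2-norm.
squared_hinge = np.sum(np.maximum(0.0, q) ** 2)

print(np.isclose(hinge, np.sum(xi)))
print(np.isclose(squared_hinge, xi @ xi))
```

Any other feasible slack vector is elementwise at least as large as `xi`, so it cannot give a smaller penalty under either norm.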
By the way, to make the two problems exactly equivalent, you should either square each $\max$ term in the first problem or replace $\|\xi\|_2^2$ with $\|\xi\|_1$ in the second.
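Written out, the two consistent pairings would look as follows, using $q_i = 1-y_i(w^Tx_i+b)$ as shorthand and assuming the constraints $\xi_i\geq 0$, $\xi_i\geq q_i$ in both slack formulations:

$$ \min_{w,b} \|w\|_2^2 + c\sum_{i=1}^{n}\max\{0,q_i\} \quad\Longleftrightarrow\quad \min_{w,b,\xi} \|w\|_2^2 + c\|\xi\|_1 $$

$$ \min_{w,b} \|w\|_2^2 + c\sum_{i=1}^{n}\max\{0,q_i\}^2 \quad\Longleftrightarrow\quad \min_{w,b,\xi} \|w\|_2^2 + c\|\xi\|_2^2 $$

In the first pairing $\|\xi\|_1=\sum_i \xi_i$ because $\xi_i\geq 0$, and in both cases the optimal slack is $\xi_i=\max\{0,q_i\}$.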