The formulation of the SVM optimization problem is:
\begin{equation} \begin{aligned} & \max_{w,b} \frac{1}{\|w\|} \\ & \text{ subject to } \\ & y_i(w^{T}x_i+b) \geq 1 \end{aligned} \end{equation}
What I do not understand is why we use $w^Tx_i+b=1$ or $-1$ in the setup. My question is specifically: why 1? I understand that $w^Tx_i+b=0$ is the equation of a hyperplane, and that multiplying $w^Tx_i+b$ by the binary class labels $y_i \in \{-1,1\}$ gives the inequality, but why do we not initially use $w^Tx_i+b = 2$, or 0, or any other number? I am assuming we can adjust for this number since $b$ is a free parameter.
Thank you!
P.S.: I suspect this has already been asked as part of another question, but I cannot find it anywhere.
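It is not particularly important that the right-hand side is one, but it is important that the two classes are bounded by two different numbers (such as $1$ and $-1$) in order to induce a margin between them; using $0$ for both would leave no gap. As you note, "we can adjust for this number". For instance, $w^Tx_i + b = 2$ is equivalent to $[\frac 12 w]^T x_i + \frac 12 b = 1$ and to $w^Tx_i + (b - 1) = 1$. The rescaling is the adjustment that handles both sides at once: dividing $w$ and $b$ by $2$ turns the pair $\pm 2$ into $\pm 1$ without moving the hyperplane $w^Tx + b = 0$.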
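To make the rescaling argument concrete, here is a small numerical sketch. The toy data, the starting values of `w` and `b`, and the helper names `functional_margin` and `geometric_margin` are all made up for illustration; they are not part of any SVM library. The sketch starts from a separator whose closest points satisfy $y_i(w^Tx_i+b) = 2$, divides $w$ and $b$ by that value, and checks that the constraint becomes $\geq 1$ while the geometric margin stays the same.

```python
import numpy as np


def functional_margin(w, b, X, y):
    """Smallest value of y_i * (w^T x_i + b) over the data set."""
    return np.min(y * (X @ w + b))


def geometric_margin(w, b, X, y):
    """Smallest distance from any point to the hyperplane w^T x + b = 0."""
    return functional_margin(w, b, X, y) / np.linalg.norm(w)


# Hypothetical, linearly separable toy data (chosen only for illustration).
X = np.array([[2.0, 2.0], [3.0, 3.0], [0.0, 0.0], [-1.0, 0.0]])
y = np.array([1, 1, -1, -1])

# A separator whose closest points satisfy y_i * (w^T x_i + b) = 2.
w = np.array([1.0, 1.0])
b = -2.0

# Rescale so that the closest points satisfy y_i * (w^T x_i + b) = 1 instead.
c = functional_margin(w, b, X, y)
w_scaled = w / c
b_scaled = b / c

print("functional margin before:", functional_margin(w, b, X, y))
print("functional margin after: ", functional_margin(w_scaled, b_scaled, X, y))
print("geometric margin before: ", geometric_margin(w, b, X, y))
print("geometric margin after:  ", geometric_margin(w_scaled, b_scaled, X, y))
```

The functional margin changes with the scale of $(w, b)$, but the geometric margin does not, which is why fixing the right-hand side to $1$ costs nothing: it just picks one representative out of all the rescaled copies of the same hyperplane.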