For each $i \in \{1,\ldots,N\}$, let $x_i$ denote a point in $\mathbb{R}^2$ and let $y_i\in \{-1,+1\}$ be its label, where $y_i=+1$ means $x_i\in$ class A and $y_i=-1$ means $x_i\in$ class B.
Let $ L=\left\{x\in \mathbb{R}^2|x^\top \beta+\beta_0=0 \right\}$, where $\beta=(\beta_1,\beta_2)\in \mathbb{R}^2$ and $\beta_0\in \mathbb{R}$.
The aim is to find an optimal separating hyperplane $L$, that is, one that separates the two classes and maximizes the distance to the closest point from either class.
Consider the following formulations:
A: \begin{equation}\label{convex_form1} \begin{aligned} & \underset{\beta, \beta_0}{\text{maximize}} & & M\\ & \text{subject to} & & \dfrac{1}{||\beta||}y_i(\beta^\top x_i+\beta_0)\geq M,~\;\; \forall x_i\in X,~ y_i\in Y. \end{aligned} \end{equation}
B: \begin{equation}\label{convex_form2} \begin{aligned} & \underset{\beta, \beta_0}{\text{minimize}} & & ||\beta||\\ & \text{subject to} & & y_i(\beta^\top x_i+\beta_0)\geq 1,~\;\; \forall x_i\in X,~ y_i\in Y. \end{aligned} \end{equation}
C: \begin{equation}\label{convex_form3} \begin{aligned} & \underset{\beta, \beta_0}{\text{minimize}} & & \dfrac{1}{2}||\beta||^2\\ & \text{subject to} & & y_i(\beta^\top x_i+\beta_0)\geq 1,~\;\; \forall x_i\in X,~ y_i\in Y. \end{aligned} \end{equation}
Q1. How are formulations A, B, and C equivalent?
What I know: Formulation A comes from the idea that we want a separating hyperplane that is at least $M$ units away from every point, with $M$ as large as possible. Choosing $M=\dfrac{1}{||\beta||}$ gives Formulation B. But why is it legitimate to choose $M=\dfrac{1}{||\beta||}$, other than because it results in a convex optimization problem?
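To make the step from A to B concrete, here is a minimal numerical sketch. It rests on one observation: the constraint in A is unchanged when $(\beta,\beta_0)$ is replaced by $(c\beta, c\beta_0)$ for any $c>0$, so the scale of $\beta$ is free to fix. The toy data, the starting hyperplane, and the function names `geometric_margin` and `rescale_to_unit_functional_margin` are all made up for illustration, and the data are assumed to be linearly separable.

```python
import math

def geometric_margin(beta, beta0, points, labels):
    """Smallest value of y_i (beta^T x_i + beta0) / ||beta|| over all points."""
    norm = math.hypot(*beta)
    return min(
        y * (beta[0] * x[0] + beta[1] * x[1] + beta0) / norm
        for x, y in zip(points, labels)
    )

def rescale_to_unit_functional_margin(beta, beta0, points, labels):
    """Scale (beta, beta0) by c > 0 so that min_i y_i (beta^T x_i + beta0) = 1."""
    functional = min(
        y * (beta[0] * x[0] + beta[1] * x[1] + beta0)
        for x, y in zip(points, labels)
    )
    if functional <= 0:
        raise ValueError("this hyperplane does not separate the data")
    c = 1.0 / functional
    return (beta[0] * c, beta[1] * c), beta0 * c

# Hypothetical toy data: two points per class, linearly separable.
points = [(2.0, 2.0), (3.0, 1.0), (-1.0, -1.0), (-2.0, 0.0)]
labels = [+1, +1, -1, -1]

# An arbitrary separating hyperplane (not claimed to be the optimal one).
beta, beta0 = (3.0, 4.0), -1.0

# Margin M achieved by this hyperplane in the sense of Formulation A.
margin_before = geometric_margin(beta, beta0, points, labels)

# Same hyperplane, rescaled so the tightest constraint of B and C reads ">= 1".
beta_s, beta0_s = rescale_to_unit_functional_margin(beta, beta0, points, labels)
margin_after = geometric_margin(beta_s, beta0_s, points, labels)
inverse_norm = 1.0 / math.hypot(*beta_s)

print(margin_before, margin_after, inverse_norm)
```

The three printed numbers agree up to rounding. Rescaling does not move the hyperplane or change its margin, and after rescaling the margin equals $1/||\beta||$. So under this normalization maximizing $M$ is the same as minimizing $||\beta||$, which is B. Since $t\mapsto \frac{1}{2}t^2$ is increasing for $t\ge 0$, B and C have the same minimizers.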