Hard SVM (distance between point and hyperplane)

While studying the Hard-SVM topic in the Shalev-Shwartz book, I came across the following proof for the distance between a point and a hyperplane:

$$\min\{\|\pmb x-\pmb v\| : \langle\pmb w,\pmb v\rangle + b = 0\}$$

Taking $\pmb v = \pmb x - (\langle \pmb w, \pmb x \rangle + b)\pmb w$ we have that

$$\langle\pmb w,\pmb v\rangle + b = \langle\pmb w,\pmb x\rangle - (\langle \pmb w, \pmb x \rangle + b)\|\pmb w\|^2 + b = 0,$$

and

$$\|\pmb x-\pmb v\| = |\langle \pmb w,\pmb x \rangle + b|\,\|\pmb w\| = |\langle \pmb w,\pmb x\rangle + b|.$$

The above is a proof that the distance between a point $\pmb x$ and the hyperplane defined by $(\pmb w, b)$, where $\|\pmb w\|=1$, is $|\langle \pmb w, \pmb x \rangle+b|$.

I can derive the same result by taking a point on the plane, say $\pmb y$, and then taking the orthogonal projection of $\pmb x - \pmb y$ onto the normal vector of the plane, but I am not able to understand the proof provided in the book. I would greatly appreciate it if anyone could explain the above proof.

PS: I understand that the first line in the proof points towards finding a point $\pmb v$ on the plane such that the distance between $\pmb x$ and $\pmb v$ is minimized.

Thanks

There is 1 best solution below

The proof you provided is not complete. It's only the first part of it.

The distance between a point $\textbf{x}$ and a hyperplane $H$ defined by $(\textbf{w},b)$ is defined by:

$$ d(\textbf{x},H) = \min_{\textbf{v} \in H} \|\textbf{x}-\textbf{v}\| $$

That is, one is trying to find the point $\textbf{v}$ in the hyperplane that minimises the distance to the point $\textbf{x}$. The proof is done by taking any point $\textbf{u} \in H$ and showing that:

$$ \| \textbf{x}-\textbf{u} \| \geq \|\textbf{x}-\textbf{v}\| = |\langle \textbf{w}, \textbf{x} \rangle +b |, $$

where $\textbf{v} = \textbf{x} - (\langle \textbf{w}, \textbf{x} \rangle +b) \textbf{w}$.

That is, the point $\textbf{v}$ that we constructed lies in $H$ (this is the part quoted in the question) and is the point of $H$ that minimises the distance to $\textbf{x}$, and hence $\|\textbf{x} - \textbf{v}\|$ is the distance between the hyperplane and $\textbf{x}$.
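As a sketch of why the inequality holds (assuming $\|\textbf{w}\|=1$, as in the book; the arrangement of the steps here may differ from the book's presentation), write $\textbf{x}-\textbf{u} = (\textbf{x}-\textbf{v}) + (\textbf{v}-\textbf{u})$ for an arbitrary $\textbf{u} \in H$ and expand the squared norm:

$$ \|\textbf{x}-\textbf{u}\|^2 = \|\textbf{x}-\textbf{v}\|^2 + 2\langle \textbf{x}-\textbf{v}, \textbf{v}-\textbf{u} \rangle + \|\textbf{v}-\textbf{u}\|^2 $$

Since $\textbf{x}-\textbf{v} = (\langle \textbf{w}, \textbf{x} \rangle + b)\textbf{w}$ and both $\textbf{u}$ and $\textbf{v}$ satisfy $\langle \textbf{w}, \cdot \rangle = -b$, the cross term vanishes:

$$ \langle \textbf{x}-\textbf{v}, \textbf{v}-\textbf{u} \rangle = (\langle \textbf{w}, \textbf{x} \rangle + b)\,(\langle \textbf{w}, \textbf{v} \rangle - \langle \textbf{w}, \textbf{u} \rangle) = (\langle \textbf{w}, \textbf{x} \rangle + b)\,(-b + b) = 0 $$

Therefore

$$ \|\textbf{x}-\textbf{u}\|^2 = \|\textbf{x}-\textbf{v}\|^2 + \|\textbf{v}-\textbf{u}\|^2 \geq \|\textbf{x}-\textbf{v}\|^2, $$

with equality exactly when $\textbf{u} = \textbf{v}$.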

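Here I used the same notation as the book and left the detailed calculations to it. The construction of $\textbf{v}$ is based on vector addition. You can think of $\textbf{x}$ as the vector from the origin to the point $\textbf{x}$. The vector $(\langle \textbf{w}, \textbf{x} \rangle+b) \textbf{w}$ is parallel to the unit normal $\textbf{w}$, and it is exactly the displacement from the orthogonal projection of $\textbf{x}$ onto the plane to $\textbf{x}$ itself. Subtracting it from $\textbf{x}$ therefore lands on that projection, which is $\textbf{v}$. Hence, the distance we're looking for, i.e., the distance between $\textbf{x}$ and its orthogonal projection, is the norm of this displacement: $\|\textbf{x}-\textbf{v}\| = |\langle \textbf{w}, \textbf{x} \rangle+b|\,\|\textbf{w}\| = |\langle \textbf{w}, \textbf{x} \rangle+b|$.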
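If it helps to see the construction numerically, here is a minimal plain-Python sketch. The helper names (`dot`, `norm`, `project_onto_hyperplane`, `distance_to_hyperplane`) and the 2-D values of `w`, `b`, `x` are illustrative assumptions of mine, not taken from the book. It checks that $\textbf{v}$ lies on the hyperplane, that $\|\textbf{x}-\textbf{v}\|$ agrees with $|\langle \textbf{w}, \textbf{x} \rangle+b|$, and that a few other points of the hyperplane are no closer to $\textbf{x}$.

```python
import math

def dot(a, b):
    """Inner product <a, b> of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def norm(a):
    """Euclidean norm ||a||."""
    return math.sqrt(dot(a, a))

def project_onto_hyperplane(x, w, b):
    """Return v = x - (<w, x> + b) w; assumes ||w|| = 1."""
    s = dot(w, x) + b
    return [xi - s * wi for xi, wi in zip(x, w)]

def distance_to_hyperplane(x, w, b):
    """Return |<w, x> + b|; assumes ||w|| = 1."""
    return abs(dot(w, x) + b)

# Illustrative 2-D values (hypothetical, not from the book).
w = [0.6, 0.8]   # unit normal: 0.36 + 0.64 = 1
b = -1.0
x = [3.0, 4.0]

v = project_onto_hyperplane(x, w, b)
dist_xv = norm([xi - vi for xi, vi in zip(x, v)])

# v lies on the hyperplane, and ||x - v|| matches the closed form.
assert math.isclose(dot(w, v) + b, 0.0, abs_tol=1e-12)
assert math.isclose(dist_xv, distance_to_hyperplane(x, w, b))

# Any other point u = v + t * d on the hyperplane (d orthogonal to w)
# is at least as far from x as v is.
d = [-w[1], w[0]]
for t in [-2.0, -0.5, 0.0, 0.5, 2.0]:
    u = [vi + t * di for vi, di in zip(v, d)]
    assert math.isclose(dot(w, u) + b, 0.0, abs_tol=1e-12)
    assert norm([xi - ui for xi, ui in zip(x, u)]) >= dist_xv - 1e-12
```

This only spot-checks a handful of points, so it illustrates the claim rather than proving it. If $\|\textbf{w}\| \neq 1$, both helpers would need $\textbf{w}$ and $b$ divided by $\|\textbf{w}\|$ first.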