Why does the perceptron allow us to separate our feature space into two convex half spaces?


The perceptron's decision boundary is $w \cdot x = w^T x = 0$.

I understand that we use the dot product to understand if 2 vectors are aligned or not (the dot product is positive if 2 vectors point in the same direction, and negative if they do not).

However, where I get lost is the connection between the intuition behind the dot product and the fact that the perceptron allows us to separate the feature space into two convex half spaces.

In this post, I read in an answer that the dot product means that "for any two distinct sets of input feature vectors in a vector space (say we are classifying if a leaf is healthy or not based on certain features of the leaf), we can have a weight vector, whose dot product with one input feature vector of the set of input vectors of a certain class (say leaf is healthy) is positive and with the other set is negative. In essence, we are using the weight vectors to split the hyper-plane into two distinctive sets."

However, why does the dot product lead to the intuition in the quoted text? In other words, how does the fact that the dot product tells us whether 2 vectors are aligned or not lead us to separate data into 2 classes?


There is 1 best solution below


Let $w$ be fixed. You have already done the learning stage and now are using it.

You have the question "is $w \cdot x \geq 0$?". For any test point $x$, you can ask that question and get a yes/no answer. If the answer is yes, $x$ is aligned with $w$ and you put it in the set marked healthy leaf. Otherwise you put it in the set marked unhealthy leaf. What matters right now is that you have a yes/no question you can ask for every $x$ in the feature space.

This happens for all possible test points in the entire feature space. So you know that if $x_0$ corresponds to the features of a healthy leaf, then it is an element of the set $\{ x \in \mathbb{R}^n \mid w \cdot x \geq 0\}$. Call that set $H$. This is because $w \cdot x_0 \geq 0$ is exactly what it means for $x_0$ to be classified as healthy.

Similarly, if it was for an unhealthy leaf, you would know that it was in the set $\{ x \in \mathbb{R}^n \mid w \cdot x < 0\}$. Call that set $U$. These two disjoint sets cover the entire feature space: $H \cap U = \varnothing$ and $H \cup U = \mathbb{R}^n$. They are the two convex half spaces that separate the feature space.
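To make the partition concrete, here is a minimal sketch in Python with NumPy. The weight vector `w`, the helper name `is_healthy`, and the random test points are all made up for illustration; nothing here comes from a trained model.

```python
import numpy as np

# Hypothetical weight vector, assumed to be already learned.
w = np.array([2.0, -1.0])

def is_healthy(x, w=w):
    """Return True when x lies in H = {x : w . x >= 0}, False when x lies in U."""
    return bool(np.dot(w, x) >= 0.0)

# Ask the yes/no question for many test points in the feature space.
rng = np.random.default_rng(0)
points = rng.normal(size=(1000, 2))

in_H = np.array([is_healthy(x) for x in points])
in_U = ~in_H  # every point that is not in H is in U

# No point is in both sets, and every point is in exactly one of them.
print(np.any(in_H & in_U))        # False
print(in_H.sum() + in_U.sum())    # 1000
```

Because the test is a single yes/no question, disjointness and coverage hold automatically; the dot product only determines the shape of the two sets.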

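For the splitting itself, it didn't really matter that the question was given by a dot product: any yes/no question would divide the feature space into two disjoint sets. The dot product is linear in $x$, and that is what makes the resulting sets convex half spaces rather than some other messier subsets of the feature space.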
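The convexity can be checked directly from linearity. As a short sketch, take any $x, y \in H$ and any $t \in [0, 1]$ (the names $y$ and $t$ are introduced here only for this argument):

$$
w \cdot \bigl(t x + (1 - t) y\bigr) = t \, (w \cdot x) + (1 - t) \, (w \cdot y) \geq 0,
$$

since both terms on the right are non-negative. So the whole segment between $x$ and $y$ stays in $H$. The same computation with $x, y \in U$ gives a strictly negative value, because $w \cdot x < 0$, $w \cdot y < 0$, and $t$ and $1 - t$ are non-negative and sum to $1$. So $U$ is convex as well.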

You need an offset $b$ too, as in the linked question, so that the test becomes $w \cdot x + b \geq 0$. Otherwise the boundary between healthy and unhealthy would always pass through $x=0$.
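The effect of the offset can be sketched as follows. The values of `w` and `b` and the function name `classify` are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import numpy as np

# Hypothetical parameters, assumed to be already learned.
w = np.array([1.0, 1.0])
b = -1.0

def classify(x, w=w, b=b):
    """Label x by the sign of w . x + b (the boundary itself counts as healthy)."""
    return "healthy" if np.dot(w, x) + b >= 0 else "unhealthy"

# Without b the origin would sit on the boundary, since w . 0 = 0.
# With b = -1 the boundary is the line x_1 + x_2 = 1, which misses the origin.
print(classify(np.zeros(2)))            # unhealthy: 0 + 0 - 1 < 0
print(classify(np.array([1.0, 1.0])))   # healthy:   1 + 1 - 1 >= 0
```

The two sets are still half spaces, and still convex; the offset only shifts the boundary hyperplane away from the origin.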