Decision surface in linear classification


I have several questions regarding the following definition of a linear hyperplane for classification:

We define our classifier $F$ as follows:

$$F(x) = \text{sign}(\langle w,x\rangle +b) \in \{1,-1\}$$

where

$$\text{sign}(z) = \begin{cases} 1&z \geq 0\\ -1&z < 0 \end{cases}$$
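As a minimal sketch (the vectors and values below are made-up examples, not from the question), the classifier $F$ can be written directly from these two definitions:

```python
import numpy as np

def sign(z):
    # sign convention from the definition above: sign(0) = +1
    return 1 if z >= 0 else -1

def F(w, x, b):
    # linear classifier F(x) = sign(<w, x> + b) in {1, -1}
    return sign(np.dot(w, x) + b)

# Example hyperplane x1 + x2 = 1, i.e. w = (1, 1), b = -1
w, b = np.array([1.0, 1.0]), -1.0
print(F(w, np.array([1.0, 1.0]), b))  # 1 + 1 - 1 =  1 >= 0 -> +1
print(F(w, np.array([0.0, 0.0]), b))  # 0 + 0 - 1 = -1 <  0 -> -1
```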

My questions:

  1. How come the threshold can be assumed to be $0$? Do we have to set constraints on the weight vector $w$ and bias $b$ so that this is fulfilled? I don't see why, for different problem sets, it couldn't be any number in $\mathbb{R}$.
  2. As far as I know, $\langle w,x\rangle+b = 0$ defines a plane only if $w$ is a normal vector... again, how come $w$ is normal? Is this another constraint we set during optimization?
  3. In our lecture notes, it is mentioned that $|\langle w,x\rangle+b|$ is the distance of the vector $x$ from the hyperplane $\langle w,x\rangle+b = 0$. How so?

Many thanks

  1. Because of the bias term. The vector $w$ determines the orientation of the hyperplane (it is the normal vector to the plane), while the bias $b$ determines its position. Consider a simple 1D case: every point greater than 10 is labeled $+1$; all others $-1$. The bias simply moves the threshold to 10: with $w = 1$ and $b = -10$, the condition $wx + b \geq 0$ is exactly $x \geq 10$. In every (linearly separable) case one can fix the threshold at $0$ and let $b$ make up for it. (Alternatively, one could vary the threshold and force $b$ to be $0$, but the result would be the same.)

  2. It's the other way around; look at the definition of a hyperplane. Any equation $\vec{w}\cdot\vec{x}+b=0$ defines a hyperplane. The classical point-normal form of a plane is $\vec{n}\cdot(\vec{x}-\vec{p})=0$, where $\vec{n}$ is the normal to the plane and $\vec{p}$ is a point on it. Comparing the two forms shows that $\vec{w}$ plays the role of $\vec{n}$: $w$ is the normal by definition, not because of an extra constraint.
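One way to see that $w$ is normal to the plane: any direction lying *in* the hyperplane (the difference of two points satisfying the equation) is orthogonal to $w$. A small numeric check, with made-up values for $w$ and $b$:

```python
import numpy as np

w, b = np.array([3.0, 4.0]), -5.0

# Two points satisfying <w, x> + b = 0, i.e. 3*x1 + 4*x2 = 5
p = np.array([1.0, 0.5])   # 3*1 + 4*0.5 = 5
q = np.array([-1.0, 2.0])  # 3*(-1) + 4*2 = 5

# The direction p - q lies in the hyperplane, so it is orthogonal to w
print(np.dot(w, p - q))  # 0.0
```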

  3. This is only correct if $w$ is a unit vector. The unsigned distance from a point $p$ to the hyperplane $\langle w,x\rangle+b=0$ is $$ d(p) = \frac{|\langle w,p\rangle + b|}{\lVert w\rVert_2}, $$ so $|\langle w,p\rangle + b|$ equals the distance only when $\lVert w\rVert_2 = 1$. This normalization is not assumed in general (though it sometimes is).
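A quick check of the distance formula, with an example $w$ and $b$ chosen here so that $\lVert w\rVert_2 = 5$: the raw value $|\langle w,p\rangle + b|$ overstates the distance by that factor, and dividing $w$ and $b$ by $\lVert w\rVert_2$ (which leaves the hyperplane unchanged) makes the two agree.

```python
import numpy as np

def distance(w, b, p):
    # unsigned point-to-hyperplane distance: |<w, p> + b| / ||w||_2
    return abs(np.dot(w, p) + b) / np.linalg.norm(w)

w, b = np.array([3.0, 4.0]), -5.0   # ||w||_2 = 5, not a unit vector
p = np.array([4.0, 3.0])

raw = abs(np.dot(w, p) + b)          # |24 - 5| = 19, NOT the distance
print(raw, distance(w, b, p))        # 19.0 vs 3.8

# After normalizing, |<w, p> + b| really is the distance
norm = np.linalg.norm(w)
w_u, b_u = w / norm, b / norm
print(abs(np.dot(w_u, p) + b_u))     # 3.8
```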

Also, consider looking at descriptions of the perceptron, which is what you are concerned with.