Is a hyper-plane uniquely defined by a single normal vector?


So I was always taught that a plane has two directions that are normal, or perpendicular, to it. However, upon reading this comment, in dimensions higher than $\mathbb{R}^3$ there can apparently be more than one vector normal to a hyperplane.

This got me thinking: suppose our plane is $$ w + x + y + 0z = w + x + y = 0. $$ Then our normal vector is $[1,1,1,0]$. All vectors on the plane are normal to that vector. However, the same is true for $[1,1,1,999]$, which points in a different direction, right (please tell me if this is actually pointing in the same direction and I'm an idiot)?

So essentially, we can't uniquely define a plane by a single vector (or $-1$ times that vector)? There are multiple vectors that can be orthogonal to that plane; is this correct?

How come, then, in machine learning our perceptron algorithm only uses a single $\theta$ (normal vector) to define the plane? Couldn't it be the case that a different normal vector could produce a different classification answer?


There are 3 answers below.

Best answer:

No, you need another scalar to determine the "location" of the hyperplane – the (signed) distance from origin. However, if you limit to only hyperplanes through origin, then that scalar is always 0, and the (normalized) hyperplane normal vector uniquely (except for sign) determines such planes.

Also, ignoring a nonzero real scale factor $\lambda$ ($0 \ne \lambda \in \mathbb{R}$), there is exactly one vector $\mathbf{n}$ normal (perpendicular) to a hyperplane.

That is, if $\mathbf{n}$ is normal to the hyperplane, then all $\lambda \mathbf{n}$ are also normal to the hyperplane. Conversely, if both $\mathbf{a}$ and $\mathbf{b}$ are normal to a hyperplane, then $\mathbf{a} = \lambda \mathbf{b}$. Do note that $\lambda$ can be positive or negative.
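As a concrete check of the example in the question (a quick sketch using numpy): $[1,1,1,999]$ is not a normal vector of that hyperplane, because it fails to be perpendicular to at least one vector lying on it.

```python
import numpy as np

# Hyperplane w + x + y + 0z = 0 in R^4, with normal n = (1, 1, 1, 0).
n = np.array([1.0, 1.0, 1.0, 0.0])
m = np.array([1.0, 1.0, 1.0, 999.0])  # the candidate "second normal"

# v = (0, 0, 0, 1) lies on the hyperplane, since w + x + y = 0 for it.
v = np.array([0.0, 0.0, 0.0, 1.0])

print(n @ v)  # 0.0   -> v is on the hyperplane, perpendicular to n
print(m @ v)  # 999.0 -> m is NOT perpendicular to v, so m is not a normal
```

So in $\mathbb{R}^4$, too, the normal of a hyperplane is unique up to a nonzero scale factor.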

You can also define a hyperplane parametrically using $N-1$ linearly independent vectors $\mathbf{b}_k$ ($k = 1, 2, \dots, N-1$) and optionally a local origin $\mathbf{o}$, as $$\mathbf{p} = \mathbf{o} + \sum_{k=1}^{N-1} y_k \mathbf{b}_k$$ where $y_k$ are the $N-1$ local coordinates on the hyperplane. But in this case, all $\mathbf{b}_k$ are perpendicular/normal to the hyperplane normal vector, i.e. $$\mathbf{b}_k \cdot \mathbf{n} = \sum_{i=1}^{N} b_{k,i} n_i = 0 ~ \text{for} ~ k = 1, 2, \dots, N-1$$

If those vectors $\mathbf{b}_k$ are nonzero and pairwise perpendicular ($\mathbf{b}_k \cdot \mathbf{b}_j = 0$ whenever $k \ne j$), then they form an orthogonal basis for the hyperplane.
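One way to construct such a basis numerically (a sketch, assuming numpy is available) is to take the orthogonal complement of $\mathbf{n}$ from an SVD: the rows of $V^\top$ after the first span the hyperplane through the origin.

```python
import numpy as np

n = np.array([1.0, 1.0, 1.0, 0.0])
N = n.size

# SVD of n viewed as a 1xN matrix; the first row of Vt is parallel to n,
# the remaining N-1 rows span its orthogonal complement (the hyperplane).
_, _, Vt = np.linalg.svd(n[np.newaxis, :])
B = Vt[1:]  # rows b_1, ..., b_{N-1}

print(np.allclose(B @ n, 0))                # True: every b_k . n = 0
print(np.allclose(B @ B.T, np.eye(N - 1)))  # True: b_k are orthonormal
```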


Let us define a hyperplane as an $N-1$-dimensional object in $N$-dimensional space, that divides that space into two separate subspaces. (This applies to affine hyperplanes and to vector hyperplanes, but not to projective hyperplanes.)

(In 1D, a hyperplane is a point, splitting the one-dimensional space into two parts. In 2D, a hyperplane is a line, infinitely long. In 3D, a hyperplane is a plane. And so on.)

The implicit equation for a hyperplane in $N$ dimensions in Cartesian coordinates is $$\mathbf{n} \cdot \mathbf{x} = \sum_{i=1}^{N} n_i x_i = d \tag{1}\label{1}$$ where $\mathbf{n} = (n_1, n_2, \dots, n_N)$ is the hyperplane normal vector, $\mathbf{x} = (x_1, x_2, \dots, x_N)$ is any point on the plane, and $d$ is the signed distance of the hyperplane from the origin, in units of the hyperplane normal's length.

The hyperplane passes through the origin when (and only when) $d = 0$.

Note that $\mathbf{n}$ must be a nonzero vector, $\sum_{i=1}^{N} n_i^2 \gt 0$. (In $\eqref{1}$, if $\mathbf{n}$ is a zero vector, the equation is either nowhere true (when $d \ne 0$) or everywhere true (when $d = 0$), i.e. it refers either to nothing or to the entire space. Neither is a valid hyperplane.)

We can label the subspaces the hyperplane divides space into, via $$\mathbf{n} \cdot \mathbf{x} = \sum_{i=1}^{N} n_i x_i = \begin{cases} \gt d, \text{"Positive" subspace} \\ = d, \text{On the hyperplane} \\ \lt d, \text{"Negative" subspace} \\ \end{cases} \tag{2}\label{2}$$ where "Positive" and "Negative" are not standard labels, just names I picked for illustration. In fact, in different contexts even concepts like "inside" can be either one ($\gt d$ or $\lt d$; or, including the hyperplane, $\ge d$ or $\le d$), so it is important to describe how one chooses to label the two subspaces. Note that the hyperplane itself has zero "thickness", so it is not a subspace: its hypervolume is zero. (It is usually included in one of the subspaces.)
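A minimal sketch of the labeling rule $\eqref{2}$ in Python (the function name `side` and the example points are just illustrative):

```python
import numpy as np

n = np.array([1.0, 1.0, 1.0, 0.0])  # hyperplane normal
d = 0.0                             # signed distance term

def side(x):
    """Classify point x per equation (2)."""
    s = n @ x - d
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "on hyperplane"

print(side(np.array([1.0, 0.0, 0.0, 0.0])))   # positive
print(side(np.array([-1.0, 0.0, 0.0, 5.0])))  # negative
print(side(np.array([1.0, -1.0, 0.0, 7.0])))  # on hyperplane
```

This is exactly the decision rule a perceptron applies: the sign of $\mathbf{n} \cdot \mathbf{x} - d$ picks the class.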

When hyperplanes are used as in $\eqref{2}$, they are called half-spaces. (A closed half-space, if it includes the hyperplane; an open half-space, if it excludes the hyperplane.) Convex polytopes with $K$ faces (noting that $K \gt N$) are defined by $K$ half-spaces corresponding to its faces.

Note how the hyperplane normal $\mathbf{n}$ alone does not uniquely define it; we also need the scalar, the signed distance from origin.

Because we can multiply both sides of $\eqref{1}$ by any nonzero real without affecting the hyperplane, the tuple $$\left(\frac{\mathbf{n}}{\sqrt{\sum_{i=1}^{N} n_i^2}} ; \frac{d}{\sqrt{\sum_{i=1}^{N} n_i^2}} \right) = \left( \frac{n_1}{\sqrt{\sum_{i=1}^{N} n_i^2}}, \frac{n_2}{\sqrt{\sum_{i=1}^{N} n_i^2}}, \dots, \frac{n_N}{\sqrt{\sum_{i=1}^{N} n_i^2}} ; \frac{d}{\sqrt{\sum_{i=1}^{N} n_i^2}} \right) \tag{4}\label{4}$$ uniquely defines a half-space. Scaling the equation by the Euclidean length of the hyperplane normal vector $\mathbf{n}$ effectively eliminates the magnitude of any scale factor.

Negating both $\mathbf{n}$ (negating each normal vector component $n_i$) and $d$ divides the space into the same two subspaces, but selects the other half-space.

That is, we can negate $d$ and all $n_i$ in $\eqref{1}$ without changing the definition of the hyperplane. Therefore $\eqref{4}$ uniquely defines each hyperplane only if we consider tuples $\left(\frac{n_1}{\sqrt{\sum_{i=1}^{N} n_i^2}}, \frac{n_2}{\sqrt{\sum_{i=1}^{N} n_i^2}}, \dots, \frac{n_N}{\sqrt{\sum_{i=1}^{N} n_i^2}} ; \frac{d}{\sqrt{\sum_{i=1}^{N} n_i^2}}\right)$ and $\left(\frac{-n_1}{\sqrt{\sum_{i=1}^{N} n_i^2}}, \frac{-n_2}{\sqrt{\sum_{i=1}^{N} n_i^2}}, \dots, \frac{-n_N}{\sqrt{\sum_{i=1}^{N} n_i^2}} ; \frac{-d}{\sqrt{\sum_{i=1}^{N} n_i^2}}\right)$ the same.

Alternatively but equivalently, you can say that after scaling $\mathbf{n}$ and $d$ by $\frac{1}{\sqrt{\sum_{i=1}^{N} n_i^2}}$, i.e. having unit normal vector $\hat{\mathbf{n}}$ ($\left\lVert\hat{\mathbf{n}}\right\rVert = \sqrt{\sum_{i=1}^{N} \hat{n}_i^2} = 1$) and signed distance from origin $d$ for the hyperplane, tuples $\left(\hat{\mathbf{n}}; d\right)$ uniquely define the hyperplane, if you consider $\left(\hat{\mathbf{n}}; d\right)$ and $\left(-\hat{\mathbf{n}}; -d\right)$ the same.
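A small numerical illustration (the `canonical` helper and its sign convention are made up for this sketch): two $(\mathbf{n}; d)$ tuples that differ by a nonzero scale factor, even a negative one, reduce to the same canonical $\left(\hat{\mathbf{n}}; d\right)$.

```python
import numpy as np

def canonical(n, d):
    """Scale (n; d) by 1/||n||, then fix the sign so that the first
    nonzero component is positive (one possible convention for
    identifying (n; d) with (-n; -d))."""
    t = np.append(n, d) / np.linalg.norm(n)
    first_nonzero = t[np.nonzero(t)[0][0]]
    return t if first_nonzero > 0 else -t

a = canonical(np.array([1.0, 1.0, 1.0, 0.0]), 2.0)
b = canonical(np.array([-3.0, -3.0, -3.0, 0.0]), -6.0)  # same plane, scaled by -3

print(np.allclose(a, b))  # True: both describe the same hyperplane
```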

Specifying unit normal vectors is useful in other ways as well. Not only is $d$ directly the signed distance from origin (instead of in units of normal length), but if $\hat{\mathbf{a}}$ and $\hat{\mathbf{b}}$ are two unit normals, the dihedral angle $\varphi$ between them fulfills $\cos\varphi = \hat{\mathbf{a}}\cdot\hat{\mathbf{b}} = \sum_{i=1}^{N} \hat{a}_i \hat{b}_i$.
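For instance (a sketch with numpy; the two unit normals are made up for illustration, and the `clip` guards against floating-point values slightly outside $[-1, 1]$):

```python
import numpy as np

a = np.array([1.0, 0.0, 0.0])                 # unit normal of plane A
b = np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0)  # unit normal of plane B

phi = np.arccos(np.clip(a @ b, -1.0, 1.0))
print(np.degrees(phi))  # ~45 degrees between the two hyperplanes
```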

Another answer:

The essential idea to understand is that of the Hodge dual. It goes as follows: in a $d$-dimensional space, there is a natural association between $n$-dimensional objects and $(d-n)$-dimensional objects. So, if we are working in 3D, there is a natural association between 1-dimensional objects and 2-dimensional objects. Here is a list of all the correspondences in 3D space:

Scalars <--> Volumes

Vectors <--> Areas

We can also see that in 2-dimensional space we have the correspondences:

Scalars <--> Areas

Vectors <--> Vectors

Let me be more specific about this association: if we build up the $n$-dimensional objects from a set of basis objects, then the size of that basis for the $n$-dimensional objects and the size of the basis for the $(d-n)$-dimensional objects are the same.
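This size claim can be checked by counting: the number of basis $n$-dimensional objects in a $d$-dimensional space is $\binom{d}{n}$, and $\binom{d}{n} = \binom{d}{d-n}$. A quick sketch:

```python
from math import comb

d = 3
for n_dim in range(d + 1):
    # Basis size for n-dim objects equals that for (d-n)-dim objects.
    print(n_dim, comb(d, n_dim), comb(d, d - n_dim))
```

In 3D this gives 1, 3, 3, 1: one scalar vs. one volume element, and three basis vectors vs. three basis areas, matching the table above.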

Of course, the above idea only defines the hyperplane up to parallel translation. To get a specific hyperplane or plane, you also need to specify a point on the object.

Hope this helps you.

Another answer:

Just use the simple abstraction of a hyperplane, defined as follows:

A hyperplane is a set of the form $H=\{x\in \mathbb{R}^n \mid a^\top x=b\}$ for some fixed $a\in \mathbb{R}^n \setminus \{0\}$ and $b\in \mathbb{R}$. Here, $a$ is called a normal vector to the hyperplane $H$, with unique (up to sign) unit direction $\hat{a} = a/\lVert a\rVert$.

Further, since a Euclidean space is a Hilbert space, the normal vector $a$ represents the linear functional $x \mapsto a^\top x$ with respect to the dot product as the inner product. So, per the Riesz representation theorem, we may regard the normal vector $a$ as the element of the dual space corresponding to that functional.