How is this a coherent definition of "hyperplane"?


My textbook says the following:

A hyperplane is a set of the form

$$\{ x \mid a^T x = b \},$$

where $a \in \mathbb{R}^n$, $a \neq 0$, and $b \in \mathbb{R}$. Analytically it is the solution set of a nontrivial linear equation among the components of $x$ (and hence an affine set). Geometrically, the hyperplane $\{ x \mid a^T x = b \}$ can be interpreted as the set of points with a constant inner product to a given vector $a$, or as a hyperplane with normal vector $a$; the constant $b \in \mathbb{R}$ determines the offset of the hyperplane from the origin. This geometric interpretation can be understood by expressing the hyperplane in the form

$$\{ x \mid a^T(x - x_0) = 0 \},$$

where $x_0$ is any point in the hyperplane (i.e., any point that satisfies $a^T x_0 = b$).

I find the coherence of the hyperplane definition $\{ x \mid a^T(x - x_0) = 0 \}$ questionable. We are trying to define a hyperplane in the first place, and the author defines it as $\{ x \mid a^T(x - x_0) = 0 \}$; but embedded in this definition is the requirement that $x_0$ be a point in the hyperplane. So, while constructing the definition of "hyperplane", the author uses a term ($x_0$) that is itself defined in terms of the still-incomplete definition. In other words, the "definition" presumes that we have already defined what a hyperplane is. This seems incoherent to me: surely a mathematical definition (indeed, any definition) cannot be self-referential in this way?

I would appreciate it if people could please take the time to clarify this.

EDIT:

The way I see it, there are two possibilities: (1) I am misunderstanding $\{ x \mid a^T(x - x_0) = 0 \}$, and it is actually a special case (a specific type) of $\{ x \mid a^T x = b \}$, or (2) $\{ x \mid a^T x = b \}$ and $\{ x \mid a^T(x - x_0) = 0 \}$ are equivalent, which means that they must both be definitions of a hyperplane, which in turn means that the second is an incoherent "definition" because it is self-referential and cannot stand on its own. I'm honestly not sure and would appreciate help.

EDIT2:

I just did some sketching of hyperplanes in the ambient space $\mathbb{R}^2$ to get a better idea of the role of $x_0$ in hyperplanes. We begin by selecting any two points $a$ and $x_0$ in the ambient space $\mathbb{R}^2$, with $a \neq 0$. After this step, it seems that the geometry of the hyperplane is "locked in"; in other words, the selection of $a$ and $x_0$ determines the hyperplane. We then, as the definition $\{ x \mid a^T(x - x_0) = 0 \}$ suggests, find the points $x$ such that $x - x_0$ is orthogonal to $a$. The set of all $x$ such that $x - x_0$ is orthogonal to $a$ is a line, and this line is the hyperplane, which can also be expressed equivalently as $\{ x \mid a^T x = b \}$.
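The sketching procedure above can be mimicked numerically. What follows is only a minimal sketch: the specific values of `a` and `x0` are made up for illustration, and the direction `d` is an assumed way of walking along the line (it is orthogonal to `a` in $\mathbb{R}^2$).

```python
def dot(u, v):
    """Inner product of two vectors given as tuples."""
    return sum(ui * vi for ui, vi in zip(u, v))

a = (3, 1)      # hypothetical nonzero normal vector
x0 = (2, -1)    # hypothetical point chosen in the plane
b = dot(a, x0)  # offset implied by the choice of a and x0

# In R^2, d = (-a[1], a[0]) is orthogonal to a, so x0 + t*d stays on the line.
d = (-a[1], a[0])
points = [(x0[0] + t * d[0], x0[1] + t * d[1]) for t in range(-2, 3)]

# Every generated point should have the same inner product with a, namely b.
offsets = [dot(a, x) for x in points]
```

Each point generated this way satisfies both $a^T(x - x_0) = 0$ and $a^T x = b$ with $b = a^T x_0$, which matches the observation that choosing $a$ and $x_0$ fixes the line.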


Best answer

The two definitions of a hyperplane can be written in the following format:

Definition 1: A hyperplane in $\mathbb R^n$ is a subset which can be written in the form $\{x \mid a^T x = b\}$, for some $a \in \mathbb R^n$ such that $a \ne 0$, and for some $b \in \mathbb R$.

Definition 2: A hyperplane in $\mathbb R^n$ is a set which can be written in the form $\{x \mid a^T(x-x_0) = 0\}$ for some $a \in \mathbb R^n$ such that $a \ne 0$, and some $x_0 \in \mathbb R^n$.

These definitions are equivalent.

To see why Definition 1 implies Definition 2, suppose a set $H$ satisfies Definition 1 as witnessed by $a,b$. Then $H$ is not empty: since $a \ne 0$, we have $a^T a > 0$, and the point $\frac{b}{a^T a}\,a$ lies in $H$. Pick any $x_0 \in H$, so that $a^T x_0 = b$. Then $a^T x = b$ holds if and only if $a^T(x - x_0) = 0$, so $H$ also satisfies Definition 2 as witnessed by $a,x_0$.

Conversely, if $H$ satisfies Definition 2 as witnessed by $a,x_0$, then it also satisfies Definition 1 as witnessed by $a,b$ where $b = a^T x_0$, because $a^T(x - x_0) = 0$ is the same equation as $a^T x = a^T x_0$.
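The equivalence can also be checked on a small example. This is a rough sketch rather than a proof: the vector `a`, the offset `b`, the sample points, and the helper names are all made up for illustration, and the choice $x_0 = \frac{b}{a^T a}\,a$ is just one convenient point of $H$ (any other point of $H$ would do). Exact rational arithmetic is used so that equality tests are reliable.

```python
from fractions import Fraction

def dot(u, v):
    """Inner product of two vectors given as lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

def point_on_hyperplane(a, b):
    """Return x0 = (b / a.a) * a, which satisfies a.x0 = b whenever a != 0."""
    aa = dot(a, a)
    if aa == 0:
        raise ValueError("a must be nonzero")
    return [Fraction(b) * ai / aa for ai in a]

def in_def1(x, a, b):
    """Membership test for {x | a^T x = b}."""
    return dot(a, x) == b

def in_def2(x, a, x0):
    """Membership test for {x | a^T (x - x0) = 0}."""
    return dot(a, [xi - x0i for xi, x0i in zip(x, x0)]) == 0

a = [2, -1, 3]  # hypothetical normal vector
b = 7           # hypothetical offset
x0 = point_on_hyperplane(a, b)

# A few sample points, some on the hyperplane and one (the origin) off it.
samples = [[0, -7, 0], [1, 1, 2], [2, 0, 1], [0, 0, 0], [5, 3, 0]]
results = [(in_def1(x, a, b), in_def2(x, a, x0)) for x in samples]
```

For every sample point the two membership tests agree, as the argument above predicts.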

To summarize, the "coherence" issue that you bring up is a valid issue, but is settled by demonstrating that any $H$ that satisfies Definition 1 is not empty.

Another answer

In both cases you need two things to specify a hyperplane: a normal vector, and a point on the hyperplane. In the first case, the special point on the hyperplane is not made explicit, but is hidden in the affine translation $b$. Thinking by analogy to a line in a plane, recall that a line can be described in slope-intercept form, i.e. by an equation of the form $$ y = ax + b, $$ where $a$ is the slope and $b$ is the $y$-intercept. In particular, this line passes through the point $(0,b)$ in the plane. This point is not made explicit, but is part of this definition of a line. Rearranging things a bit, this becomes $$ \langle -a, 1 \rangle \cdot \langle x, y \rangle = b, $$ which is of the form $\mathbf{a}^T \mathbf{x} = b$.

Lines may also be described by a point-slope equation, i.e. $$ y - y_0 = a(x-x_0), $$ where $(x_0, y_0)$ is a point that the line passes through, and $a$ is the slope of the line. In a more vector-y notation, this can be written as $$ a(x-x_0) + (-1)(y-y_0) = \langle a, -1 \rangle \cdot \langle x-x_0, y-y_0 \rangle = 0, $$ which is of the form $\mathbf{a}^T(\mathbf{x} - \mathbf{x}_0) = 0$. In this description, a point on the line is made explicit.

Note that any line in point-slope form can be expressed in slope-intercept form, and vice versa: $$ y - y_0 = a(x-x_0) \implies y = ax + \underbrace{(y_0 - ax_0)}_{=b}, $$ and $$ y = ax + b \implies y - b = a(x-0). \tag{$(x_0,y_0) = (0,b)$}$$ We could therefore define a line via either description. They are completely equivalent. The definition of a hyperplane can be seen as a generalization of this same idea.
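As a concrete instance (the numbers are chosen purely for illustration), take the line $y = 2x + 3$. In the first form it reads $$ \langle -2, 1 \rangle \cdot \langle x, y \rangle = 3, $$ and the point $(1, 5)$ lies on it, since $-2 \cdot 1 + 5 = 3$. Using that point, the point-slope form is $$ y - 5 = 2(x - 1), \quad \text{i.e.} \quad \langle 2, -1 \rangle \cdot \langle x - 1, y - 5 \rangle = 0. $$ Expanding gives $2x - 2 - y + 5 = 0$, which is $y = 2x + 3$ again. Any other point on the line, such as $(0, 3)$, would serve equally well as $(x_0, y_0)$.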