How was Ax + By + Cz + D = 0, an equation for a plane in 3D, "derived"?


To preface the question: with respect to the topic at hand, I am an absolute beginner. I've only very recently dived into a lesson in my book that serves as an intro of sorts to vectors and planes in 3D. Internet searches on this question, where they turn up anything at all, only leave me more confused.

So far, I've learned about various ways to represent planes:

  1. With the normal vector of the plane + distance from the origin : $\vec a \bullet\hat n = d$
  2. With the normal vector of the plane + a point on the plane : $(\vec r - \vec a) \bullet \hat n = 0$
  3. With 3 non-collinear points : $(\vec r - \vec a).((\vec b - \vec a) \times (\vec c - \vec a)) = 0$

The fourth one, the intercept form, introduces (before moving on to the rest of the derivation) a new equation for a plane, which I've not seen before:

$Ax + By + Cz + D = 0$

I'd initially thought that this was a rearrangement of $(\vec r - \vec a) \bullet \hat n = 0$

That is, if $\vec r$ was some vector of the form $x \hat i + y \hat j + z \hat k$, and $\hat n$ was $A \hat i + B \hat j + C\hat k$, it would yield $Ax + By + Cz - D = 0$, since $\vec a \bullet \hat n$ is the distance from the origin to the plane.

However, the equation given in the derivation of the intercept form is $Ax + By + Cz + D = 0$, in which there is no negative $D$.

How exactly was $Ax + By + Cz + D = 0$ 'derived', so to speak? I can't seem to make sense of it or intuit it.


6
On BEST ANSWER
  1. With the normal vector of the plane + a point on the plane : $$(\vec r - \vec a) \bullet \hat n = 0$$

In this second representation, it is not necessary for the normal vector to be a unit normal, so we can write $$(\vec r - \vec a) \cdot \vec n = 0$$ instead. So, we have $$\begin{pmatrix} x-a_1 \cr y-a_2 \cr z-a_3 \end{pmatrix}\cdot \begin{pmatrix} n_1 \cr n_2 \cr n_3\end{pmatrix}=0,$$ in other words, $$n_1x+n_2y+n_3z-n_1a_1-n_2a_2-n_3a_3=0.$$ Letting \begin{align}A&=n_1\\B&=n_2\\C&=n_3\\D&=-n_1a_1-n_2a_2-n_3a_3,\end{align} the equation becomes $$Ax + By + Cz + D = 0.$$
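To make the substitution concrete, here is a minimal Python sketch of the same bookkeeping. The function name `plane_coefficients` and the sample point and normal are illustrative assumptions, not part of the derivation above.

```python
def plane_coefficients(point, normal):
    """Return (A, B, C, D) for the plane through `point` with normal `normal`.

    Follows the substitution above: A, B, C are the normal's components
    and D = -(n1*a1 + n2*a2 + n3*a3). The normal need not be a unit vector.
    """
    a1, a2, a3 = point
    n1, n2, n3 = normal
    A, B, C = n1, n2, n3
    D = -(n1 * a1 + n2 * a2 + n3 * a3)
    return A, B, C, D


# Hypothetical example: plane through (1, 2, 3) with normal (1, 2, 2).
A, B, C, D = plane_coefficients((1, 2, 3), (1, 2, 2))
print(A, B, C, D)  # 1 2 2 -11

# The chosen point satisfies Ax + By + Cz + D = 0.
print(A * 1 + B * 2 + C * 3 + D)  # 0
```

Note that $D$ comes out negative here only because $\vec n \cdot \vec a$ happens to be positive for this choice of point and normal.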

P.S. The term $-D$ could well be a positive number; read the minus sign in front of $D$ not as an adjective/description ("$\require{cancel} \xcancel{\text{negative}} D$") but as a verb/operation ("minus $D$").


Addendum

So $D$ here is merely the negative of the distance $d?$ If so, why precisely is it taken as the negative as opposed to the positive? As in $Ax+By+Cz−d=0$ where $d$ is the perpendicular distance to the origin along the normal? Did it come about as the result of an agreed-upon convention?

  1. Read “-5” as “minus 5” instead of “negative 5”. If $D$ equals $(-7),$ then both $(-D)$ and $|D|$ are positive and equal $7.$ Prefixing a number with a minus sign flips its sign; on the other hand, taking the absolute value of a number converts any negative sign to positive.

  2. Your first way of representing planes contains a typo: the equation is $\vec r \cdot \hat n = d$ instead of $\vec a \cdot \hat n = d.$ Here, $\hat n$ has unit length, and the plane's distance from the origin is $|d|$ instead of $d.$

    Similarly, in your second way of representing planes, the plane's distance from the origin is actually $\frac{|D|}{\sqrt{A^2+B^2+C^2}}$ (the denominator equals $1$ if we've been using a unit normal).

    To be clear: while the plane's distance from the origin is naturally nonnegative, $d$ and $D$ may be negative or zero or positive.

    For example, the planes $x+2y+2z+9=0$ and $x+2y+2z-9=0$ are both $3$ units from the origin.
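The distance formula above can be checked numerically with a short Python sketch; the helper name `distance_from_origin` is an assumption made for illustration.

```python
import math


def distance_from_origin(A, B, C, D):
    """Distance from the origin to the plane Ax + By + Cz + D = 0."""
    return abs(D) / math.sqrt(A * A + B * B + C * C)


# The two planes from the example differ only in the sign of D.
print(distance_from_origin(1, 2, 2, 9))   # 3.0
print(distance_from_origin(1, 2, 2, -9))  # 3.0
```

The sign of $D$ only records on which side of the origin the plane lies relative to the chosen normal; the absolute value discards that information.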

Why exactly is $\frac D{\sqrt{A^2+B^2+C^2}}$ taken to be the negative of $d?$ Why wasn't it taken to be the positive of $D?$ Is it just convention?

  1. Notice that "the positive of $(-3)$" can be reasonably interpreted both as +3 and -3; "the negative of $(-3)$" is just as confusing.

  2. I think you mean to ask why $D$ and $d$ have opposite signs. When you rewrite the plane's equation $$Ax + By + Cz + D = 0$$ as $$Ax + By + Cz = D_2$$ (neither equation is more conventional than the other), then $$\dfrac{D_2}{\sqrt{A^2+B^2+C^2}}=d=\dfrac{-D}{\sqrt{A^2+B^2+C^2}}.$$
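As a worked instance, take the plane $x+2y+2z+9=0$ from the example earlier in this answer, keeping the (non-unit) normal $(1,2,2)$:

$$\begin{aligned}
D &= 9, \qquad D_2 = -D = -9, \qquad \sqrt{A^2+B^2+C^2} = \sqrt{1+4+4} = 3,\\
d &= \frac{D_2}{\sqrt{A^2+B^2+C^2}} = \frac{-9}{3} = -3, \qquad |d| = 3.
\end{aligned}$$

So $d$ and $D$ have opposite signs simply because $D$ sits on the other side of the equals sign, and the distance from the origin is $|d| = 3$ either way.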

3
On

It's a special case of the general definition of a hyperplane $c \cdot x = d$ for $c, x \in \mathbb{R}^n, d \in \mathbb{R}$.

The intuition is that a plane has a vector "sticking out" of it (perpendicular to the vector between any two points on the plane). That is: a vector $c$ such that for any $x, y$ on the plane, $c \cdot (x-y) = 0$. Equivalently, $c \cdot x = c \cdot y$, so $c \cdot z$ takes the same constant value $d$ for every point $z$ on the plane.
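A small Python sketch can illustrate this; the vector `c` and the three sample points are hypothetical values chosen to lie on the plane $x+2y+2z=9$.

```python
def dot(u, v):
    """Dot product of two equal-length tuples."""
    return sum(ui * vi for ui, vi in zip(u, v))


def sub(u, v):
    """Componentwise difference u - v."""
    return tuple(ui - vi for ui, vi in zip(u, v))


c = (1, 2, 2)                                  # normal vector (assumed)
points = [(9, 0, 0), (1, 2, 2), (-1, 1, 4)]    # all satisfy x + 2y + 2z = 9

# c . z is the same constant for every point z on the plane.
values = [dot(c, p) for p in points]
print(values)  # [9, 9, 9]

# c is perpendicular to the vector between any two points on the plane.
print(dot(c, sub(points[0], points[1])))  # 0
print(dot(c, sub(points[1], points[2])))  # 0
```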

0
On

It might also be worth realizing that the different ways of representing the plane that OP mentions haven't been "derived" from each other in any really meaningful sense of the word. What you get are equivalent definitions of the same object, where both "same" and "object" are doing quite a bit of heavy lifting. $\mathbb{R}^3$ can be viewed through many lenses and, other than as a Cartesian product of sets, is not particularly well defined.

Note that when you talk about a plane being defined by a normal vector and a distance from the origin, you're usually working with two different $\mathbb{R}^3$'s: one is a set of points and the other is a set of vectors. On the other hand, if you're looking at $Ax+By+Cz+D=0$, you're usually thinking of just the points, and the vector is there only as a sort of emanation in the coefficients. While it's perfectly possible to transform between these views, and most people will just do it completely automatically since they are all isomorphic in a canonical way, it can create some confusion for beginners.

A different way of thinking of the $Ax+By+Cz+D=0$ equation for a plane (and the oldest by far, I believe) is from an algebraic point of view. Assuming $A$, $B$, $C$ are not all zero, you have one linear equation in three variables giving you one constraint on your space, thus defining a plane.

This approach works the same in higher dimensions, where one nondegenerate linear equation gives you a hyperplane (a space of one lower dimension).

3
On

From your question and comments I assume you are looking mostly for some sort of didactic approach to $Ax+By+Cz=D$, or for a way to visualize it. As other answers have shown, it is equivalent to the vector formulas you have provided via a relatively trivial transformation.

To the visual/intuitive approach, if you have never heard of vectors:

  • Imagine the formula for a one-dimensional line in 2D space, through the origin: $f(x)=Ax$ or, to use a similar form as yours, $y=Ax$. This is your prototype of what it means to be "linear". One value is directly proportional to another (for $x\ne0$ you can transform to $A=y/x$), which is immediately related to the "incline" or gradient of the line. All these terms are naively equivalent in this area, and the intuitive meaning is "a larger value of $A$ means that for a given movement along the $x$ axis you end up higher on the $y$ axis". Note that this formula can readily be transformed to the general 1-line in 2D: $Ax+By=C$, with $B=-1$ and $C=0$.
  • The next escalation is moving this line away from the origin: $y=Ax+C$. It is quickly seen that $C$ means "the place where the line intersects the $y$ axis". Obviously the "form" of the graph is still a line, or linear, since the $Ax$ term stays the same, and the gradient also does not change. The line is simply shifted. The full form is $Ax+By=C'$ where $B=-1$ and $C'=-C$. Note that this formula cannot represent a vertical line.
  • The next step is going to the full 1-line in 2D: $Ax+By+C=0$ with arbitrary (positive, negative or zero) coefficients. Solving this for $y$ gives $y=-(A/B)x-C/B$, so for all $B\ne0$ there is nothing new. If $B=0$, we end up with $Ax+C=0$ or $x=-C/A$, which is a vertical line cutting the $x$ axis at $-C/A$. If you wish to think more in terms of the $f()=...$ form, you can take a formula $f(y)=-C/A$ and just turn your graph paper by $90°$ (or swap the labels on your $x$ and $y$ axes), but this is not that useful.
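The case split in the last bullet can be written as a minimal Python sketch. The function name `describe_line` and its return format are assumptions made for illustration only.

```python
def describe_line(A, B, C):
    """Classify the 2D line Ax + By + C = 0.

    Returns ("sloped", gradient, y_intercept) when B != 0,
    or ("vertical", x_intercept) when B == 0.
    """
    if A == 0 and B == 0:
        raise ValueError("A and B cannot both be zero")
    if B == 0:
        # Ax + C = 0  ->  x = -C/A
        return ("vertical", -C / A)
    # y = -(A/B) x - C/B
    return ("sloped", -A / B, -C / B)


print(describe_line(2, -1, 3))  # ('sloped', 2.0, 3.0), i.e. y = 2x + 3
print(describe_line(2, 0, 4))   # ('vertical', -2.0),   i.e. x = -2
```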

All of this just shows that you can intuitively work with the $A, B, C$ constants by playing around a bit, and by setting some of them to $0$ individually to see what they actually do. For the 2-line (or plane) in 3D, you can continue the same:

  • You can start with a formula like $f(x,y)=Ax+By+D$, which is equivalent to $z=Ax+By+D$ or $Ax+By+Cz=D'$ (with $C=-1$ and $D'=-D$). Then imagine standing somewhere on the "floor" plane and looking up or down; by knowing your "2D position" $(x,y)$ you immediately find the plane above/below/through. By setting any individual constant of $A, B$ to 0, or fixing the quotient $A/B$ or $B/A$ to an arbitrary value, it is very easy to see that everything behaves in a "planar" fashion again (e.g., $f(x,y)=D$ is just the "floor" plane shifted up or down; $f(x,y)=Ax+D$ is the "floor" plane tilted about the $y$ axis and intersecting the $z$ axis at $z=D$, and so on and so forth).
  • As in the case of the line in 2D, it is relatively clear (if you just imagine these things in the real world, intuitively) that you can represent almost all planes generally by rotating the "floor" plane around $x$ and $y$ arbitrarily, and then shifting it along $z$.
  • Again, we would be happy and finished if we never wanted to represent a plane intersecting the "floor" plane at $90°$. This is not possible with the previous form, because it cannot represent a tilt of $90°$: we're working with gradients here, not angles. For this, we finally move to the full representation $Ax+By+Cz=D$. A vertical plane has $C=0$; with the proper transformations, any such plane can be brought to an equivalent form $A'x+C'z=D'$ or $B''y+C''z=D''$, which geometrically amounts to rotating the space by $90°$ about some axis, or re-labeling the axes, the same as we did in the case of the vertical line before.
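The "stand on the floor and look up" picture can be sketched in Python as follows. It uses the $Ax+By+Cz=D$ convention of this answer; the function name `height_over_floor` and the sample planes are hypothetical.

```python
def height_over_floor(A, B, C, D, x, y):
    """Height z of the plane Ax + By + Cz = D above the floor position (x, y).

    Returns None when C == 0: the plane is then vertical, so it has no
    single height above a given (x, y).
    """
    if C == 0:
        return None
    return (D - A * x - B * y) / C


# Tilted plane x + 2y + 2z = 9, looked at from the floor position (1, 2).
print(height_over_floor(1, 2, 2, 9, 1, 2))  # 2.0

# Shifted "floor" plane z = 4 (A = B = 0): same height everywhere.
print(height_over_floor(0, 0, 1, 4, 7, -3))  # 4.0

# Vertical plane x + 2y = 9 (C = 0): no height function exists.
print(height_over_floor(1, 2, 0, 9, 1, 2))  # None
```

The `None` branch is exactly the case that the $z=Ax+By+D$ form cannot express, which is why the full form with an explicit $C$ is needed.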

Note that "visual" here literally means to stop worrying too much about the formula and to draw the situation with a pen at every step; that way it quickly becomes intuitive.

2
On

This is not a full answer but, in my opinion, it provides good intuition.

One homogeneous linear equation (that is, one with no constant term) in the three unknowns of a 3D space ($n=3$) pins down a linear subspace of dimension $(n-1)$, that is, a plane passing through the origin.

Adding a constant ($D$) makes it an affine equation, which allows the plane to be translated away from the origin.
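The difference between the two cases can be seen in a short Python sketch. The helper names and the sample points are assumptions chosen for illustration, using the normal $(1,2,2)$.

```python
def on_plane(A, B, C, D, p):
    """True if the point p = (x, y, z) satisfies Ax + By + Cz + D = 0."""
    x, y, z = p
    return A * x + B * y + C * z + D == 0


def add(p, q):
    """Componentwise sum of two points/vectors."""
    return tuple(pi + qi for pi, qi in zip(p, q))


origin = (0, 0, 0)

# Linear case, D = 0: x + 2y + 2z = 0 contains the origin and is closed
# under addition, as a linear subspace must be.
p, q = (2, -1, 0), (0, 1, -1)
print(on_plane(1, 2, 2, 0, origin))     # True
print(on_plane(1, 2, 2, 0, add(p, q)))  # True

# Affine case, D = -9: x + 2y + 2z - 9 = 0 is the same plane translated
# away from the origin, so neither property holds any more.
p2, q2 = (9, 0, 0), (1, 2, 2)
print(on_plane(1, 2, 2, -9, origin))       # False
print(on_plane(1, 2, 2, -9, add(p2, q2)))  # False
```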