Inconsistent notation for vectors and points in textbooks


Many books on calculus or advanced calculus distinguish between points and vectors. Usually points are denoted by italic letters like $P, Q$, and $R$, and vectors are denoted by bold letters such as $\mathbf{u}$ and $\mathbf{v}$. Some textbooks also put the components of a vector between angle brackets, while the coordinates of a point are simply placed between parentheses. However, the notation is not consistent throughout these books. At least, I have not seen any textbook that is consistent from beginning to end. Here are two samples:

Sample 1: Marsden and Tromba in Vector Calculus

As you can see from the following figure, the point $P$ is not bold but vectors $\mathbf{v}$ and $\mathbf{w}$ are bold.

[Figure omitted: excerpt from Marsden and Tromba showing the point $P$ and the vectors $\mathbf{v}$ and $\mathbf{w}$.]

However, a few pages later, they denote points by bold letters such as $\mathbf{x}_0$. [Image omitted.]

So a student might ask: is $\mathbf{x}_0$ a point or a vector?

Sample 2: Stewart Calculus

The equation of a line passing through $P_0$ and parallel to a vector $\mathbf{v}$ is described by "$$\mathbf{r}=\mathbf{r}_0+t\mathbf{v},$$ where $\mathbf{r}$ is the position vector of a point $P(x,y,z)$ and $\mathbf{r}_0$ is the position vector of $P_0$." So clearly here he does not add the vector $t\mathbf{v}$ to the point $P_0$. (I mean, he could simply write that the line is $\{P_0+t\mathbf{v} \mid t\in\mathbb{R}\}$.) But when he talks about the directional derivative, he adds a vector $\mathbf{u}$ to a point $\mathbf{x}_0$:

[Image omitted: Stewart's definition of the directional derivative.]

A student may ask: if $\mathbf{x}_0$ is a point why is it denoted by a bold letter? And what is the meaning of adding a vector to a point? Adding a vector to a point has not been defined in the textbook.

What is the best and consistent notation? What are the benefits of using angle brackets and parentheses for vectors and points? How do you avoid confusing students?


There are 2 best solutions below


Calculus textbooks seem to want to make a big distinction between vectors and points. I'm not sure how useful making that distinction is to students but it does seem to cause a lot of confusion. Here's how I think about it, which I hope is clear to my students. Having to use books like Stewart has made it challenging for me in the past to present a single view on points vs vectors.

What's the same: vectors and points in $\mathbb{R}^3$ both carry three bits of data. Each has an $x$, $y$, and $z$ coordinate. What's different is what those coordinates mean. For a point, we are talking about a position in space. For a vector $\langle a, b, c \rangle$, what we mean is something like "go $a$ units in the $x$-direction, $b$ in the $y$-direction, $c$ in the $z$-direction." This description of motion doesn't say where we start from.

The basic operations are point + vector = point and vector + vector = vector. If we replace the word "vector" with displacement in these pseudo-equations, it says point + displacement = new point and displacement 1 + displacement 2 = total displacement. Here are two examples that I might use. (I'll switch to $\mathbb{R}^2$ now so it's easier to write.)

Example 1: "Starting at the point $(1, 2)$ go $3$ units left and $1$ up to get to the point $(-2,3)$." As an equation, this is $(1,2) + \langle -3, 1 \rangle = (-2, 3)$. The vector "$\langle -3, 1 \rangle$" is represented in words as "go $3$ units left and $1$ up".

Example 2: "If you go $3$ units left and $1$ unit up and then go $4$ units right and $2$ down, it's the same as going $1$ unit right and $1$ down." As an equation this is $\langle -3, 1 \rangle + \langle 4, -2 \rangle = \langle 1, -1 \rangle$. Each vector on the left represents a portion of the total displacement represented on the right.
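The two pseudo-equations can also be sketched as types, which is one way to make the distinction concrete. This is only an illustrative sketch: the class names `Point` and `Vector` and their field names are my own choices, not notation from any of the textbooks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vector:
    """A displacement in the plane: 'go dx horizontally and dy vertically'."""
    dx: float
    dy: float

    def __add__(self, other):
        # displacement + displacement = total displacement
        if isinstance(other, Vector):
            return Vector(self.dx + other.dx, self.dy + other.dy)
        return NotImplemented


@dataclass(frozen=True)
class Point:
    """A position in the plane."""
    x: float
    y: float

    def __add__(self, other):
        # point + displacement = new point; point + point is deliberately undefined
        if isinstance(other, Vector):
            return Point(self.x + other.dx, self.y + other.dy)
        return NotImplemented

    def __sub__(self, other):
        # point - point = the displacement from `other` to `self`
        if isinstance(other, Point):
            return Vector(self.x - other.x, self.y - other.y)
        return NotImplemented


start = Point(1, 2)
end = start + Vector(-3, 1)            # go 3 left and 1 up from (1, 2)
total = Vector(-3, 1) + Vector(4, -2)  # two displacements combined
```

Here `end` is `Point(x=-2, y=3)` and `total` is `Vector(dx=1, dy=-1)`, while `start + start` raises a `TypeError`, mirroring the idea that adding two points has no meaning until an origin is chosen.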


For the books (I've used Stewart), I agree with you: I don't think most (any?) textbooks do a good job teaching about points vs vectors for exactly the reasons you've mentioned.

Here are my observations of how authors use notation (not rules for how you "have to" use notation but how authors do use it).

  1. Points can be written either as $P(a,b,c)$ with an unbolded capital letter (usually $P, Q, R,\dots$ and maybe $O$ for $(0,0,0)$), or in the same way as a vector (lower case and bold).

  2. When doing arithmetic, books never seem to want to add a point to a vector. That doesn't mean they don't, just that they usually butcher the explanation. For me, a line is a set of points, not a set of position vectors, so I would write $\{P + t\mathbf v : t \in \mathbb{R}\}$ with the understanding that the point $P$ plus the vector $t \mathbf v$ is a new point and the set of all those new points is the line.

I think the authors see a $+$ sign and think to themselves "well, I'm adding two things together and I said that we can't add two points together so I better convert everything to vectors." Of course, it makes perfect sense to add a vector to a point, but they never seem to want to do that. I think this leads to funny things like writing a line as a set of position vectors and writing points in lower case bold "vector-like" notation.
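To illustrate the "line as a set of points" reading, here is a small sketch that samples points $P_0 + t\mathbf{v}$. The helper name `point_on_line` and the numerical values are made up for the example; points and vectors are both stored as plain tuples, so the distinction lives only in how the arguments are interpreted.

```python
def point_on_line(p, v, t):
    """Return the point p + t*v, where p is a point and v is a direction vector."""
    return tuple(pi + t * vi for pi, vi in zip(p, v))


P0 = (1.0, 2.0, 0.0)   # a point on the line (illustrative values)
v = (2.0, -1.0, 3.0)   # direction vector (illustrative values)

# A few points of the set {P0 + t v : t in R}
samples = [point_on_line(P0, v, t) for t in (-1, 0, 1, 2)]
```

In coordinates this gives the same numbers as Stewart's $\mathbf{r} = \mathbf{r}_0 + t\mathbf{v}$; the only difference is whether each result is read as a point on the line or as the position vector of that point.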


I agree with Trevor Gunn's answer, so I'll just add a few more points from a theoretical perspective.

To summarize:

  1. There is no problem considering the sum of a point and a vector (e.g., $P + v$) to be a point, or the difference of two points (e.g., $Q - P$) to be a vector.

  2. If an origin $O$ is selected and forever fixed, there is no difficulty considering points and vectors to be the same thing, and performing "vector-like" operations on points (e.g., $P + Q$ or $\lambda P$) as Apostol does.

  3. The operations in point 2 depend on the choice of origin, while the operations in point 1 do not. Difficulties may arise with the operations in point 2 if in a given problem more than one origin is considered. In this case, it becomes important to retain a conceptual distinction between points and vectors.

The theoretical distinction between "point" and "vector", from an advanced viewpoint, flows from the concept of an affine space, which is itself a special case of the concept of a set acted on by a group.

Without going into too much detail, a vector space $V$ consists of a set of "vectors" which can be added together or multiplied by scalars. There must be a zero vector, and the operations $+$ and $\cdot$ must satisfy certain axioms.

An affine space over a particular vector space $V$ consists of a nonempty set $A$ of "points" together with an operation $+$ between points and vectors such that:

  • To any point $P$ and vector $v$, the operation assigns some point $P + v$.

  • We always have $(P + v) + w = P + (v + w)$.

  • For any points $P$ and $Q$, there is a unique vector $v$ such that $P + v = Q$. This vector $v$ is denoted $\overrightarrow{PQ}$.

So in the correct theoretical framework, there is certainly no difficulty in adding a vector to a point (and it is even necessary).

Furthermore, if we look at the definition of $\overrightarrow{PQ}$ as being the unique vector which, added to $P$, gives $Q$, then we see that it is not unreasonable at all to write $\overrightarrow{PQ} = Q - P$. The notation $\overrightarrow{PQ}$ is actually nothing but an alternative to this.
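As a short worked example of how the subtraction notation behaves, the three axioms above already give the familiar rule $\overrightarrow{PQ} + \overrightarrow{QR} = \overrightarrow{PR}$ for any points $P$, $Q$, $R$ (the point $R$ is introduced here only for this illustration):

$$
\begin{aligned}
P + \bigl(\overrightarrow{PQ} + \overrightarrow{QR}\bigr)
  &= \bigl(P + \overrightarrow{PQ}\bigr) + \overrightarrow{QR} && \text{(second axiom)} \\
  &= Q + \overrightarrow{QR} && \text{(definition of } \overrightarrow{PQ}\text{)} \\
  &= R && \text{(definition of } \overrightarrow{QR}\text{)}.
\end{aligned}
$$

Since $\overrightarrow{PR}$ is the unique vector that, added to $P$, gives $R$, it follows that $\overrightarrow{PQ} + \overrightarrow{QR} = \overrightarrow{PR}$. Written with differences, this reads $(Q - P) + (R - Q) = R - P$, which is exactly what the notation suggests.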

Obviously, in practice we often identify vectors and points. We do this by singling out an origin $O$ in $A$. Once that has been done, the mappings

  • $A \to V, \quad P \mapsto \overrightarrow{OP}$,
  • $V \to A, \quad v \mapsto O + v$,

establish a one-to-one correspondence between points and vectors.

Thus if we always identify a point $P$ and the corresponding vector $\overrightarrow{OP}$, there is no problem defining on points all the same operations as on vectors. For instance, we can define $\lambda P$ to be the unique point $Q$ such that $\lambda \overrightarrow{OP} = \overrightarrow{OQ}$. Similarly, we can define $P + Q$ to mean the unique point $R$ such that $\overrightarrow{OP} + \overrightarrow{OQ} = \overrightarrow{OR}$.

The important thing to remember here is that the meaning of these "operations" on points changes if we select a different origin $O$. (That is, for instance, $P + Q$ will be a different point if we change the origin $O$ to a different origin $O'$.) If in the context of a given problem, we will only ever have one origin $O$, then no difficulties arise. I will assume this is the case below.
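The dependence on the origin can be checked numerically. The following is a minimal sketch, assuming all points are given by coordinates in one fixed coordinate system of the plane and that an alternative origin is simply another point in that system; the function names `add_points` and `difference` are hypothetical.

```python
def add_points(p, q, origin):
    """'Sum' of the points p and q relative to `origin`:
    the point R such that OR = OP + OQ."""
    op = tuple(pi - oi for pi, oi in zip(p, origin))  # vector from origin to p
    oq = tuple(qi - oi for qi, oi in zip(q, origin))  # vector from origin to q
    return tuple(oi + a + b for oi, a, b in zip(origin, op, oq))


def difference(q, p):
    """Vector from p to q; no origin is involved."""
    return tuple(qi - pi for qi, pi in zip(q, p))


P, Q = (1, 2), (3, 1)
sum_O = add_points(P, Q, origin=(0, 0))        # (4, 3)
sum_O_prime = add_points(P, Q, origin=(1, 1))  # (3, 2)
pq = difference(Q, P)                          # (2, -1), whatever the origin
```

The two "sums" differ, while $Q - P$ never refers to an origin at all, which is the sense in which the operations of point 1 are intrinsic and those of point 2 are not.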

In particular, if a coordinate system, say $(x,y,z)$, has been chosen on $V$, then we can assign coordinates to points $P$ by the following rule: the coordinates of $P$ are those of the vector $\overrightarrow{OP}$.

In that case, given a point $P = (x,y,z)$ and a vector $v = \langle x', y', z' \rangle$, the coordinates of the point $P + v$ are $(x + x', y + y', z + z')$. Thus the rule for adding a point and a vector in coordinates is the same as for adding two vectors, further justifying the fact that it is permissible to consider points and vectors to be the same thing.

Similarly, since we have $\overrightarrow{PQ} = \overrightarrow{OQ} - \overrightarrow{OP}$, writing $Q - P$ for $\overrightarrow{PQ}$ presents no problems.