What are differences between affine space and vector space?

I know similar questions have been asked, and I have looked at them, but none of them seems to have a satisfactory answer. I am reading the book A Course in Mathematics for Students of Physics, vol. 1, by Paul Bamberg and Shlomo Sternberg. In Chapter 1 the authors define affine space and write:

The space $\Bbb{R}^2$ is an example of a vector space. The distinction between vector space $\Bbb{R}^2$ and affine space $A\Bbb{R}^2$ lies in the fact that in $\Bbb{R}^2$ the point (0,0) has a special significance ( it is the additive identity) and the addition of two vectors in $\Bbb{R}^2$ makes sense. These do not hold for $A\Bbb{R}^2$.

Please explain.

Edit:

How come $A\Bbb{R}^2$ has a point (0,0) without special significance? And why does the addition of two vectors in $A\Bbb{R}^2$ not make sense? Please give concrete examples instead of abstract answers. I am a physics major and have done courses in Calculus, Linear Algebra and Complex Analysis.

8

There are 8 best solutions below

8
On BEST ANSWER

Consider the vector space $\mathbb{R}^3$. Inside $\mathbb{R}^3$ we can choose two planes, as in the picture below. We'll call the green one $P_1$ and the blue one $P_2$. The plane $P_1$ passes through the origin but the plane $P_2$ does not. It is a standard homework exercise in linear algebra to show that $P_1$ is a sub-vector space of $\mathbb{R}^3$ but the plane $P_2$ is not. However, the plane $P_2$ looks almost exactly the same as $P_1$, having the exact same flat geometry, and in fact $P_2$ and $P_1$ are simply translates of one another. This plane $P_2$ is a classical example of an affine space.

[Figure: two parallel planes in $\mathbb{R}^3$, the green plane $P_1$ through the origin and the blue plane $P_2$ not through the origin]

Suppose we wanted to turn $P_2$ into a vector space. Would it be possible? Sure. What we would need to do is align $P_2$ with $P_1$ using some translation, and then use this alignment to re-define the algebraic operations on $P_2$. Let's make this precise. If $T: P_2 \to P_1$ is the alignment, for $p,q \in P_2$ we'll define $p \oplus q = T^{-1}(T(p) + T(q))$. In words, we shift $p$ and $q$ down to $P_1$, add them, and then shift the result back. Note that this is different from simply adding $p+q$, as this vector need not lie on $P_2$ at all (this is one of the reasons $P_2$ is not a vector space: it is not closed under addition).

There are, however, many ways of aligning $P_2$ with $P_1$, and so many different ways of turning $P_2$ into a vector space, and none of them is canonical. Here is one way to make these alignments: pick a vector $v \in P_2$, and translate $P_2$ by $-v$, so that $T(p) = p-v$. This translates $P_2$ onto $P_1$ and sends $v$ to $0$. This approach of "redefining some chosen vector to be the zero vector" always works to turn an affine space into a vector space.
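To see how much the result depends on the chosen $v$, here is a minimal Python sketch. The concrete planes ($P_1$ as $z=0$, $P_2$ as $z=1$), the sample points, and the helper names `add`, `sub`, `in_P2` and `oplus` are all assumptions made for illustration, not part of the answer above.

```python
# Assumed concrete setup: P1 is the plane z = 0, P2 is the plane z = 1.

def add(u, w):
    """Ordinary componentwise addition in R^3."""
    return tuple(a + b for a, b in zip(u, w))

def sub(u, w):
    """Ordinary componentwise subtraction in R^3."""
    return tuple(a - b for a, b in zip(u, w))

def in_P2(p):
    """Membership test for the assumed plane P2 (z = 1)."""
    return p[2] == 1

def oplus(p, q, v):
    """p (+) q = T^{-1}(T(p) + T(q)) with the alignment T(x) = x - v."""
    return add(add(sub(p, v), sub(q, v)), v)

p = (1, 2, 1)
q = (3, -1, 1)

naive = add(p, q)        # lands on z = 2, so it leaves P2

v1 = (0, 0, 1)           # one choice of "zero vector" in P2
v2 = (5, 5, 1)           # another, equally valid choice

s1 = oplus(p, q, v1)     # stays in P2
s2 = oplus(p, q, v2)     # stays in P2, but is a different point

print(naive, s1, s2)
```

Both `s1` and `s2` lie on $P_2$, yet they disagree, which is the non-canonicity described above in concrete numbers.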

If you want to do algebra on $P_2$ without picking a "zero vector", you can use the following trick: instead of trying to add together vectors in $P_2$ (which, as we've seen, need not stay in $P_2$), you can add vectors in $P_1$ to vectors in $P_2$. Note that if $v_1 \in P_1$ and $v_2 \in P_2$ then $v_1 + v_2 \in P_2$. What we obtain is a funny situation where the addition takes place between two sets: a vector space $P_1$ on the one hand, and the non-vector-space $P_2$ on the other. This lets us work with $P_2$ without having to force it to be a vector space.

Affine spaces are an abstraction and generalization of this situation.

9
On

The easiest way for me to tell the two structures apart is their axioms.

A vector space is an algebraic object with its characteristic operations, and an affine space is a group action on a set, specifically a vector space acting on a set faithfully and transitively.

Why do we say that the origin is no longer special in the affine space? The issue is that both $V$ and $X$ are usually written as $\Bbb R^n$, although we are thinking of each of the two copies of this in different ways. The deal is that the set $X=\Bbb R^n$ really doesn't distinguish any of its elements... they're all the same. But in the vector space $\Bbb R^n$, you can spot the origin right away, called out in the axioms.

Why do we say affine points can be subtracted but not added? That makes it seem like there are indeed operations within the affine space just like there are in the vector space, blurring the picture.

The reason is precisely transitivity: if $V$ acts on $X$ so that $X$ is an affine space, then for any $x,y\in X$, there is a $v\in V$ such that $v + x = y$ (and this $v$ is unique, because a faithful transitive action of an abelian group is automatically free). I've written the group action additively here, but it is suggestive to rewrite this as $y-x = v$ and confuse the element $v$ of the vector space with an element of $X$.
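One way to keep the two roles apart is to give them different types. The following Python sketch is only an illustration: the class names `Point` and `Vector`, the choice of two dimensions, and writing the action as `x + v` rather than $v + x$ are all assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Vector:
    """An element of the vector space V (here R^2)."""
    dx: float
    dy: float

    def __add__(self, other):
        if isinstance(other, Vector):
            return Vector(self.dx + other.dx, self.dy + other.dy)
        return NotImplemented

@dataclass(frozen=True)
class Point:
    """An element of the set X on which V acts."""
    x: float
    y: float

    def __add__(self, other):
        # The action of V on X: point + vector = point.
        if isinstance(other, Vector):
            return Point(self.x + other.dx, self.y + other.dy)
        return NotImplemented  # point + point is deliberately undefined

    def __sub__(self, other):
        # The vector v with x + v = y, written y - x.
        if isinstance(other, Point):
            return Vector(self.x - other.x, self.y - other.y)
        return NotImplemented

x = Point(1, 2)
y = Point(4, 6)
v = y - x                  # a Vector, not a Point

try:
    x + y                  # no such operation on X
    point_sum_allowed = True
except TypeError:
    point_sum_allowed = False

print(v, x + v == y, point_sum_allowed)
```

Subtraction of points is available only because it returns something of a different type, which is the distinction the paragraph above is pointing at.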

0
On

An example: Consider an $(m\times n)$ system of linear equations: $$\sum_{k=1}^n b_{ik}\>x_k=c_i\qquad(1\leq i\leq m)\ ,\tag{1}$$ where $d:=n-{\rm rank}(B)\geq1$, and ${\bf c}\ne{\bf 0}\in{\mathbb R}^m$. When this system has at least one solution ${\bf x}_p$ ($p$ for "particular") then the full set of solutions is a $d$-dimensional affine space $A\subset{\mathbb R}^n$. Two points in $A$ cannot be added to produce a new point in $A$, nor can points be scaled in $A$, and there is no distinguished point in $A$ that may serve as origin.

However, you can say the following: Having found a point ${\bf x}_p\in A$ by whichever means, you can declare this point as "origin" of $A$ and then introduce coordinates in $A$ as follows: The homogeneous system $$\sum_{k=1}^n b_{ik}\>x_k=0\qquad(1\leq i\leq m)$$ associated to $(1)$ has $d$ linearly independent solutions ${\bf f}_j\in{\mathbb R}^n$ $\>(1\leq j\leq d)$, and the set $A$ can then be written as $$A=\left\{{\bf x}_p+\sum_{j=1}^d y_j{\bf f}_j\ \Biggm|\ y_j\in{\mathbb R} \quad (1\leq j\leq d)\right\}\ .$$ The $y_j$ can then serve as coordinates in $A$, so that $A$ looks as if it were a $d$-dimensional coordinate space. But note that "addition" in this space refers to the chosen point ${\bf x}_p$, and not to the origin of the base space ${\mathbb R}^n$.
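A small numerical check may help here. The system below ($m=1$, $n=3$, the single equation $x_1+x_2+x_3=3$, so $d=2$), the particular solution, and the helper names are assumptions chosen for illustration only.

```python
# Assumed toy system: x1 + x2 + x3 = 3  (m = 1, n = 3, rank 1, d = 2).
B = [[1, 1, 1]]
c = [3]

def apply(matrix, x):
    """Compute the matrix-vector product as a plain list."""
    return [sum(b * xi for b, xi in zip(row, x)) for row in matrix]

def is_solution(x):
    return apply(B, x) == c

x_p = (1, 1, 1)      # a particular solution
f1 = (1, -1, 0)      # two independent solutions of the homogeneous system
f2 = (1, 0, -1)

def point(y1, y2):
    """The point of A with coordinates (y1, y2) relative to x_p, f1, f2."""
    return tuple(xp + y1 * a + y2 * b for xp, a, b in zip(x_p, f1, f2))

u = point(2, 0)
w = point(0, 5)

plain_sum = tuple(a + b for a, b in zip(u, w))        # u + w
scaled = tuple(2 * a for a in u)                      # 2u
affine = tuple(2 * a - b for a, b in zip(u, w))       # 2u - w, coefficients sum to 1

print(is_solution(u), is_solution(w))                 # both lie in A
print(is_solution(plain_sum), is_solution(scaled))    # neither lies in A
print(is_solution(affine))                            # lies in A again
```

The last line hints at what does survive in $A$: combinations whose coefficients sum to one.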

1
On

Consider an infinite sheet (of idealised paper, if you like). If it is blank, then there is absolutely no way to distinguish between any two points on the sheet. Nonetheless, if you do have two points on the sheet, you can measure the distance between them. And if there is a uniform magnetic field parallel to the sheet, then you can even measure the bearing from one point to another. Thus, given any point $P$ on the sheet, you can uniquely describe every other point on the sheet by its distance and bearing from $P$; and conversely, given any distance and bearing, there is a point with that distance and bearing from $P$. This is the situation that the notion of a 2-dimensional affine space is an abstraction of.

Now suppose we have marked a point $O$ on the sheet. Then we can "add" points $P$ and $Q$ on the sheet by drawing the usual parallelogram diagram. The result $P + Q$ of the "addition" depends on the choice of $O$ (and, of course, $P$ and $Q$), but nothing else. This is what the notion of a 2-dimensional vector space is an abstraction of.
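To make the dependence on $O$ explicit, here is a short worked computation. The notation $P +_O Q$ for "the sum formed with $O$ as the marked point" is introduced here for illustration and is not used above. The parallelogram construction amounts to $$P +_O Q \;=\; O + (P - O) + (Q - O) \;=\; P + (Q - O),$$ where $P-O$ and $Q-O$ are the displacements (distance and bearing) from $O$. With a different marked point $O'$ we get $P +_{O'} Q = P + (Q - O')$, and therefore $$(P +_O Q) - (P +_{O'} Q) \;=\; (Q - O) - (Q - O') \;=\; O' - O .$$ So the two "sums" agree only when $O' = O$: moving the marked point moves the result by the same displacement in the opposite direction.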

0
On

Vector spaces and affine spaces are abstractions of different properties of Euclidean space. Like many abstractions, once abstracted they become more general.

A vector space abstracts linearity/linear combinations. This involves the concept of a zero, scaling things up and down, and adding them to each other.

An affine space abstracts affine combinations, that is, linear combinations whose coefficients sum to 1. You can think of an affine combination as a weighted average; if you limit the coefficients to be between 0 and 1, the combinations of a set of points fill out its convex hull.

As it turns out, you do not need a zero, nor do you need the concept of "scaling", nor do you need full on addition, in order to have a concept of weighted average and convex hull within a space.
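A short computation shows why the weights must sum to one. Suppose, purely for the sake of calculation, that we pick a point $o$ and define the weighted combination of points $p_1,\dots,p_n$ with weights $a_1,\dots,a_n$ as $o + \sum_i a_i (p_i - o)$. Picking a different point $o'$ instead changes the answer by $$\Big(o + \sum_{i=1}^n a_i (p_i - o)\Big) - \Big(o' + \sum_{i=1}^n a_i (p_i - o')\Big) \;=\; \Big(1 - \sum_{i=1}^n a_i\Big)(o - o').$$ This vanishes for every choice of $o$ and $o'$ exactly when $\sum_i a_i = 1$. So a weighted average is well defined without any zero, whereas a plain sum ($a_1 = a_2 = 1$) or a scaling ($a_1 = 3$) is not.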

Now, you can take your affine space $\mathbb {A}$ , pick any point $o$ from it, and talk about ${\mathbb A}-o$ as a vector space.

Mapping your $n$ dimensional affine space over $\mathbb {R}$ to $\mathbb{R}^n$ is in effect picking a point, and mapping it to a space with more structure than your original affine space. So you end up with the origin $o$ appearing special, but that is an artifact of your mapping.

If you look at the Earth, the lines of longitude have a zero point, but that zero point is arbitrary -- it has no meaning. The lines of longitude are an affine space. We measure them in degrees (or radians), and we have picked a zero, but other than it being useful to agree where the zero is, it isn't a special line.

The space of rotations around a circle, on the other hand, has a zero that is meaningful -- zero means you don't rotate. We measure them as a vector space.

The lines of longitude are measured as rotations away from the arbitrary line we assigned zero. But what matters about them is the ability to say how far apart two longitudes are from each other, not any one line's absolute value.

If we were doing some math and it would be useful to move the zero of longitude, we are free to do so. But if we want to move the zero in the space of rotations (to, say, bending things 90 degrees) we are not nearly as free.

In general, your location is an affine space, as there is no special place, and scaling your location by a factor of 3 makes no sense, and adding two locations makes no sense -- but taking the average of two locations makes sense.

The (directed) distance between locations is a vector space. Saying something is twice as far as another distance makes sense, the "same place" (distance zero) makes sense, and adding two directed distances together makes sense.

And you can pick a spot and describe locations as the directed distance from that particular spot, but the spot picked was arbitrary, and if it would be useful to pick a different spot, you are free to.

0
On

A subset $A$ of a vector space $X$ is affine if for any two points $x,y\in A$, the line $\ell$ through $x$ and $y$ is contained in $A$. That is, $A$ is affine if $$\alpha x+(1-\alpha)y\in A$$ for all $x,y\in A$ and $\alpha\in\mathbb{R}$. This definition is equivalent to the axiomatic ones which may obscure the idea at first reading.

Example 1. The set of solutions in $\mathbb{C}^m$ to the equation $Ax=b$, where $A$ is a complex $n\times m$ matrix and $b\in\mathbb{C}^n$, is affine.

Example 2. A line in any vector space is affine.

Example 3. The intersection of affine sets in a vector space $X$ is also affine. Given a set $C\subset X$, the smallest affine set containing $C$, denoted by $\operatorname{aff}( C )$, is the intersection of all affine sets in $X$ that contain $C$. It is easy to check that $$\operatorname{aff}( C )=\Big\{\sum^n_{k=1}\alpha_k x_k: n\in\mathbb{N},x_k\in C,\alpha_k\in\mathbb{R},\sum^n_{k=1}\alpha_k=1\Big\}.$$

0
On

I have read some of the intuitions given for the difference. They are fine, but I would like to suggest a probably simpler explanation, the one I would have liked to read myself. I hope it can help someone:

Assuming you understand what a vector space is, we can define an affine space as follows:

An affine space $A$ is a set of elements (called points) with a difference function. This difference is a binary function, which takes two points $p$ and $q$ (both in $A$) and yields an element (a vector) $v$ of a vector space $V$ (each affine space $A$ comes with one vector space $V$, called the vector space associated to $A$). We write $v=p-q$. Additionally, this difference function must ensure that, for any point $p$ in $A$, it holds that $p-p=0$, where $0$ is the null vector of $V$.

The first difference that occurs to me between an affine space and a vector space is that this definition of an affine space does not mention any origin point (the affine space has none), while each vector space has an origin (the null vector). In geometric terms, this implies that, if $v$ is a nonzero vector, we can define a line as the set of all vectors parallel to $v$, that is, the set of all $tv$ for real $t$. This line always passes through the origin (for $t=0$).

However, a single point in an affine space does not have an associated line as in a vector space. To build a line in an affine space, we need two points, $p$ and $q$. Then the line going through $p$ and $q$ is the set of points $p+t(q-p)$, for every real $t$. This need for two points comes from the lack of an origin in the affine space.
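The formula $p+t(q-p)$ is easy to try out numerically. The sketch below is only an illustration: the function name `line_point` and the sample points are assumptions, and points are stored as coordinate tuples purely for computation.

```python
def line_point(p, q, t):
    """Return the point p + t(q - p) on the line through p and q."""
    return tuple(pi + t * (qi - pi) for pi, qi in zip(p, q))

p = (1, 2)
q = (3, 6)

start = line_point(p, q, 0)      # t = 0 gives p itself
end = line_point(p, q, 1)        # t = 1 gives q
middle = line_point(p, q, 0.5)   # t = 1/2 gives the midpoint

# Relabelling every point by the same shift (a change of coordinate origin)
# relabels the midpoint by that same shift, so the construction uses only
# the difference q - p and never a distinguished origin.
shift = (10, -7)
p2 = tuple(a + s for a, s in zip(p, shift))
q2 = tuple(a + s for a, s in zip(q, shift))
middle2 = line_point(p2, q2, 0.5)

print(start, end, middle, middle2)
```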

0
On

There’s quite a nice example of an affine space which you are probably familiar with: time. There are two distinct relevant notions of time here, which I’m going to call duration and instant. (In Python, these are called timedelta and datetime, respectively.)

A duration is the length of a period of time, such as ‘this kettle will take 100 s to boil’. These times form a real vector space. You can add them. For example, filling the kettle takes 30 s, so making boiling water takes 130 s.

An instant is a specific moment. For example, now. Or now. We can also represent these with a number, such as Unix time, which is (ignoring leap seconds) the duration between the start of 1970 and now. Instants form an affine space over durations. Now, the choice of 1st January 1970 was totally arbitrary, so it being represented by zero doesn’t really mean much. Adding instants doesn’t make much sense. If I try to add 1st January 1980 to 1st January 1975 I get (roughly) 1st January 1985. That’s meaningless, because if we’d arbitrarily chosen a different origin, we’d have had a different result. So we ban adding instants, and we ban anything else that depends on our arbitrary choice of origin. We can still do some things. We can subtract instants, to get the duration between them, and we can use this to take linear combinations whose coefficients sum to one. This lets us do things like find the instant halfway between two other instants. But adding them or scaling them isn’t a meaningful operation.
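Python's standard `datetime` module enforces these rules, so the point can be checked directly. In the sketch below the helper names `fake_sum` and `midpoint` and the alternative origin of 1st January 1960 are assumptions made for illustration.

```python
from datetime import datetime, timedelta

a = datetime(1975, 1, 1)
b = datetime(1980, 1, 1)

gap = b - a                          # instant - instant = duration
later = a + timedelta(seconds=130)   # instant + duration = instant

try:
    a + b                            # instant + instant is rejected
    instants_addable = True
except TypeError:
    instants_addable = False

def fake_sum(x, y, origin):
    """'Add' two instants by measuring both from an arbitrary origin."""
    return origin + (x - origin) + (y - origin)

def midpoint(x, y, origin):
    """Average two instants (coefficients 1/2 + 1/2 = 1) via an origin."""
    return origin + ((x - origin) + (y - origin)) / 2

unix_origin = datetime(1970, 1, 1)
other_origin = datetime(1960, 1, 1)  # assumed alternative origin

# The "sum" depends on the origin; the midpoint does not.
print(fake_sum(a, b, unix_origin), fake_sum(a, b, other_origin))
print(midpoint(a, b, unix_origin), midpoint(a, b, other_origin))
```

The two "sums" differ, while the two midpoints coincide, which is why only the latter counts as a meaningful operation on instants.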