Transforming coordinate system vs objects

In computer graphics it's pretty common to assume the camera is always positioned at the origin and oriented in one direction. In case we want to move the camera closer to an object in the world coordinate system, we instead move the whole world away from the camera. The same thing goes with rotations - instead of rotating the camera counterclockwise, we rotate the world clockwise and so on.

The concept is very intuitive and makes perfect sense, but my question is: how can we prove mathematically (using as little advanced math as possible, just enough to accept it as a proof) that transforming the coordinate system without moving the objects is equivalent to applying the inverse transformation to the objects in this coordinate system without moving the coordinate system?

Best answer

Great question. The trick is to understand what you mean by "equivalent to". Clearly, transforming coordinates doesn't move anything, and moving the object DOES move things, so they're not "equivalent" at the level of, say, physics of moving bodies.

I claim that your notion of "equivalence" is that a picture made in one circumstance -- say a ray-traced picture -- will be the same as the picture made in the other circumstance. And by "the same", I mean "a ray traced through a pixel with certain coordinates will hit the same point of the same object in both cases", where by "the same point", I mean the same intrinsic point: "if the object were a person, for instance, you'd hit the bottom of the left earlobe in both cases", not "you'd hit location (37, 11, 22.4) in world-coordinates".

So suppose that $E$ is the "eye" and $A$ is a pixel of the film-plane, and $P$ is a point of some object in the scene that's seen from $E$ looking through pixel $A$. Let's suppose that $e, a, p$ are the "homogeneous coordinates" for these points in some coordinate system (i.e., they have the form $\begin{bmatrix} x\\y\\z\\1\end{bmatrix}$). Now the assumption that $E$ sees $P$ at $A$ means that $A$ is on the line between $E$ and $P$. In coordinates, it means that the vector $p-e$ is a multiple of $a - e$, say $p - e = \alpha (a-e)$.
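To make that condition concrete, here's a tiny worked example. The numbers are made up purely for illustration; nothing about them comes from the question.

$$ e = \begin{bmatrix} 0\\0\\0\\1\end{bmatrix}, \quad a = \begin{bmatrix} 1\\2\\1\\1\end{bmatrix}, \quad p = \begin{bmatrix} 3\\6\\3\\1\end{bmatrix} \quad\Longrightarrow\quad p - e = \begin{bmatrix} 3\\6\\3\\0\end{bmatrix} = 3 \begin{bmatrix} 1\\2\\1\\0\end{bmatrix} = 3\,(a - e), $$

so here $\alpha = 3$. Notice that the difference of two points has a $0$ in the last slot: it's a direction, not a location.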

With that in mind, let's transform the objects in the world by an affine transformation $T$ (i.e., a translation combined with rotation and scaling). That amounts to multiplying the coordinates by some $4 \times 4$ matrix $\mathbf M$.
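If it helps to see the shape of such a matrix, one common way to write it is in block form. The names $\mathbf L$ (the $3 \times 3$ rotation-and-scaling part) and $\mathbf t$ (the translation) are my own notation here, not anything from the question:

$$ \mathbf M = \begin{bmatrix} \mathbf L & \mathbf t \\ \mathbf 0^T & 1 \end{bmatrix}, \qquad \mathbf M \begin{bmatrix} \mathbf x \\ 1 \end{bmatrix} = \begin{bmatrix} \mathbf L \mathbf x + \mathbf t \\ 1 \end{bmatrix}, \qquad \mathbf M^{-1} = \begin{bmatrix} \mathbf L^{-1} & -\mathbf L^{-1}\mathbf t \\ \mathbf 0^T & 1 \end{bmatrix}. $$

The inverse exists as long as $\mathbf L$ is invertible, which holds for rotations and for scalings by nonzero factors. You can check the formula by multiplying it against $\mathbf M$ and getting the identity.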

Now let's suppose that the eye $E$, looking through pixel $A$, sees the transformed point $S = T(P)$. (Note that the coordinates of $S$ are $s = \mathbf M p$.) Then the reasoning above says that $\mathbf Mp - e$ is a multiple of $a - e$, i.e., $\mathbf Mp - e = \alpha (a - e)$ for some value $\alpha$. Hold that thought. This is a statement about how the coordinates, in the original coordinate system, of $A$, $E$, and $T(P)$ are related.

Let's instead now look at the coordinates of those points in a different coordinate system -- one in which each point's coordinates are its old coordinates multiplied by $\mathbf M^{-1}$. The coordinates $e'$ of $E$ in that new system are $$ e' = \mathbf M^{-1}e, $$ and those for $A$ and $S = T(P)$ are \begin{align} a' &= \mathbf M^{-1}a \\ s' &= \mathbf M^{-1}s \\ &= \mathbf M^{-1} (\mathbf Mp)\\ &= p. \end{align}

The question that we now have to ask is "is $s'-e'$ equal to $\alpha (a' - e')$?" Well,

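\begin{align} s' - e' &= p - \mathbf M^{-1}e \\ &= \mathbf M^{-1} (\mathbf M p - e) \\ &= \mathbf M^{-1} (s - e) \\ &= \mathbf M^{-1} \alpha (a - e) \\ &= \alpha \mathbf M^{-1} (a - e) \\ &= \alpha (a' - e'), \end{align}

and we're done. The only facts used are that $\mathbf M^{-1} \mathbf M p = p$ and that matrix multiplication distributes over subtraction and commutes with scalars. In the new coordinates the ray from $e'$ through $a'$ hits $p$, so moving the eye and film plane by $\mathbf M^{-1}$ while leaving the object alone gives the same picture as moving the object by $\mathbf M$.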
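If you'd like to see the argument hold numerically, here's a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the helper names (`apply`, `sub`, `affine`, `affine_inverse`), the particular rotation-about-$z$, uniform scale and translation used to build $\mathbf M$, and the sample points. It isn't part of the proof, just a sanity check of it.

```python
import math

def apply(M, v):
    """Multiply a 4x4 matrix (list of rows) by a homogeneous 4-vector."""
    return [sum(M[i][k] * v[k] for k in range(4)) for i in range(4)]

def sub(u, v):
    """Componentwise difference u - v."""
    return [x - y for x, y in zip(u, v)]

def affine(angle, scale, t):
    """4x4 matrix: rotate about z by `angle`, scale uniformly, then translate by t."""
    c, s = scale * math.cos(angle), scale * math.sin(angle)
    return [[c, -s, 0.0, t[0]],
            [s, c, 0.0, t[1]],
            [0.0, 0.0, scale, t[2]],
            [0.0, 0.0, 0.0, 1.0]]

def affine_inverse(angle, scale, t):
    """Inverse of affine(angle, scale, t): linear part L^-1, translation -L^-1 t."""
    c, s = math.cos(angle) / scale, math.sin(angle) / scale
    L_inv = [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0 / scale]]
    t_inv = [-sum(L_inv[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [L_inv[i] + [t_inv[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

# Illustrative (made-up) transformation M and its inverse.
M = affine(0.7, 2.0, (1.0, -3.0, 0.5))
M_inv = affine_inverse(0.7, 2.0, (1.0, -3.0, 0.5))

e = [1.0, 2.0, 3.0, 1.0]    # eye E
p = [4.0, -1.0, 2.0, 1.0]   # point P on the object, before it is moved
s = apply(M, p)             # S = T(P), the moved point
alpha = 4.0
# Pick the pixel A on the line from E to S, so that s - e = alpha * (a - e).
a = [ei + (si - ei) / alpha for ei, si in zip(e, s)]

# Coordinates of the same three points in the new coordinate system.
e2, a2, s2 = apply(M_inv, e), apply(M_inv, a), apply(M_inv, s)

# s' should equal p, and s' - e' should equal alpha * (a' - e').
err_point = max(abs(x - y) for x, y in zip(s2, p))
err_ray = max(abs(d - alpha * f) for d, f in zip(sub(s2, e2), sub(a2, e2)))
print(err_point < 1e-9, err_ray < 1e-9)
```

The comparison uses a small tolerance rather than exact equality because the rotation entries are floating-point numbers. The same $\alpha$ shows up on both sides, which is exactly what the last chain of equalities says.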