Why are homogenous coordinates needed in image projection?

Question

Why are homogenous coordinates needed in image projection?

5.8k Views Asked by Bumbble Comm At 03 Apr 2026 - 10:25

The above image shows how a 3D object is projected onto a 2D image by a camera. Which makes perfect sense to me.

However it's then said that division by z is non linear (why?), so homogenous coordinates instead of euclidean coordinates should be used instead. Why does this "trick" help?

When transforming from 3D object space to homogenous coordinates, the final coordinate and final column in the transformation matrix isn't even used so why is it necessary?

Original Q&A

There are 2 best solutions below

**Bumbble Comm** · Answer 1 · 2015-12-10 22:59:45

The first transformation (in red) is non-linear. Note that you cannot multiply a matrix with $(x_s, y_s, z_s)$ to get $(x_i, y_i) = (f \frac{x_s}{z_s}, f \frac{y_s}{z_s})$. However, if you use a homogeneous coordinate system, then you can represent such transformation as linear function (the matrix product in the question colored in green).

Note that in homogeneous coordinate system $\forall a \neq 0, (ax, ay, a)$ refers to the same point and is represented by $(x, y, 1)$. So, the matrix product in green transforms the point $(x_s, y_s, z_s)$ to ... $$ \left( \begin{array}{c} u = f x_s \\ v = f y_s \\ w = z_s \end{array} \right) = \left( \begin{array}{c} f \frac{x_s}{z_s} \\ f \frac{y_s}{z_s} \\ 1 \end{array} \right) $$

**Bumbble Comm** · Answer 2 · 2016-01-05 08:42:22

It isn’t so much that homogeneous coordinates are needed. Rather, they make computation simpler by making many important transformations linear. They can then all be represented as $4\times4$ matrices and composition is just matrix multiplication. It’s unfortunate that this source presents the use of homogeneous coordinates as a mere trick since there’s a natural connection between them and projective geometry.

Even before projective transformations were introduced, you probably already had to deal with rotation, translation and scaling. For instance, the object-to-world and world-to-camera coordinate transformations often involve rotations and translations. Although rotations about the origin in $\mathbb R^3$ are linear transformations and so can be represented as $3\times 3$ matrices, neither translations nor rotations about an arbitrary point (which are just a composition of a rotation about the origin and a couple of translations) can be. By introducing homogeneous coordinates, you can represent all affine transformations of $\mathbb R^3$ as $4\times4$ matrices. This includes all of the linear transformations that were possible with $3\times3$ matrices as well as translations (and combinations thereof). For example, a translation by $(\Delta x,\Delta y,\Delta z)$ is represented by the matrix $$ \pmatrix{1&0&0&\Delta x \\ 0&1&0&\Delta y \\ 0&0&1&\Delta z \\ 0&0&0&1}. $$

Moving to perspective projection, we place the center of projection—pinhole or eye—at the origin in camera coordinates, locate the image plane perpendicular to the $z$-axis at $(0,0,f)$, with $f<0$, and direct the line of sight along the negative $z$-axis. (This arrangement keeps the image plane coordinate system right-handed.) The ray to a point $P$ then intersects the image plane at the point $\left(f\frac{P_x}{P_z},f\frac{P_y}{P_z},f\right)=\frac f{P_z}\left(P_x,P_y,P_z\right)$. This is just a form of scaling transformation, but unfortunately it’s non-linear because the scale factor depends on the point’s $z$-coordinate. That is, if you tried to write this transformation as a $3\times 3$ matrix, it’d be something like $$\pmatrix{\frac f{P_z}&0&0 \\ 0&\frac f{P_z}&0 \\ 0&0&\frac f{P_z}},$$ but that doesn’t give you a linear transformation because the entries of this matrix aren’t constants—one of the coordinates of the point being transformed appears in it.

Notice something interesting here: Homogeneous coordinates have the property that the tuples $(x_1, x_2, \dots, 1)$ and $(\alpha x_1, \alpha x_2, \dots, \alpha)$ represent the same point (for $\alpha\ne0$). If you interpret the coordinates of a point in $\mathbb R^3$ as the homogeneous coordinates of a point in the image plane, then the perspective projection is just the identity map! This is no coincidence. When you use homogeneous coordinates, you’ve actually moved from the Euclidean space $\mathbb R^{n+1}$ to the projective space $\mathbb P^n(\mathbb R)$ by identifying points in $\mathbb R^{n+1}$ that lie on the same line through the origin (minus the origin itself) with each other. I.e., all of the points that have the same projection are an equivalence class that is a single point in the projective space.

Taking advantage of this scaling property, we can write the coordinates of the projected point on the image plane as the homogeneous coordinates $\left(P_x,P_y,P_z,\frac{P_z}f\right)$, which can be obtained from $P=(P_x,P_y,P_z,1)$ by applying the linear transformation given by the matrix:$$ \pmatrix{ 1&0&0&0 \\ 0&1&0&0 \\ 0&0&1&0 \\ 0&0&\frac1f&0 }. $$ The upper-left submatrix is indeed the identity map, as noted above.

Using this representation of perspective projection, you can continue to combine all of the necessary image transformations by simply multiplying their matrices together. This representation has other advantages besides. This homogeneous perspective matrix can be modified slightly so that the transformed $z$-coordinate encodes depth for $z$-buffering. Also, since you can multiply by an arbitrary scale factor without changing its effect, it can be tweaked to make the view volume after projection a cube with sides at $-1$ and $+1$, which makes clipping easy. Historically, when floating-point operations were very expensive compared to integer operations, homogeneous coordinates allowed you to use integer (really rational) arithmetic by carrying a denominator along in the last coordinate. This isn’t as important any more, although being able to rescale the coordinates can still be handy.

Why are homogenous coordinates needed in image projection?

There are 2 best solutions below

Related Questions in COORDINATE-SYSTEMS

Related Questions in LINEAR-TRANSFORMATIONS

Trending Questions

Popular # Hahtags

Popular Questions