Affine 3D transformations can be expressed in homogeneous coordinates by a matrix $M \in \mathbb{R}^{4 \times 4}$. This means we have 16 parameters to calculate.
The first thing I asked myself is how many 3D points we need to define such a transformation.
Each point has 3 coordinates and thus each point gives 3 equations. I thought we would need
$$\left \lceil \frac{\overbrace{3 \cdot 3}^{\text{linear}} + \overbrace{3}^{\text{translation}}}{\underbrace{3}_{\text{equations per point}}} \right \rceil = 4$$
points to define $M$. However, I also realize we are speaking of a $4 \times 4$ matrix having 16 entries, not 12. I guess the remaining 4 entries are for the projection?
I thought the structure of the matrix was
$$\begin{pmatrix}A & t\\ \vec 0 & 1\end{pmatrix}$$
where $A \in \mathbb{R}^{3 \times 3}$ is a linear transformation, $t \in \mathbb{R}^{3 \times 1}$ is a translation, and $\vec 0 \in \mathbb{R}^{1 \times 3}$ is a zero vector. But this would not project anything. I guess for a projection the $\vec 0$ is replaced by something different? Is there a simple way to describe it?
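For the affine case, the counting can be checked numerically. The following is a minimal sketch of my own (not from any lecture), assuming NumPy; the function name `affine_from_points` and the example values for `A` and `t` are made up for illustration. It recovers the 12 unknowns from 4 point correspondences, which only works if the 4 source points are not coplanar.

```python
import numpy as np

def affine_from_points(src, dst):
    """Recover the 4x4 affine matrix M with M @ [p, 1] = [q, 1]
    from 4 point pairs (p, q). Hypothetical helper for illustration."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    # Stack the homogeneous points as columns: both P and Q are 4x4.
    P = np.vstack([src.T, np.ones(len(src))])
    Q = np.vstack([dst.T, np.ones(len(dst))])
    # M P = Q  <=>  P^T M^T = Q^T; P is invertible iff the 4 source
    # points are not coplanar.
    return np.linalg.solve(P.T, Q.T).T

# Example values (assumptions): a linear part A and a translation t.
A = np.array([[2.0, 0.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 3.0]])
t = np.array([1.0, 2.0, 3.0])

src = np.array([[0.0, 0.0, 0.0],
                [1.0, 0.0, 0.0],
                [0.0, 1.0, 0.0],
                [0.0, 0.0, 1.0]])
dst = src @ A.T + t  # images of the 4 points under x -> A x + t

M = affine_from_points(src, dst)
print(M)  # upper-left 3x3 block is A, last column holds t, last row is (0, 0, 0, 1)
```

Because the last coordinate of every source and target point is 1, the last row of the recovered matrix comes out as $(\vec 0, 1)$ automatically, so only the upper $3 \times 4$ block carries information.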
Now I have read the slides of a computer graphics lecture, which say:
> How many points define a 3D transformation uniquely?
> $3 \times 4$ entries of a $4 \times 4$ matrix (linear part and translation) resulting in 12 unknowns, hence 4 points
> [...]
> A projection in 3D is defined by a mapping of 5 points in $\mathbb{R}^3$
> - $4 \times 4$ matrix, but homogeneous coordinates are invariant to scaling
> - $5 \times 3 = 4 \times 4 - 1$
Source: German slides by Prof. Dr.-Ing. Carsten Dachsbacher for Computer Graphics. (I translated them for this question.)
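To check my reading of the "invariant to scaling" bullet, here is a small sketch, again assuming NumPy. The helper `apply_homogeneous` and the example matrix are my own assumptions, not taken from the slides. It applies a $4 \times 4$ matrix with a non-trivial last row to a 3D point, divides by the last coordinate, and then repeats this with the whole matrix multiplied by a nonzero scalar.

```python
import numpy as np

def apply_homogeneous(M, p):
    """Apply a 4x4 matrix to a 3D point and divide by the last coordinate.
    Hypothetical helper for illustration."""
    x = M @ np.append(np.asarray(p, dtype=float), 1.0)
    return x[:3] / x[3]

# Example matrix (an assumption): identity except for a non-zero entry
# in the last row, so the division by the last coordinate matters.
M = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.5, 1.0]])
p = [1.0, 2.0, 2.0]

a = apply_homogeneous(M, p)
b = apply_homogeneous(7.0 * M, p)  # same matrix, scaled by 7

print(a, b, np.allclose(a, b))
```

If I understand it correctly, both calls give the same 3D point, so $M$ and $\lambda M$ with $\lambda \neq 0$ describe the same mapping, and one of the 16 entries is not a real degree of freedom.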
Now I'm confused. Where does the "5" in the slides come from? Why $4 \times 4 - 1$?