A book on computer graphics (CG) says:
... we can construct any affine transformation from a sequence of rotations, translations, and scalings.
But I don't know how to prove it.
Even in a particular case, I still find it hard. For example, how can one construct a shear transformation from a sequence of rotations, translations, and scalings?
Can you please help? Thank you.
EDIT:
Axis scalings may use different scaling factors for the axes.
Is there a matrix representation or proof for this?
For example, to show that a two-dimensional rotation can be decomposed into three shear transformations, we can write $$ \begin{pmatrix} \cos\alpha & \sin\alpha\\ -\sin\alpha & \cos\alpha \end{pmatrix} = \begin{pmatrix} 1 & \tan\frac{\alpha}{2}\\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0\\ -\sin\alpha & 1 \end{pmatrix} \begin{pmatrix} 1 & \tan\frac{\alpha}{2}\\ 0 & 1 \end{pmatrix} $$
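The three-shear identity above can be checked numerically. The following is only a sketch in plain Python; the helper names (`matmul2`, `rotation_from_shears`, `rotation`, `max_abs_diff`) are made up for illustration and the test angle is arbitrary.

```python
import math

def matmul2(p, q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rotation_from_shears(alpha):
    """Multiply out the three shear factors on the right-hand side."""
    t = math.tan(alpha / 2)
    shear_x = [[1.0, t], [0.0, 1.0]]                 # horizontal shear
    shear_y = [[1.0, 0.0], [-math.sin(alpha), 1.0]]  # vertical shear
    return matmul2(matmul2(shear_x, shear_y), shear_x)

def rotation(alpha):
    """Left-hand side, with the same sign convention as the formula."""
    c, s = math.cos(alpha), math.sin(alpha)
    return [[c, s], [-s, c]]

def max_abs_diff(p, q):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(p[i][j] - q[i][j]) for i in range(2) for j in range(2))

alpha = 0.7  # arbitrary angle in radians; tan(alpha/2) must be finite
print(max_abs_diff(rotation_from_shears(alpha), rotation(alpha)))
```

The printed difference should be at the level of floating-point rounding. The identity breaks down when $\alpha$ is an odd multiple of $\pi$, where $\tan\frac{\alpha}{2}$ is undefined.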
You can write any affine transformation
$$ \vec{x}'=A\vec{x}+\vec{t}\;, $$
where $A$ is any non-singular matrix, as follows:
$$ \left( \begin{array}{c} \vec{x}'\\ 1 \end{array} \right) = \left( \begin{array}{cc} A&\vec{t}\\ 0&1 \end{array} \right) \left( \begin{array}{c} \vec{x}\\ 1 \end{array} \right) \;. $$
This allows you to compose affine transformations by composing the corresponding matrices. In this approach, rotations, translations and axis scalings can respectively be written like this:
$$ \left( \begin{array}{cc} \Omega&0\\ 0&1 \end{array} \right) \;, $$
$$ \left( \begin{array}{cc} I&\vec{t}\\ 0&1 \end{array} \right) \;, $$
$$ \left( \begin{array}{cc} S&0\\ 0&1 \end{array} \right) \;, $$
where $\Omega$ is a rotation matrix, $I$ is the identity matrix and $S$ is a diagonal matrix with the scaling factors on the diagonal.
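As a small illustration of these three block matrices in two dimensions, here is a sketch in plain Python. The function names and the sample numbers are assumptions chosen for the example, and the rotation is taken counterclockwise, which is one possible convention for $\Omega$.

```python
import math

def matmul(p, q):
    """Product of two matrices stored as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def rotation(angle):
    """Block matrix [[Omega, 0], [0, 1]] for a counterclockwise rotation."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def translation(tx, ty):
    """Block matrix [[I, t], [0, 1]]."""
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]

def scaling(sx, sy):
    """Block matrix [[S, 0], [0, 1]] with S = diag(sx, sy)."""
    return [[sx, 0.0, 0.0], [0.0, sy, 0.0], [0.0, 0.0, 1.0]]

def apply(m, point):
    """Apply a 3x3 homogeneous matrix to a 2D point (x, y)."""
    x, y = point
    col = matmul(m, [[x], [y], [1.0]])
    return (col[0][0], col[1][0])

# Scale first, then rotate, then translate: the rightmost factor acts first.
combined = matmul(translation(3.0, -1.0),
                  matmul(rotation(math.pi / 2), scaling(2.0, 0.5)))

p = (1.0, 1.0)
step_by_step = apply(translation(3.0, -1.0),
                     apply(rotation(math.pi / 2), apply(scaling(2.0, 0.5), p)))
print(apply(combined, p), step_by_step)
```

Both printed points should agree up to rounding, which is the sense in which composing the maps corresponds to multiplying the matrices.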
Given any affine transformation specified by $A$ and $\vec{t}$, you can split it into a translation and a linear part:
$$ \left( \begin{array}{cc} A&\vec{t}\\ 0&1 \end{array} \right) = \left( \begin{array}{cc} I&\vec{t}\\ 0&1 \end{array} \right) \left( \begin{array}{cc} A&0\\ 0&1 \end{array} \right) \;. $$
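The order of the two factors in this split matters. A quick check in plain Python, with an arbitrary sample $A$ and $\vec{t}$ (the helper `homogeneous` is a hypothetical name for this sketch):

```python
def matmul(p, q):
    """Product of two matrices stored as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def homogeneous(a, t):
    """Embed the 2D map x -> a x + t as the 3x3 block matrix [[a, t], [0, 1]]."""
    return [[a[0][0], a[0][1], t[0]],
            [a[1][0], a[1][1], t[1]],
            [0.0, 0.0, 1.0]]

identity = [[1.0, 0.0], [0.0, 1.0]]
a = [[2.0, 1.0], [-1.0, 3.0]]  # sample non-singular matrix (determinant 7)
t = [5.0, -4.0]                # sample translation vector

full = homogeneous(a, t)
# Translation on the left, linear part on the right, as in the formula.
split = matmul(homogeneous(identity, t), homogeneous(a, [0.0, 0.0]))
# Swapping the factors translates first, so the offset becomes a t, not t.
swapped = matmul(homogeneous(a, [0.0, 0.0]), homogeneous(identity, t))

print(full == split)    # True
print(full == swapped)  # False
```

With the factors swapped, the last column holds $A\vec{t}$, here $(6, -17)$, instead of $\vec{t}$.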
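So now we just need to be able to write any non-singular matrix as a product of rotations and axis scalings. This is possible due to the singular value decomposition $A=U\Sigma V^T$, where $U$ and $V$ are orthogonal and $\Sigma$ is diagonal with positive entries. If $U$ or $V$ has determinant $-1$, flip the sign of one of its columns together with the sign of the corresponding diagonal entry of $\Sigma$; the product is unchanged and both outer factors become proper rotations. The price is a possibly negative scaling factor, which is unavoidable when $\det A<0$, since rotations and positive scalings all have positive determinant.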
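For the shear asked about in the question, the rotation-scaling-rotation factors can be written down explicitly. The sketch below uses a closed-form decomposition for $2\times 2$ matrices instead of a general SVD routine (in higher dimensions one would normally call a library SVD); the function names are hypothetical, and rotations here are counterclockwise.

```python
import math

def matmul2(p, q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rot(angle):
    """Counterclockwise 2x2 rotation matrix."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s], [s, c]]

def rotate_scale_rotate(m):
    """Return (phi, (sx, sy), theta) with m = rot(phi) * diag(sx, sy) * rot(theta).

    sy is negative exactly when det(m) < 0.
    """
    (a, b), (c, d) = m
    # Split m into a scaled rotation (e, h) and a scaled reflection (f, g).
    e, f = (a + d) / 2, (a - d) / 2
    g, h = (c + b) / 2, (c - b) / 2
    q, r = math.hypot(e, h), math.hypot(f, g)
    a1, a2 = math.atan2(g, f), math.atan2(h, e)
    return (a2 + a1) / 2, (q + r, q - r), (a2 - a1) / 2

shear = [[1.0, 1.0], [0.0, 1.0]]  # unit shear along the x-axis
phi, (sx, sy), theta = rotate_scale_rotate(shear)
rebuilt = matmul2(matmul2(rot(phi), [[sx, 0.0], [0.0, sy]]), rot(theta))

print(phi, (sx, sy), theta)
print(rebuilt)
```

For this unit shear the two scaling factors multiply to $1$, as they must because the shear has determinant $1$, and `rebuilt` should reproduce the shear matrix up to rounding.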