I would like to ask a question about affine matrix transformations, specifically for a mathematical explanation of why these two ways of interpreting them are equivalent.
Suppose I have a translation matrix T (2D, for simplicity) and a rotation matrix R as follows:
$$ T = \begin{bmatrix} 1 & 0 & tx \\ 0 & 1 & ty \\ 0 & 0 & 1 \end{bmatrix}, \quad R = \begin{bmatrix} \cos \theta & -\sin \theta & 0 \\ \sin \theta & \cos \theta & 0 \\ 0 & 0 & 1 \end{bmatrix} $$
Suppose: $$ tx = 0 , ty = 2, \theta = 45^{ \circ } $$
Now if I transform a quad by applying the translation and the rotation to every vertex in this order: R * T * Vertex (with Vertex a column vector), I obtain this:


Here both the translation and the rotation are performed relative to the origin, applying T first and R afterwards. However, this is equivalent to applying R first and T afterwards, with each transform applied relative to the quad itself (its own local axes) rather than to the origin:

The result is the same. Why is this equivalent?
It depends on the convention. Usually $(f*g)(x)$ is read as $f(g(x))$ when we talk about functions, operators, etc., but you can read it whichever way you like, as long as you state your convention.
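To make the two readings concrete, here is a minimal numerical sketch, not a definitive treatment. It relies on the identity $R\,T\,v = R(v + t) = Rv + Rt$: translating by $t$ and then rotating about the origin gives the same point as rotating first and then translating by $Rt$, that is, by $t$ measured along the rotated (local) axes. The helper names (`matmul`, `apply`, `translation`, `rotation`, `close`) and the unit quad are assumptions made for illustration; only $tx = 0$, $ty = 2$, $\theta = 45^\circ$ come from the question.

```python
import math

def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(m, v):
    """Apply a 3x3 homogeneous matrix to a 2D point (x, y)."""
    col = (v[0], v[1], 1.0)
    out = [sum(m[i][k] * col[k] for k in range(3)) for i in range(3)]
    return (out[0], out[1])

def translation(tx, ty):
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]

def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def close(p, q, eps=1e-9):
    """Compare two 2D points up to floating-point error."""
    return all(abs(a - b) <= eps for a, b in zip(p, q))

# Values from the question: tx = 0, ty = 2, theta = 45 degrees.
T = translation(0.0, 2.0)
R = rotation(math.radians(45))
RT = matmul(R, T)

# Hypothetical unit quad centred on the origin (not given in the question).
quad = [(-0.5, -0.5), (0.5, -0.5), (0.5, 0.5), (-0.5, 0.5)]

# Reading 1 (global axes): translate by t, then rotate about the origin.
global_reading = [apply(R, apply(T, v)) for v in quad]

# Reading 2 (local axes): rotate first, then translate by R*t,
# i.e. by t expressed along the already-rotated axes.
t_rotated = apply(R, (0.0, 2.0))  # R has no translation part, so this is R*t
T_local = translation(*t_rotated)
local_reading = [apply(T_local, apply(R, v)) for v in quad]

same = all(close(p, q) for p, q in zip(global_reading, local_reading))
print(same)
```

In matrix form this is the statement $R\,T = (R\,T\,R^{-1})\,R$, where $R\,T\,R^{-1}$ is the translation by $Rt$: the same product can be read right to left in fixed global axes, or left to right in axes that move with the object.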