In (linear) algebra, why are linear mappings identified with matrix multiplication from **the left**?

One thing about algebra has bothered me for a long time, and it seems more and more unreasonable as I learn more advanced topics in mathematics. It all boils down to how linear transformations of $\mathbb{C}^n$ are represented by matrix multiplication from the left.

Why is this the standard, rather than representing linear mappings by matrix multiplication from the right? The more I think about it, the less sense it makes. For instance, say that the two matrices $A$ and $B$ represent two linear transformations $L_A$ and $L_B$ of $\mathbb{C}^n$, respectively. Then the transformation $L_A\circ L_B$ is represented by the matrix $AB$.

In my experience, the most intuitive way of interpreting the composition $L_A\circ L_B$ in words would be "$L_A$ followed by $L_B$". This would also be the exact meaning of the expression if we think of $L_A$ and $L_B$ as operators acting from the right. But with the interpretation of $L_A$ and $L_B$ as operators acting from the left, it is supposed to be interpreted as "$L_B$ followed by $L_A$".
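
To make the two readings concrete, here is a short sketch. It assumes that $x$ is a column vector, that $x^{T}$ is the corresponding row vector, and that under the right-interpretation a map represented by $A$ acts as $x^{T}\mapsto x^{T}A^{T}$ (this transposed setup is an assumption of the sketch, not something fixed above). With the left-interpretation,

$$(L_A\circ L_B)(x) = A(Bx) = (AB)x,$$

so $B$ acts first. With the right-interpretation, "$L_A$ followed by $L_B$" becomes

$$(x^{T}A^{T})B^{T} = x^{T}(A^{T}B^{T}) = x^{T}(BA)^{T},$$

so the matrices appear in the order in which the maps are applied.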

I can think of two reasons why the left-interpretation is the standard. The first is that it could be an artifact of high school mathematics, where a function $f$ applied to $x$ is written $f(x)$ rather than $(x)f$ and read as "$f$ of $x$". In that setting the notation makes sense, because a composition $(g\circ f) (x)$ is read as "$g$ of $f$ of $x$". The second is an advantage of the left-interpretation in linear algebra itself: in the study of systems of linear equations, it is easier to identify the system with a matrix equation if we use column vectors instead of row vectors.

However, when things get more abstract and we get more interested in the mappings themselves, it seems to me that the right-interpretation is just superior. For instance, if we write $f_{A,B}$ to mean a mapping from $A$ to $B$, then if we use the right-interpretation it is trivial to see that

$f_{A,B}\circ g_{B,C}\circ h_{C,D}$

is a valid composition (by simply looking at the sequence $A$ to $B$, $B$ to $C$, $C$ to $D$). In contrast, if we try to write the same composition with the left-interpretation, it becomes

$h_{C,D}\circ g_{B,C}\circ f_{A,B}$,

which takes far longer to verify as a valid composition, especially as the number of mappings in the composition increases.

So, why isn't the right-interpretation of morphisms more standard in more abstract topics, where the morphisms themselves are the subject of interest? It seems to me that notation should always strive to be as intuitively clear as possible, so that it becomes easier to think about the core concepts under discussion rather than being confused by suboptimal notation.

There is 1 best solution below

I was told that writing $g\circ f$ for the composite of $f$ followed by $g$ indeed stems from the notation with arguments, $g(f(x))$. As a lot of abstract mathematics still works with sets and their elements, I think it is good to stick with this notation. I didn't know your argument about identifying systems of linear equations with matrix equations, but I think it is a very strong argument in favor of this notation as well.

In category theory (which is all about morphisms and neglecting elements) there is some tendency to work with diagrammatic composition, i.e. writing $f\circ g$ for $f$ followed by $g$, because it makes comparisons between formulas and commutative diagrams easier. However, I feel that formulas should be replaced by commutative diagrams in category theory as much as possible, so the question of whether $g\circ f$ or $f \circ g$ is the right way to write composites becomes less important. I mean, why not just write the composite as $$A \xrightarrow{f} B \xrightarrow{g} C?$$

Finally, there is a nice workaround for the problem regarding subindices: just write them in the wrong direction as well! Writing $$g_{CB} \circ f_{BA} = (g \circ f)_{CA}$$ is as easy to read as $$f_{AB} \circ g_{BC} = (f \circ g)_{AC}.$$ When I learned linear algebra, my professor used the notation $M_{BA}$ for the change of basis from $A$ to $B$, which fit very well with matrix multiplication from the left and the non-diagrammatic way of writing composition.
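
As a rough sketch of why the reversed subscripts fit left multiplication: suppose $A$, $B$ and $C$ are bases of the same space, and write $[v]_A$ for the coordinate column of a vector $v$ in the basis $A$ (this symbol is an assumption of the sketch, not notation from my course). Then $[v]_B = M_{BA}[v]_A$, and so

$$[v]_C = M_{CB}[v]_B = M_{CB}M_{BA}[v]_A, \qquad\text{hence}\qquad M_{CA} = M_{CB}M_{BA}.$$

The two inner indices sit next to each other and cancel, exactly as in $g_{CB} \circ f_{BA} = (g \circ f)_{CA}$.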