Showing matrix multiplication is associative via linear mappings.

Exercise.

Prove that matrix multiplication is associative. In other words, suppose $A, B$, and $C$ are matrices whose sizes are such that $(AB)C$ makes sense. Explain why $A(BC)$ makes sense and prove that $$(AB)C = A(BC)$$


Source.

Linear Algebra Done Right, Sheldon Axler, 4th edition.


Comments.

Axler suggests that the proof can be made clean by essentially doing away with the matrices themselves. I'm assuming what he means is to introduce linear maps somehow, rather than showing the entries of $(AB)C$ equal the entries of $A(BC)$, as this solution does: https://math.stackexchange.com/a/3240932/645756

That said, the above solution seems clean enough to me, so I'm not sure why this exercise warrants a "cleaner" proof.

Anyway, I thought at first that the proof would be exactly the one in "Proving the associativity of Matrix Multiplication". However, the exercise I have posted above appears in a section before Axler introduces bijections, which are needed to assert that compositions of linear maps correspond one-to-one to matrix products.

Thus, the only thing that comes to mind is to use the following result from the section this exercise comes from: $$\mathcal{M}(ST) = \mathcal{M}(S)\mathcal{M}(T)$$

where $T$ is a linear mapping from a finite-dimensional vector space $U$ to a finite-dimensional vector space $V$, $S$ is a linear mapping from a finite-dimensional vector space $V$ to a finite-dimensional vector space $W$, and $\mathcal{M}(T)$, $\mathcal{M}(S)$ and $\mathcal{M}(ST)$ represent the matrices of the linear mappings $T, S$ and $S \circ T$ respectively.

My proof below reflects this idea; the proof assumes the knowledge that linear mapping composition is associative. I omit explaining why $A(BC)$ makes sense as that is not the subject of my comments nor my questions below.
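For completeness, the omitted size check can be sketched as follows. The size symbols $m, n, p, q$ are my own assumption and are not part of the exercise; $\mathbf{F}^{m,n}$ denotes the set of $m$-by-$n$ matrices. Since $(AB)C$ makes sense, the number of columns of $A$ equals the number of rows of $B$, and the number of columns of $AB$ (which is the number of columns of $B$) equals the number of rows of $C$.

```latex
% Assumed sizes (m, n, p, q are illustrative symbols, not from the exercise):
% A is m-by-n, B is n-by-p, C is p-by-q.
\begin{align*}
A \in \mathbf{F}^{m,n},\; B \in \mathbf{F}^{n,p}
  &\implies AB \in \mathbf{F}^{m,p}, \\
AB \in \mathbf{F}^{m,p},\; C \in \mathbf{F}^{p,q}
  &\implies (AB)C \in \mathbf{F}^{m,q}, \\
B \in \mathbf{F}^{n,p},\; C \in \mathbf{F}^{p,q}
  &\implies BC \in \mathbf{F}^{n,q}, \\
A \in \mathbf{F}^{m,n},\; BC \in \mathbf{F}^{n,q}
  &\implies A(BC) \in \mathbf{F}^{m,q}.
\end{align*}
```

So $A(BC)$ is defined and has the same size as $(AB)C$, which is the minimum needed before comparing the two products.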


My Proof.

Given matrices $A, B, C$, let $\mathcal{M}(S)$ correspond to matrix $A$, $\mathcal{M}(T)$ correspond to matrix $B$, and $\mathcal{M}(E)$ correspond to matrix $C$ such that $\mathcal{M}(S)$, $\mathcal{M}(T)$ and $\mathcal{M}(E)$ adhere to the following definition: [image: definition of the matrix of a linear map, $\mathcal{M}(T)$]

and such that $(\mathcal{M}(S)\mathcal{M}(T))\mathcal{M}(E)$ is well-defined. Then we have:

\begin{align}
(AB)C &= (\mathcal{M}(S)\mathcal{M}(T))\mathcal{M}(E) \tag{1} \\
&= (\mathcal{M}(ST))\mathcal{M}(E) \tag{2} \\
&= \mathcal{M}(ST)\mathcal{M}(E) \tag{3} \\
&= \mathcal{M}((ST)E) \tag{4} \\
&= \mathcal{M}(S(TE)) \tag{5} \\
&= \mathcal{M}(S)\mathcal{M}(TE) \tag{6} \\
&= \mathcal{M}(S)(\mathcal{M}(TE)) \tag{7} \\
&= \mathcal{M}(S)(\mathcal{M}(T)\mathcal{M}(E)) \tag{8} \\
&= A(BC)
\end{align}

as was to be shown.
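One way the correspondence between $A, B, C$ and linear maps might be made explicit is sketched below. This is only an illustration under my own assumptions, not necessarily the route Axler intends: I assume $A$ is $m$-by-$n$, $B$ is $n$-by-$p$, $C$ is $p$-by-$q$, and that each space $\mathbf{F}^k$ is given its standard basis. Each map exists because a linear map is determined by freely chosen values on a basis.

```latex
% Assumptions (not in the original proof): A is m-by-n, B is n-by-p, C is p-by-q,
% and e_1, ..., e_k denotes the standard basis of whichever F^k is in play.
\begin{align*}
E &\colon \mathbf{F}^q \to \mathbf{F}^p, & E e_k &= \sum_{j=1}^{p} C_{j,k}\, e_j \quad (k = 1, \dots, q), \\
T &\colon \mathbf{F}^p \to \mathbf{F}^n, & T e_k &= \sum_{j=1}^{n} B_{j,k}\, e_j \quad (k = 1, \dots, p), \\
S &\colon \mathbf{F}^n \to \mathbf{F}^m, & S e_k &= \sum_{j=1}^{m} A_{j,k}\, e_j \quad (k = 1, \dots, n).
\end{align*}
% Reading off the coefficients column by column, with respect to the standard bases:
\[
\mathcal{M}(E) = C, \qquad \mathcal{M}(T) = B, \qquad \mathcal{M}(S) = A .
\]
```

With these choices the compositions $TE$, $ST$, and $S(TE) = (ST)E$ are all defined, and every matrix in the chain of equalities is taken with respect to the same standard bases, which is what the product formula for $\mathcal{M}(ST)$ requires.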


Questions.

  1. I'm concerned I created the correspondence of $A, B, C$ with $\mathcal{M}(S), \mathcal{M}(T), \mathcal{M}(E)$ out of thin air, without describing any bases. Am I right to be concerned about this, or does it suffice to say that they satisfy the definition above and that $(\mathcal{M}(S)\mathcal{M}(T))\mathcal{M}(E)$ is well-defined, as I have done?
  2. Assuming the above is fine, does line $(8)$ really follow from line $(7)$ in my proof? I basically introduced the parentheses in line $(7)$ from line $(6)$ just so I could get the result I want. But is it valid?
  3. Assuming my proof is total garbage, how else does Axler suggest we prove this?