Reason behind the condition for multiplication of matrices


Why is the product of two matrices defined only when the number of columns in $A$ is equal to the number of rows in $B$?


The product of matrices $BA$ is only well defined when the number of columns of $B$ equals the number of rows of $A$, because that is how matrix multiplication is defined. The real question, then, is why matrix multiplication is defined this way.

If we have an $n \times m$ matrix $A$ (with entries in $\mathbb{R}$), then we can define a linear transformation $T: \mathbb{R}^m \rightarrow \mathbb{R}^n$ by $T(\textbf{x}) = A\textbf{x}$.

Likewise, if we have an $l \times n$ matrix $B$, then we can define a linear transformation $S: \mathbb{R}^n \rightarrow \mathbb{R}^l$ by $S(\textbf{x}) = B\textbf{x}$.

Of course, we can compose these linear transformations to get another linear transformation $S \circ T: \mathbb{R}^m \rightarrow \mathbb{R}^l$. This composition is well defined precisely because the codomain of $T$ equals the domain of $S$ (both are $\mathbb{R}^n$).

But $(S \circ T)(\textbf{x})=S(A\textbf{x})=B(A\textbf{x})=(BA)\textbf{x}$.

We can now see that the product $BA$ only makes sense when the number of columns of $B$ (the dimension of the domain of $S$) equals the number of rows of $A$ (the dimension of the codomain of $T$).

Matrix multiplication was defined the way it is precisely so that multiplying matrices corresponds to composing the associated linear transformations.
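The correspondence above can be checked numerically. Here is a minimal sketch in plain Python (no libraries); the helper names `apply` and `matmul` are illustrative, not taken from the answer:

```python
def apply(M, x):
    """Apply the linear map x -> Mx, with M given as a list of rows."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]

def matmul(B, A):
    """Product BA; requires columns of B == rows of A."""
    assert len(B[0]) == len(A), "columns of B must equal rows of A"
    cols_of_A = list(zip(*A))
    return [[sum(b * a for b, a in zip(row, col)) for col in cols_of_A]
            for row in B]

# T: R^3 -> R^2 given by the 2x3 matrix A,
# S: R^2 -> R^2 given by the 2x2 matrix B.
A = [[1, 2, 3],
     [4, 5, 6]]
B = [[1, 0],
     [1, 1]]
x = [1, 1, 1]

lhs = apply(B, apply(A, x))   # S(T(x)): compose the maps
rhs = apply(matmul(B, A), x)  # (BA)x: multiply first, then apply
print(lhs, rhs)               # the two agree
```

Note that `matmul(A, B)` would fail the dimension check here, which is exactly the point: composing the maps in the other order is not defined.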


The answer provided by Winged Reptile is well done ($+1$). But allow me to say something more.

Sometimes, in a Linear Algebra course, one starts by studying the algebra of matrices before studying what a linear transformation is (as was the case for me).

Of course, once one starts to study linear transformations, everything becomes clearer. But if you haven't yet studied linear transformations, you can think as follows.

Given two vectors of dimension $n$, you can compute the inner product of those two vectors. For example, if $u = (u_1,u_2,\dots,u_n)$ and $v=(v_1,v_2,\dots,v_n)$, with the usual inner product, you get

\begin{align*} u \cdot v = u_1 v_1 + u_2 v_2 + \dots + u_n v_n = \sum_{i=1}^n u_iv_i. \end{align*}

Hence, you can think of the product of two matrices as a generalization of this inner product. The idea is that one computes the inner product of each row vector of the first matrix with each column vector of the second matrix.

Since the inner product is only defined for vectors of the same dimension, we require that the number of columns of the first matrix be equal to the number of rows of the second matrix.

Once again, I point out that the deeper explanation is that, defined this way, multiplying two matrices computes the composition of the linear transformations associated with them. But if you haven't yet heard about linear transformations, this is a nice way to get a sense of the definition of the matrix product.