Why is matrix multiplication defined a certain way?


Why is it that when multiplying a (1x3) by (3x1) matrix, you get a (1x1) matrix, but when multiplying a (3x1) matrix by a (1x3) matrix, you get a (3x3) matrix? Why is matrix multiplication defined this way?

Why can't a (1x3) by (3x1) yield a (3x3), or a (3x1) by (1x3) yield a (1x1)? I really would like to get to the root of this problem or 'axiomatization'. Thanks.


There are 2 best solutions below

4
On BEST ANSWER

The idea is that a matrix represents a linear map of finite-dimensional vector spaces. A (3x1) matrix "is" a linear map $\Bbb{R} \to \Bbb{R}^3$, and so on...

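Multiplying matrices amounts to composing these functions. The rules of matrix multiplication you ask about are the classical rules of function composition: if $f:E \to F$ and $g:F\to G$, then $g\circ f : E \to G$. For instance, a (1x3) matrix is a map $\Bbb{R}^3 \to \Bbb{R}$, so (1x3) times (3x1) is the composite $\Bbb{R} \to \Bbb{R}^3 \to \Bbb{R}$, which is a (1x1) matrix, while (3x1) times (1x3) is the composite $\Bbb{R}^3 \to \Bbb{R} \to \Bbb{R}^3$, which is a (3x3) matrix.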

Long story short, you need to study the relationship between matrices and linear maps.
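As a rough illustration of the composition rule, here is a minimal pure-Python sketch. The helper names `shape` and `matmul` and the sample entries of `row` and `col` are made up for this example; they are not part of the answer above.

```python
def shape(m):
    """Return (rows, columns) of a matrix stored as a list of rows."""
    return (len(m), len(m[0]))


def matmul(a, b):
    """Product a*b, read as 'apply b first, then a'."""
    rows_a, cols_a = shape(a)
    rows_b, cols_b = shape(b)
    if cols_a != rows_b:
        # The output space of b must be the input space of a.
        raise ValueError("inner dimensions differ; the maps do not compose")
    return [
        [sum(a[i][k] * b[k][j] for k in range(cols_a)) for j in range(cols_b)]
        for i in range(rows_a)
    ]


row = [[1, 2, 3]]      # (1x3): a map R^3 -> R  (sample entries, assumed)
col = [[4], [5], [6]]  # (3x1): a map R -> R^3  (sample entries, assumed)

row_col = matmul(row, col)  # R -> R^3 -> R, so the result is (1x1)
col_row = matmul(col, row)  # R^3 -> R -> R^3, so the result is (3x3)

print(shape(row_col), row_col)
print(shape(col_row), col_row)
```

The shape of each product is forced by which spaces the composite map starts and ends in, which is why the two orders give different sizes.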

2
On

Suppose \begin{align} p & = 2x + 3y \\ q & = 3x - 7y \\ r & = -8x+9y \end{align} Represent this way of transforming $\begin{bmatrix} x \\ y \end{bmatrix}$ to $\begin{bmatrix} p \\ q \\ r \end{bmatrix}$ by the matrix $$ \left[\begin{array}{rr} 2 & 3 \\ 3 & -7 \\ -8 & 9 \end{array}\right]. $$ Now let's transform $\begin{bmatrix} p \\ q \\ r \end{bmatrix}$ to $\begin{bmatrix} a \\ b \end{bmatrix}$: \begin{align} a & = 22p-38q+17r \\ b & = 13p+10q+9r \end{align} Represent that by the matrix $$ \left[\begin{array}{rrr} 22 & -38 & 17 \\ 13 & 10 & 9 \end{array}\right]. $$ So how do we transform $\begin{bmatrix} x \\ y \end{bmatrix}$ directly to $\begin{bmatrix} a \\ b \end{bmatrix}$?

Do a bit of algebra and you get \begin{align} a & = \bullet x + \bullet y \\ b & = \bullet x + \bullet y \end{align} and you should be able to figure out what numbers the four $\bullet$s are. That matrix of four $\bullet$s is what you get when you multiply those earlier matrices. That's why matrix multiplication is defined the way it is.
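If you want to check your algebra, the following is a minimal pure-Python sketch. The names `A`, `B`, `apply`, and `matmul` are chosen for this illustration only. It compares the two-step substitution with the single matrix obtained from the row-times-column rule.

```python
# Coefficient matrices taken from the two systems of equations above.
A = [[2, 3], [3, -7], [-8, 9]]     # (x, y) -> (p, q, r)
B = [[22, -38, 17], [13, 10, 9]]   # (p, q, r) -> (a, b)


def apply(m, v):
    """Apply the linear map with matrix m to the vector v."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


def matmul(a, b):
    """Row-times-column product a*b (apply b first, then a)."""
    inner = len(b)
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


BA = matmul(B, A)  # the 2x2 matrix holding the four bullet coefficients

# Substituting step by step must agree with using BA directly.
for v in ([1, 0], [0, 1], [5, -2]):
    assert apply(B, apply(A, v)) == apply(BA, v)

print(BA)
```

The test vectors `[1, 0]` and `[0, 1]` pick out the columns of the composite matrix, so agreement on those two already determines all four entries; the third vector is just an extra sanity check.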