Can you multiply 3 matrices simultaneously?


I know that the algorithm for multiplying 2 matrices is defined as:

$$(AB)_{ij} = (\text{row }i\text{ of matrix }A) ⋅ (\text{column }j\text{ of matrix }B)$$

And I know that matrix multiplication is associative. So in the case of 3 matrices:

$$(A ⋅ B) ⋅ C = A ⋅ (B ⋅ C)$$

Is there an algorithm for matrix multiplication that can multiply 3 matrices simultaneously?

For example, could one evaluate the expression $ABC$ without first evaluating either of:

  • $(AB) ⋅ C$
  • $A ⋅ (BC)$

Additional questions

I am in high school, so I am not sure if this is the correct terminology, but I read somewhere about binary operations, where for instance, an operation such as normal multiplication takes 2 elements (such as 2 real numbers) and produces an output, and you can't multiply 3 numbers together simultaneously.

  • Is matrix multiplication a binary operation, where you can't multiply 3 matrices simultaneously?
  • Is $A ⋅ B ⋅ C$ defined as $(A⋅ B)⋅ C$ or $A⋅ (B⋅ C)$?
  • Is the omission of the parentheses in $A⋅ B⋅ C$ (matrices) just a form of notation?
  • What is the most precise way to interpret expressions of multiplication or addition (associative operations) with more than 3 variables - where there are no brackets, such as $A⋅B⋅C⋅D$ ?

There are 4 best solutions below

Answer (7 votes)

There is a way to compute the product directly, using summation notation. Let the product of the three matrices be $Y$; then: $$Y_{i,j}=\sum_{k_1=1}^n \sum_{k_2=1}^n A_{i,k_1}B_{k_1,k_2}C_{k_2,j}$$

This direct form is convenient as notation, but not as a computation. For $2×2$ matrices, the normal way (multiply two matrices, then multiply the result by the third) needs $4×2×2=16$ multiplications and $4×2×1=8$ additions, whereas, as @r.e.s. notes in the comments, the direct method involves more arithmetic operations.
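The double sum above can be sketched directly in Python (a minimal illustration, assuming square $n×n$ matrices given as nested lists; the function name is just for this example):

```python
# Hypothetical sketch: compute Y = ABC entrywise via the double sum,
# Y[i][j] = sum over k1, k2 of A[i][k1] * B[k1][k2] * C[k2][j].
def triple_product(A, B, C):
    n = len(A)
    return [[sum(A[i][k1] * B[k1][k2] * C[k2][j]
                 for k1 in range(n) for k2 in range(n))
             for j in range(n)]
            for i in range(n)]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = [[9, 10], [11, 12]]
print(triple_product(A, B, C))  # → [[413, 454], [937, 1030]]
```

Note that this computes every entry without ever forming $AB$ or $BC$, which is exactly why it repeats work that pairwise multiplication would share.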

Answer (4 votes)

There are already some very good answers. Just a quick note to mention that although matrix multiplication is associative (so $ABC=A(BC)=(AB)C$), both the finite-precision result and the computational cost may depend on the order of the operations. Regarding cost, the differences can be massive, so some software libraries define an N-ary operation in addition to the standard binary matrix multiplication. This N-ary operation performs the binary operations in the order that minimises the number of FLOPs. See for example numpy.linalg.multi_dot, which decides the order of multiplication using the algorithm in [1]. The corresponding optimisation problem is called the matrix chain ordering problem. The Wikipedia page on it has the following example:

If A is a 10 × 30 matrix, B is a 30 × 5 matrix, and C is a 5 × 60 matrix, then

  • computing (AB)C needs (10×30×5) + (10×5×60) = 1500 + 3000 = 4500 operations, while
  • computing A(BC) needs (30×5×60) + (10×30×60) = 9000 + 18000 = 27000 operations.

[1] Cormen, "Introduction to Algorithms", Chapter 15.2, p. 370-378.
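The Wikipedia example above can be reproduced with multi_dot (a sketch assuming NumPy is available; the matrix contents are arbitrary random data, only the shapes matter for the ordering):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((10, 30))
B = rng.standard_normal((30, 5))
C = rng.standard_normal((5, 60))

# multi_dot chooses the cheapest parenthesisation, here (A @ B) @ C:
# 10*30*5 + 10*5*60 = 4500 scalar multiplications, versus
# 30*5*60 + 10*30*60 = 27000 for A @ (B @ C).
result = np.linalg.multi_dot([A, B, C])
print(result.shape)  # → (10, 60)
```

Whichever order is chosen, the result agrees (up to floating-point rounding) with either explicit parenthesisation.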

Answer (15 votes)

Is matrix multiplication a binary operation, where you can't multiply 3 matrices simultaneously and A⋅ B⋅ C is defined as (A⋅ B)⋅ C or A⋅ (B⋅ C)?

It is a binary operation, and yes, in general that means "A·B·C" could be considered a syntactically invalid expression, just like "gxjb eijalfv" is an invalid English expression. However, almost all binary operations in mathematics can be used in a chain and interpreted as being performed one at a time, either left to right or right to left. That is to say, to be rigorous we define "A·B·C" to denote either "(A·B)·C" or "A·(B·C)" (and for an associative operation the two agree).
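This chain-of-binary-operations reading is exactly a fold over the binary operation. A minimal sketch (function names are just for illustration, matrices as nested lists):

```python
# Sketch: "A·B·C" read left to right as ((A·B)·C), expressed as a fold
# over the plain binary matrix multiplication.
from functools import reduce

def mat_mul(X, Y):
    # ordinary binary matrix multiplication on nested lists
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

A = [[1, 0], [0, 1]]   # identity
B = [[2, 0], [0, 2]]   # 2·identity
C = [[1, 2], [3, 4]]
chain = reduce(mat_mul, [A, B, C])  # evaluates as mat_mul(mat_mul(A, B), C)
print(chain)  # → [[2, 4], [6, 8]]
```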

I also wanted to clarify the following, is: A⋅ B⋅ C = (A⋅ B)⋅ C = A⋅ (B⋅ C)? Is the omission of the brackets in A⋅ B⋅ C, just a form of notation?

Yes to both. The second is explained above. As for the equality, the other answers have not mentioned that matrices can be understood as linear functions and matrix multiplication as function composition, so associativity is immediate:

  ((f∘g)∘h)(x) = (f∘g)(h(x)) = f(g(h(x))) = f((g∘h)(x)) = (f∘(g∘h))(x).

From the linked post you can see that the definition of matrix multiplication itself is forced by the desired meaning of a matrix as a linear function. Similarly, you can easily derive the summation in c00lSillyKid's answer from the repeated function composition, by tracing all the possible paths:

$\pmatrix{x_3\\y_3\\z_3} \xleftarrow{A} \pmatrix{x_2\\y_2\\z_2} \xleftarrow{B} \pmatrix{x_1\\y_1\\z_1} \xleftarrow{C} \pmatrix{x_0\\y_0\\z_0}$

Every term that contributes to $x_3$ comes from $x_2,y_2,z_2$ with weights given by the first row of $A$, and the column matches which of $x_2,y_2,z_2$ it came from. Repeating that reasoning all the way yields the same product of matrix entries $A_{i,k_1}·B_{k_1,k_2}·C_{k_2,j}$ for each possible path (positions $i,k_1,k_2,j$ in the vectors above).

Clearly, this generalizes to the product of any number of matrices.
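The path-tracing argument can be sketched for an arbitrary chain of square matrices (a minimal illustration, assuming nested lists; the function name is just for this example):

```python
# Sketch: each entry of a product M1 · M2 · ... · Mm is a sum over all
# "paths" of intermediate indices, one index per adjacent pair of matrices.
from itertools import product

def chain_entry(mats, i, j):
    n = len(mats[0])  # assume square n x n matrices
    total = 0
    for path in product(range(n), repeat=len(mats) - 1):
        idx = (i,) + path + (j,)
        term = 1
        # multiply one entry from each matrix along the path
        for M, (r, c) in zip(mats, zip(idx, idx[1:])):
            term *= M[r][c]
        total += term
    return total

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = [[9, 10], [11, 12]]
print(chain_entry([A, B, C], 0, 0))  # → 413, the (0, 0) entry of ABC
```

For three matrices this reduces exactly to the double sum $\sum_{k_1}\sum_{k_2} A_{i,k_1}B_{k_1,k_2}C_{k_2,j}$ quoted above, and it works unchanged for any number of factors.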

Answer (0 votes)

Whilst you could compute each individual element of the resulting matrix directly from the entries of the original matrices, it would be grossly inefficient to do so.

Fairly obviously, multiplying $n+1$ square matrices together in the traditional manner (as repeated pairwise matrix multiplications), simply requires $n$ times as many of each operation as a single pairwise multiplication.

However, multiplying the $n+1$ matrices together "directly" (computing each element as a single sum of products) gets much worse very quickly: the operation count grows roughly exponentially in $n$. If you then decide to factorise these sums, you eventually wind up back at something of comparable complexity to repeated pairwise matrix multiplication.

If you only have binary multiplication and binary addition as your basic operations, then the following table indicates how many operations are required. (If you have vector operations, they're both faster, but the comparison between them gets worse.)

| rank | # matrices | traditional (muls, adds) | direct sum-of-products (muls, adds) |
|------|------------|--------------------------|-------------------------------------|
| 2 | 2 | 8 muls, 4 adds | (same) |
| 2 | 3 | 16 muls, 8 adds | 32 muls, 12 adds |
| 2 | 4 | 24 muls, 12 adds | 96 muls, 28 adds |
| 2 | 5 | 32 muls, 16 adds | 256 muls, 60 adds |
| 2 | 6 | 40 muls, 20 adds | 640 muls, 124 adds |
| 2 | 7 | 48 muls, 24 adds | 1536 muls, 252 adds |
| 2 | 8 | 56 muls, 28 adds | 3584 muls, 508 adds |
| 2 | 9 | 64 muls, 32 adds | 8192 muls, 1020 adds |
| 2 | $n+1$ | $8n$ muls, $4n$ adds | $2^{n+2}n$ muls, $4(2^{n}-1)$ adds |
| 3 | 2 | 27 muls, 18 adds | (same) |
| 3 | 3 | 54 muls, 36 adds | 162 muls, 72 adds |
| 3 | 4 | 81 muls, 54 adds | 729 muls, 234 adds |
| 3 | 5 | 108 muls, 72 adds | 2916 muls, 720 adds |
| 3 | 6 | 135 muls, 90 adds | 10935 muls, 2178 adds |
| 3 | 7 | 162 muls, 108 adds | 39366 muls, 6552 adds |
| 3 | 9 | 216 muls, 144 adds | 472392 muls, 59040 adds |
| 3 | $n+1$ | $27n$ muls, $18n$ adds | $3^{n+2}n$ muls, $9(3^{n}-1)$ adds |
| 4 | 2 | 64 muls, 48 adds | (same) |
| 4 | 3 | 128 muls, 96 adds | 512 muls, 240 adds |
| 4 | 4 | 192 muls, 144 adds | 3072 muls, 1008 adds |
| 4 | 5 | 256 muls, 192 adds | 16384 muls, 4080 adds |
| 4 | 9 | 512 muls, 384 adds | 8388608 muls, 1048560 adds |
| 4 | $n+1$ | $64n$ muls, $48n$ adds | $4^{n+2}n$ muls, $16(4^{n}-1)$ adds |
| 5 | 2 | 125 muls, 100 adds | (same) |
| 5 | 3 | 250 muls, 200 adds | 1250 muls, 600 adds |
| 5 | 4 | 375 muls, 300 adds | 9375 muls, 3100 adds |
| 5 | 9 | 1000 muls, 800 adds | 78125000 muls, 9765600 adds |
| 5 | $n+1$ | $125n$ muls, $100n$ adds | $5^{n+2}n$ muls, $25(5^{n}-1)$ adds |
| 6 | 2 | 216 muls, 180 adds | (same) |
| 6 | 3 | 432 muls, 360 adds | 2592 muls, 1260 adds |
| 6 | 4 | 648 muls, 540 adds | 23328 muls, 7740 adds |
| 6 | 9 | 1728 muls, 1440 adds | 483729408 muls, 60466140 adds |
| 6 | $n+1$ | $216n$ muls, $180n$ adds | $6^{n+2}n$ muls, $36(6^{n}-1)$ adds |
| 7 | 2 | 343 muls, 294 adds | (same) |
| 7 | 3 | 686 muls, 588 adds | 4802 muls, 2352 adds |
| 7 | 9 | 2744 muls, 2352 adds | 2259801992 muls, 282475200 adds |
| 7 | $n+1$ | $343n$ muls, $294n$ adds | $7^{n+2}n$ muls, $49(7^{n}-1)$ adds |
| 8 | 2 | 512 muls, 448 adds | (same) |
| 8 | 3 | 1024 muls, 896 adds | 8192 muls, 4032 adds |
| 8 | 9 | 4096 muls, 3584 adds | 8589934592 muls, 1073741760 adds |
| 8 | $n+1$ | $512n$ muls, $448n$ adds | $8^{n+2}n$ muls, $64(8^{n}-1)$ adds |
| 9 | 2 | 729 muls, 648 adds | (same) |
| 9 | 3 | 1458 muls, 1296 adds | 13122 muls, 6480 adds |
| 9 | 9 | 5832 muls, 5184 adds | 27894275208 muls, 3486784320 adds |
| 9 | $n+1$ | $729n$ muls, $648n$ adds | $9^{n+2}n$ muls, $81(9^{n}-1)$ adds |
| $r$ | 2 | $r^3$ muls, $(r-1)r^2$ adds | $r^3$ muls, $r^2(r-1)$ adds |
| $r$ | 3 | $2r^3$ muls, $2(r-1)r^2$ adds | $2r^4$ muls, $r^2(r^2-1)$ adds |
| $r$ | 4 | $3r^3$ muls, $3(r-1)r^2$ adds | $3r^5$ muls, $r^2(r^3-1)$ adds |
| $r$ | 5 | $4r^3$ muls, $4(r-1)r^2$ adds | $4r^6$ muls, $r^2(r^4-1)$ adds |
| $r$ | 6 | $5r^3$ muls, $5(r-1)r^2$ adds | $5r^7$ muls, $r^2(r^5-1)$ adds |
| $r$ | 7 | $6r^3$ muls, $6(r-1)r^2$ adds | $6r^8$ muls, $r^2(r^6-1)$ adds |
| $r$ | 8 | $7r^3$ muls, $7(r-1)r^2$ adds | $7r^9$ muls, $r^2(r^7-1)$ adds |
| $r$ | 9 | $8r^3$ muls, $8(r-1)r^2$ adds | $8r^{10}$ muls, $r^2(r^8-1)$ adds |
| $r$ | 10 | $9r^3$ muls, $9(r-1)r^2$ adds | $9r^{11}$ muls, $r^2(r^9-1)$ adds |
| $r$ | $n+1$ | $nr^3$ muls, $n(r-1)r^2$ adds | $nr^{n+2}$ muls, $r^2(r^n-1)$ adds |

Non-square matrices scale in a similar manner, but there's no simple way to calculate how many operations are needed for $n+1$ matrices.
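The closed-form counts in the bottom row of the table can be written out as a short sketch (function names are just for illustration; rank-$r$ square matrices, chain of $n+1$ factors):

```python
# Sketch of the general operation counts from the table above.
def traditional_ops(r, n):
    # n pairwise multiplications, each costing r^3 muls and (r-1)r^2 adds
    return n * r**3, n * (r - 1) * r**2

def direct_ops(r, n):
    # each of the r^2 output entries is a sum of r^n products,
    # each product requiring n multiplications, hence n*r^(n+2) muls
    # and r^2 * (r^n - 1) adds in total
    return n * r**(n + 2), r**2 * (r**n - 1)

# reproduce the rank-3, 4-matrices row of the table (n = 3):
print(traditional_ops(3, 3))  # → (81, 54)
print(direct_ops(3, 3))       # → (729, 234)
```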