Is there a fast method to find the minimal number of scalar multiplications for a matrix chain product?


I want to find the optimal number of scalar multiplications for the following matrix chain:

The answer is $405$; I already know it, so this is not homework!

I saw a nice link Here which says: "For the example below, there are four sides: A, B, C and the final result ABC. A is a 10×30 matrix, B is a 30×5 matrix, C is a 5×60 matrix, and the final result is a 10×60 matrix. The regular polygon for this example is a 4-gon, i.e. a square..."

My matrices are: $A\ (5\times 10)$, $B\ (10\times 3)$, $C\ (3\times 12)$, $D\ (12\times 5)$.

Is there a way, without a lot of dynamic-programming calculation, to use the method above to show that $405$ scalar multiplications suffice for multiplying this chain efficiently?

I am asking about the polygon method here, but any other method that answers the question quickly is fine too: how many scalar multiplications are needed to multiply this chain efficiently?


When you multiply two matrices, $\mathbf{A}$ with $r$ rows and $n$ columns, and $\mathbf{B}$ with $n$ rows and $c$ columns, the result is matrix $\mathbf{C}$ with $r$ rows and $c$ columns, via $$C_{i j} = \sum_{k=1}^{n} A_{i k} B_{k j} \quad \text{ for } \quad i = 1 \dots r, \quad j = 1 \dots c$$ which involves $r n c$ scalar multiplications.
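
That $r n c$ count can be verified directly with a naive triple-loop product. A minimal sketch (the function name `matmul_count` is my own, for illustration only):

```python
# Sketch: naive matrix product that also counts scalar multiplications.
# Multiplying an r x n matrix by an n x c matrix uses exactly r*n*c of them.

def matmul_count(A, B):
    """Multiply A (r x n) by B (n x c); return (C, scalar_mult_count)."""
    r, n = len(A), len(A[0])
    assert n == len(B), "inner dimensions must match"
    c = len(B[0])
    count = 0
    C = [[0] * c for _ in range(r)]
    for i in range(r):
        for j in range(c):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
                count += 1
    return C, count

# A 5x10 matrix times a 10x3 matrix: expect 5*10*3 = 150 multiplications.
A = [[1] * 10 for _ in range(5)]
B = [[1] * 3 for _ in range(10)]
C, count = matmul_count(A, B)
print(count)              # 150
print(len(C), len(C[0]))  # 5 3
```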

When you multiply more than two matrices, you can choose the order freely, but it can affect how many scalar multiplications you need to do overall.

If you have three matrices, you can do either multiplication first, so there are $1 \cdot 2 = 2$ possible orders of performing the multiplications. If you have four matrices, you have $1 \cdot 2 \cdot 3 = 6$ possible orders: at each step you multiply one of the remaining adjacent pairs, and the number of pairs shrinks by one with each step. Simply put, if you have $N$ matrices to multiply, you have $(N-1)!$ possible orders of performing the $N-1$ multiplications. (Some of these orders yield the same parenthesization; the number of distinct parenthesizations is the Catalan number $C_{N-1}$, which is $5$ for four matrices.)

Since there are only four matrices here, there are only six possible orders of performing the multiplications, and we can explore them all.

We have: $$\begin{array}{c|c|c} \text{Matrix} & \text{Rows} & \text{Columns} \\ \hline \mathbf{A} & 5 & 10 \\ \mathbf{B} & 10 & 3 \\ \mathbf{C} & 3 & 12 \\ \mathbf{D} & 12 & 5 \\ \end{array}$$

  1. $((\mathbf{A}\mathbf{B})\mathbf{C})\mathbf{D}$:
    $\mathbf{A}\mathbf{B}$: $5 \cdot 10 \cdot 3 = 150$ multiplications, yields a $5$-row, $3$-column matrix
    $(\mathbf{A}\mathbf{B})\mathbf{C}$: $5 \cdot 3 \cdot 12 = 180$ multiplications, yields a $5$-row, $12$-column matrix
    $((\mathbf{A}\mathbf{B})\mathbf{C})\mathbf{D}$: $5 \cdot 12 \cdot 5 = 300$ multiplications, yields a $5$-row, $5$-column matrix
    Total $150+180+300 = 630$ scalar multiplications using this order

  2. $(\mathbf{A}\mathbf{B})(\mathbf{C}\mathbf{D})$:
    $\mathbf{A}\mathbf{B}$: $5 \cdot 10 \cdot 3 = 150$ multiplications, yields a $5$-row, $3$-column matrix
    $\mathbf{C}\mathbf{D}$: $3 \cdot 12 \cdot 5 = 180$ multiplications, yields a $3$-row, $5$-column matrix
    $(\mathbf{A}\mathbf{B})(\mathbf{C}\mathbf{D})$: $5 \cdot 3 \cdot 5 = 75$ multiplications, yields a $5$-row, $5$-column matrix
    Total $150 + 180 + 75 = 405$ scalar multiplications using this order

  3. $(\mathbf{A}(\mathbf{B}\mathbf{C}))\mathbf{D}$:
    $\mathbf{B}\mathbf{C}$: $10 \cdot 3 \cdot 12 = 360$ multiplications, yields a $10$-row, $12$-column matrix
    $\mathbf{A}(\mathbf{B}\mathbf{C})$: $5 \cdot 10 \cdot 12 = 600$ multiplications, yields a $5$-row, $12$-column matrix
    $(\mathbf{A}(\mathbf{B}\mathbf{C}))\mathbf{D}$: $5 \cdot 12 \cdot 5 = 300$ multiplications, yields a $5$-row, $5$-column matrix
    Total $360 + 600 + 300 = 1260$ scalar multiplications using this order

  4. $\mathbf{A}((\mathbf{B}\mathbf{C})\mathbf{D})$:
    $\mathbf{B}\mathbf{C}$: $10 \cdot 3 \cdot 12 = 360$ multiplications, yields a $10$-row, $12$-column matrix
    $(\mathbf{B}\mathbf{C})\mathbf{D}$: $10 \cdot 12 \cdot 5 = 600$ multiplications, yields a $10$-row, $5$-column matrix
    $\mathbf{A}((\mathbf{B}\mathbf{C})\mathbf{D})$: $5 \cdot 10 \cdot 5 = 250$ multiplications, yields a $5$-row, $5$-column matrix
    Total $360 + 600 + 250 = 1210$ scalar multiplications using this order

  5. Same as 2., but computing $\mathbf{C}\mathbf{D}$ first; that does not change the number of scalar multiplications needed

  6. $\mathbf{A}(\mathbf{B}(\mathbf{C}\mathbf{D}))$:
    $\mathbf{C}\mathbf{D}$: $3 \cdot 12 \cdot 5 = 180$ multiplications, yields a $3$-row, $5$-column matrix
    $\mathbf{B}(\mathbf{C}\mathbf{D})$: $10 \cdot 3 \cdot 5 = 150$ multiplications, yields a $10$-row, $5$-column matrix
    $\mathbf{A}(\mathbf{B}(\mathbf{C}\mathbf{D}))$: $5 \cdot 10 \cdot 5 = 250$ multiplications, yields a $5$-row, $5$-column matrix
    Total $180 + 150 + 250 = 580$ scalar multiplications using this order
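
The casework above can also be generated mechanically, by recursively enumerating every parenthesization and summing the per-product costs. A brute-force sketch (the generator `all_costs` is my own illustration, not code from any linked article):

```python
# Sketch: brute-force enumeration of every parenthesization of the chain
# A(5x10) B(10x3) C(3x12) D(12x5), reporting each total cost.

def all_costs(dims):
    """dims = [d0, d1, ..., dN]; matrix i is dims[i] x dims[i+1].
    Yield (expression, total_cost) for every parenthesization."""
    names = [chr(ord('A') + i) for i in range(len(dims) - 1)]

    def rec(i, j):  # product of matrices i..j: yields (expr, rows, cols, cost)
        if i == j:
            yield names[i], dims[i], dims[i + 1], 0
            return
        for k in range(i, j):  # split point: (i..k)(k+1..j)
            for le, lr, lc, lcost in rec(i, k):
                for re_, rr, rc, rcost in rec(k + 1, j):
                    # multiplying an lr x lc by an lc x rc matrix
                    yield f"({le}{re_})", lr, rc, lcost + rcost + lr * lc * rc

    for expr, _, _, cost in rec(0, len(dims) - 2):
        yield expr, cost

costs = dict(all_costs([5, 10, 3, 12, 5]))
for expr, cost in sorted(costs.items(), key=lambda kv: kv[1]):
    print(expr, cost)
print(min(costs.values()))  # 405
```

Running this prints all five distinct parenthesizations with the same totals as the casework above, with $405$ as the minimum.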

Therefore, we can calculate the product using $405$, $580$, $630$, $1210$, or $1260$ multiplications, without affecting the result (assuming no loss of precision in a multiplication operation – this is not exactly true when using e.g. floating-point numbers, as then some small variation due to limited precision may occur).

The simple approach to finding the minimum number of multiplications for a matrix chain product is to calculate the number of operations for every possible multiplication order, as above. For $N$ matrices there are $(N-1)!$ possible orders, and costing each order requires $2(N-1)$ multiplications (each of its $N-1$ matrix products costs a product of three dimensions, i.e. two multiplications), so this brute-force search itself requires a total of $2(N!)-2(N-1)!$ multiplications. As a sequence of increasing $N$ starting at $3$, this is $8$, $36$, $192$, $1200$, $8640$, $70560$, $645120$, $6531840$, $72576000$, $878169600$, $11496038400$, and so on; meaning that if you have, say, eleven or twelve or more matrices to multiply, even the brute-force search for the optimal order takes too long!
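
The growth of that brute-force work count, $2(N!)-2(N-1)!$, can be checked in a couple of lines:

```python
from math import factorial

# Brute-force work count 2*N! - 2*(N-1)!  (= 2*(N-1)*(N-1)!):
# (N-1)! orders, each costing 2*(N-1) scalar multiplications to evaluate.
for N in range(3, 10):
    print(N, 2 * factorial(N) - 2 * factorial(N - 1))
```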

As you can see from the list above, many of those multiplications are identical and unnecessarily repeated. The first dynamic-programming option is memoization: remembering the already calculated products. See e.g. the Wikipedia article on Matrix chain multiplication, which says this brings the time complexity down to $O(N^3)$, already a significant improvement.
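
That $O(N^3)$ dynamic program is short enough to sketch here. This is a bottom-up version of the textbook matrix-chain-order algorithm; the function name is my own:

```python
# Sketch of the standard O(N^3) dynamic-programming solution:
# m[i][j] holds the minimum number of scalar multiplications needed
# to compute the product of matrices i..j (inclusive).

def matrix_chain_min_cost(dims):
    """dims = [d0, d1, ..., dN]; matrix i is dims[i] x dims[i+1]."""
    n = len(dims) - 1  # number of matrices
    m = [[0] * n for _ in range(n)]
    for length in range(2, n + 1):          # chain length
        for i in range(n - length + 1):
            j = i + length - 1
            # try every split point k: (i..k)(k+1..j)
            m[i][j] = min(
                m[i][k] + m[k + 1][j] + dims[i] * dims[k + 1] * dims[j + 1]
                for k in range(i, j)
            )
    return m[0][n - 1]

print(matrix_chain_min_cost([5, 10, 3, 12, 5]))  # 405
```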

A much greater improvement, to $O(N \log N)$ time complexity, is the polygon-based approach published by Hu and Shing in 1982 (Part I, PDF) and 1984 (Part II, PDF). If you consider the full matrix chain product as a polygon, each matrix-matrix multiplication obeys rules similar to ear clipping in polygon triangulation.

It turns out that we can use a simple polygon, with the chain product as one side (called the base) and each matrix in the chain as one of the other sides, to represent the matrix chain product in terms of the scalar multiplications needed. Each triangulation of that polygon represents a different way of calculating the matrix chain product, and each triangle represents one matrix-matrix product, with each vertex denoting a row or column count. The product of the three vertices of a triangle then equals the number of scalar multiplications needed to calculate that matrix-matrix product.
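
The correspondence can be illustrated by minimising directly over polygon triangulations. Note this sketch uses a plain cubic-time recursion, not Hu and Shing's $O(N \log N)$ method, and the function name is my own:

```python
from functools import lru_cache

# Sketch of the polygon correspondence: a chain of N matrices maps to a
# convex (N+1)-gon whose vertices carry the dimensions d0..dN; each
# triangulation's cost (sum of triangle vertex products) equals the
# scalar-multiplication count of one parenthesization.

def min_triangulation_cost(w):
    """Minimum sum of w[i]*w[k]*w[j] over all triangulations of the
    convex polygon with vertex weights w (a tuple)."""
    @lru_cache(maxsize=None)
    def best(i, j):  # min cost to triangulate the sub-polygon i..j
        if j - i < 2:
            return 0  # fewer than three vertices: no triangle
        return min(best(i, k) + best(k, j) + w[i] * w[k] * w[j]
                   for k in range(i + 1, j))
    return best(0, len(w) - 1)

# Same minimum as the matrix chain A(5x10) B(10x3) C(3x12) D(12x5):
print(min_triangulation_cost((5, 10, 3, 12, 5)))  # 405
```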

Hu and Shing discovered a way of finding the triangulation (partitioning) of such a polygon that minimises the sum of those products – and therefore represents the order that calculates the matrix chain product with the fewest scalar multiplications – in $O(N \log N)$ time! This is, simply put, an amazing improvement; almost unbelievable, until you realize they found a more efficient way of describing the problem, and from there, a way to find the optimal solution very efficiently. Ingenious, really.

Therefore, when calculating the product of many matrices of different sizes – say a dozen or more – using the Hu & Shing algorithm to find the optimal order of performing the matrix-matrix multiplications is very important for efficiency.

I am not familiar enough with the two articles to turn them into example code right now. They seem to use geometric reasoning similar to what I have seen in other articles on triangulation, so perhaps other members here, more familiar with triangulation and the above-linked Hu & Shing articles, can help.