Associativity and Distributive Property of Matrix Multiplication


Matrix multiplication is associative $(AB)C=A(BC)$

What about the case when $AB$ results in a scalar? Considering the case when $A$ is $1 \times n$ dimensional, $B$ and $C$ are both $n \times 1$ dimensional, the product $AB$ would result in a scalar which can be multiplied by $C$, i.e., $(AB)C=kC$.

However, $BC$ cannot be formed, since $B$ and $C$ are both $n \times 1$. Hence $A(BC)$ is not defined... I must be missing something here!

I came across this when checking the gradient of the least squares function, which resulted in $(y - W'\phi(x))\cdot\phi(x)$, where $y$ is a scalar and $\phi(x)$ and $W$ are both $n$-dimensional column vectors. By the distributive property, expanding this gives the product $W'\cdot\phi(x)\cdot\phi(x)$.

$W' = \operatorname{transpose}(W)$


There are 2 best solutions below

5
On BEST ANSWER

You are right that, formally speaking, we cannot multiply a $1\times1$ matrix by a matrix of arbitrary size, because a matrix product is defined only when the inner dimensions match. However, we have the following identifications, which respect the linear structure:

$$ \{ \text{Scalars} \} \leftrightarrow \{ 1 \times 1 \text{ matrices} \} \leftrightarrow \{ \text{Scalar matrices of any size} \}.$$

Thus, under this identification, it makes sense to multiply any matrix $C$ by a scalar $\lambda$: if $C$ has size $m \times n$, you can view $\lambda$ as the $m \times m$ scalar matrix $$\begin{pmatrix} \lambda & & &\\ & \lambda & &\\ & & \ddots & \\ & & & \lambda\end{pmatrix} = \lambda I_m$$ and the product $\lambda C$ corresponds to $(\lambda I_m) C$, which is a well-defined matrix product.
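As a quick numerical check of this identification, here is a minimal sketch in plain Python, with matrices stored as lists of rows. The helper names `matmul`, `scalar_matrix` and `scale`, and the sample values of `C` and `lam`, are illustrative assumptions and do not come from the question.

```python
def matmul(A, B):
    """Multiply matrices given as lists of rows; inner dimensions must match."""
    if len(A[0]) != len(B):
        raise ValueError("inner dimensions do not match")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def scalar_matrix(lam, m):
    """Return the m x m scalar matrix lam * I_m."""
    return [[lam if i == j else 0 for j in range(m)] for i in range(m)]


def scale(lam, C):
    """Scalar-matrix multiplication: multiply every entry of C by lam."""
    return [[lam * x for x in row] for row in C]


# Illustrative data: C is 2 x 3, so the matching scalar matrix is 2 x 2.
C = [[1, 2, 3],
     [4, 5, 6]]
lam = 7

via_matrix_product = matmul(scalar_matrix(lam, 2), C)  # (lam * I_2) C
via_scaling = scale(lam, C)                            # lam * C, entrywise

print(via_matrix_product == via_scaling)
```

Note that `matmul([[lam]], C)` raises `ValueError` here: a literal $1 \times 1$ matrix cannot be multiplied by a $2 \times 3$ matrix, which is exactly why the identification with $\lambda I_m$ is needed.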


In your particular problem, you are considering $(y-W'\phi)\cdot \phi$, where $\phi$ is $n \times 1$. According to what I said above, if you want to interpret the scalar multiplication "$\cdot$" as matrix multiplication, you need to view $y$ and $W'\phi$ as $n \times n$ scalar matrices. But really in either case all that you are doing is multiplying all the entries of $\phi$ by the scalar $(y - W'\phi)$. I think you are confused because you think the operation between $W'$ and $\phi$ is the same as the operation $\cdot$ between $W'\phi$ and $\phi$. But the first one is matrix multiplication, whereas the second one is scalar-matrix multiplication.
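For concreteness, here is a small worked example with made-up numbers (the values of $W$, $\phi$ and $y$ are chosen only for illustration, with $n = 2$):

$$ W = \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \quad \phi = \begin{pmatrix} 3 \\ 4 \end{pmatrix}, \quad y = 5, \qquad W'\phi = 1\cdot 3 + 2\cdot 4 = 11, \qquad (y - W'\phi)\cdot\phi = -6 \begin{pmatrix} 3 \\ 4 \end{pmatrix} = \begin{pmatrix} -18 \\ -24 \end{pmatrix}. $$

If you do want to write the term $(W'\phi)\cdot\phi$ purely with matrix products, one option is to use $W'\phi = \phi'W$ (both are the same scalar) and place the scalar on the right of $\phi$, where the dimensions are compatible:

$$ (W'\phi)\cdot\phi = \phi\,(\phi'W) = (\phi\phi')\,W = \begin{pmatrix} 9 & 12 \\ 12 & 16 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \end{pmatrix} = \begin{pmatrix} 33 \\ 44 \end{pmatrix} = 11 \begin{pmatrix} 3 \\ 4 \end{pmatrix}. $$

Here every product is a genuine matrix product ($n \times 1$ times $1 \times n$ times $n \times 1$), so associativity applies without any identification.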

0
On

This is more of a comment, not a direct answer to your question:

You should understand three key facts (which maybe you do already, I can't tell):

  1. Composing functions is associative.
  2. Matrices correspond to linear functions between vector spaces.
  3. Composing linear functions corresponds to multiplying matrices.

It will follow from these facts that matrix multiplication is "obviously" associative.

You may like to try to prove the above facts. For facts 2 and 3, the key concept is that a linear map is determined by its behavior on a basis, and this behavior is then encoded into a matrix (the columns of the matrix are the coordinates of the images of the basis vectors). You can work out the algebra describing this.

For fact 1, you have to think about what a function does to elements of a set.
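The three facts can also be checked numerically on a small case. The following is only a sketch in plain Python under illustrative assumptions: the helpers `matmul` and `apply` and the matrices `A`, `B`, `C` and vector `v` are made up for the example, and a check on one example is of course not a proof.

```python
def matmul(A, B):
    """Multiply matrices given as lists of rows; inner dimensions must match."""
    if len(A[0]) != len(B):
        raise ValueError("inner dimensions do not match")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def apply(A, v):
    """Apply the linear map with matrix A to the vector v (a flat list)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


# Illustrative 2 x 2 matrices and a test vector.
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 1]]
C = [[2, 0], [1, 3]]
v = [5, -1]

# Fact 3: applying B and then A agrees with applying the product AB.
composed = apply(A, apply(B, v))
product_applied = apply(matmul(A, B), v)

# Consequence: both ways of bracketing the triple product agree.
left = matmul(matmul(A, B), C)    # (AB)C
right = matmul(A, matmul(B, C))   # A(BC)

print(composed == product_applied, left == right)
```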