scalar expressions in matrix products

Question

173 Views Asked by Bumbble Comm At 25 Apr 2026 - 7:17

I have a fundamental (and therefore probably embarrassing) question on the treatment of scalars in matrix products. I'm currently writing a small program to support me in tedious manipulations of symbolic matrix expressions. For matrix products, there is a size condition: the number of columns in the first factor has to coincide with the number of rows in the second factor. However, this condition doesn't apply to scalar matrix expressions (size $1 \times 1$). A scalar expression can appear in any position in a product (a special case that has to be handled in the code when the size condition is checked). This also leads to strange additional constraints, for example when a product should be "flattened", i.e. the brackets removed.

Assume that ${\mathbf A}$ is an $n \times n$ matrix, and ${\mathbf x}$ is an $n \times 1$ vector. The second factor in

$$\mathbf{x} ({\mathbf x}^T {\mathbf A} {\mathbf x}) $$

is a scalar. I can flatten the expression to

$$\mathbf{x} {\mathbf x}^T {\mathbf A} {\mathbf x}$$

since the size condition is still fulfilled:

$$(n \times 1) \times (1 \times n) \times (n \times n) \times (n \times 1).$$

However, if I exchange the two factors

$$({\mathbf x}^T {\mathbf A} {\mathbf x}) \mathbf{x}$$

(which is a valid expression) and flatten the product to

$${\mathbf x}^T {\mathbf A} {\mathbf x} \mathbf{x},$$

I get a violation of the size condition:

$$(1 \times n) \times (n \times n) \times (n \times {\bf 1}) \times ({\bf n} \times 1).$$

Currently I exclude scalar expressions from flattening which makes the code unwieldy and makes me wonder whether I'm missing something fundamental.

Is there a better and consistent way to treat scalar expressions in matrix products? Probably one could describe scalars as scalar matrices (diagonal matrices with identical elements), but I'm not sure if that isn't a circular argument since shifting the position of the scalar matrix in the product may require factoring out of scalars. Also the shifting leads to changes in the size of the scalar matrix, so probably that is only interesting as an explanation but not practical in the code.

Thanks for your help!

Edit for clarification in response to user1551: If a factor is a product in brackets (let's call it the "inner" product), flattening always works if the inner product is not a scalar, but it may fail if it is a scalar. This is the special case which worries me.

Edit: In the meantime I have modified my flattening code. Non-scalar factors are always flattened; scalar factors are only flattened if the partial product on the left of the scalar factor has one column and the partial product on the right of the scalar factor has one row. Still, all that is cumbersome.
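For illustration, the rule in the edit above could be sketched as a pure shape check. This is only a sketch under assumptions: the name `can_flatten`, the representation of shapes as `(rows, columns)` tuples, and the use of `None` for "nothing on that side" are all hypothetical and not taken from the actual program.

```python
from typing import Optional, Tuple

Shape = Tuple[int, int]  # (rows, columns); hypothetical representation


def can_flatten(left: Optional[Shape], inner: Shape, right: Optional[Shape]) -> bool:
    """Decide whether the brackets around an inner product may be removed.

    left / right are the shapes of the partial products on either side of
    the bracketed factor, or None if there is nothing on that side
    (an assumption made for this sketch).
    """
    if inner != (1, 1):
        # Non-scalar inner products are always flattened.
        return True
    # A scalar inner product starts with one row and ends with one column,
    # so the left neighbour needs one column and the right neighbour one row.
    left_ok = left is None or left[1] == 1
    right_ok = right is None or right[0] == 1
    return left_ok and right_ok


n = 3
# x (x^T A x): the partial product on the left is n x 1, so flattening is fine.
print(can_flatten((n, 1), (1, 1), None))
# (x^T A x) x: the partial product on the right is n x 1, so flattening fails.
print(can_flatten(None, (1, 1), (n, 1)))
```

With $n = 3$ the first call accepts the flattening and the second rejects it, matching the two examples in the question.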

Original Q&A

There are 2 best solutions below

**Bumbble Comm** · Answer 1 · 2021-04-29 05:04:08

This is an attempt to answer my own question. I'd be grateful for comments.

To arrive at a unified mathematical treatment of matrix products which include scalars, one could introduce a scalar matrix for each scalar. A scalar matrix is defined as a (square) diagonal matrix with identical diagonal elements, in this case the value of the scalar factor. Here's my experimental notation:

$$\mathbf{S}_n\{a\}$$

is the scalar matrix of dimension $n$ for scalar $a$. Scalar matrices can change their position in a product, but have to adapt their dimension according to the surrounding factors.

Now my definition is that a scalar expression is only a true scalar if it is expressed as a scalar matrix. Any other expression is a matrix expression, even if its size is $1 \times 1$. I'm aware that this is vague; are there any better ideas?

The first example in my question could be written as

$$\mathbf{x} ({\mathbf x}^T {\mathbf A} {\mathbf x}) = \mathbf{x} \mathbf{S}_1\{{\mathbf x}^T {\mathbf A} {\mathbf x}\}$$

where the size $1$ is chosen since ${\mathbf x}$ is $n \times 1$. Removing the brackets around the scalar eliminates the scalar matrix and turns the scalar expression into a matrix. In this case there is no conflict since we had $\mathbf{S}_1$ and the matrix expression is also of size $1 \times 1$.

In my second example, we get

$$({\mathbf x}^T {\mathbf A} {\mathbf x}) \mathbf{x} = \mathbf{S}_n\{{\mathbf x}^T {\mathbf A} {\mathbf x}\} \mathbf{x}.$$

Here we need $\mathbf{S}_n$ to fit the dimension of the second factor ($n \times 1$). Leaving out the scalar matrix leads to a $1 \times 1$ matrix for the first factor which is in a size conflict with the second.
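As a size check on the two examples, with the abbreviation $a = {\mathbf x}^T {\mathbf A} {\mathbf x}$ (introduced here only for brevity), moving the scalar matrix across $\mathbf{x}$ changes its dimension from $n$ to $1$:

$$\mathbf{S}_n\{a\}\, \mathbf{x} = \mathbf{x}\, \mathbf{S}_1\{a\}, \qquad (n \times n) \times (n \times 1) \quad \text{versus} \quad (n \times 1) \times (1 \times 1).$$

Both sides are the $n \times 1$ vector with entries $a x_i$, and both satisfy the usual size condition.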

I'm not sure whether there is a circular argument involved in the definition of the scalar matrix. I avoided using a unit matrix with a scalar factor since then the scalar factor would appear explicitly.

Also I don't think this helps in the treatment of scalars in matrix products when writing code; here simply the size of the surrounding factors would be checked when flattening a scalar matrix expression (removing the brackets). But do you think the treatment of scalars as scalar matrices makes the problem more clear from the mathematical perspective?

**Bumbble Comm** · Answer 2 · 2021-04-29 07:41:10

In spite of appearances $$\mathbf{x} ({\mathbf x}^T {\mathbf A} {\mathbf x})$$ and $$({\mathbf x}^T {\mathbf A} {\mathbf x}) \mathbf{x}$$ are two very different algebraic expressions. Let's write $*$ for matrix multiplication and $\cdot$ for the scalar multiplication of, say, a real number with a matrix.

The first expression is a succession of matrix multiplications. $$\mathbf{x} ({\mathbf x}^T {\mathbf A} {\mathbf x}) = \mathbf{x} * ({\mathbf x}^T * {\mathbf A} * {\mathbf x})$$ Since $*$ is associative, one can discard parentheses with no ambiguity. This is what you call flattening.

The second expression is a scalar multiple of the column matrix $\mathbf x$, where the scalar is given by the implicit identification of a $1 \times 1$ matrix with its unique entry. $$({\mathbf x}^T {\mathbf A} {\mathbf x}) \mathbf{x} = ({\mathbf x}^T * {\mathbf A} * {\mathbf x}) \cdot \mathbf{x}$$

The impossibility of flattening that expression comes from the simple fact that you cannot move parentheses any way you like here. As you point out ${\mathbf x}^T {\mathbf A} {\mathbf x} \mathbf{x}$ is not a well-defined expression by consideration of the matrices' dimensions. To emphasize, in the strictest sense, the expression $({\mathbf x}^T * {\mathbf A} * {\mathbf x}) \cdot \mathbf{x}$ does not make sense either. It does once you make the identification $[r]_{1 \times 1} \leftrightarrow r$.

From a programming standpoint (and I'll admit this has me straying from my usual cup of tea), I'm afraid I fail to see how the problem can arise in the first place. If you have a function p(A,B) that computes matrix multiplication, its two arguments and its return value should be matrices of appropriate dimensions. If you feed the function an $m \times n$ matrix and a $p \times q$ matrix with $n \neq p$, the function should throw an error since this product is not defined. This way you can iterate matrix multiplication without any headache. For instance, calling p(A,p(B,C)) computes $\mathbf A * (\mathbf B * \mathbf C)$.
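A minimal Python sketch of such a function, assuming matrices are stored as non-empty lists of rows (that representation, and the choice of `ValueError` as the error, are assumptions made for the example):

```python
from typing import List

Matrix = List[List[float]]  # list of rows; assumed representation


def p(A: Matrix, B: Matrix) -> Matrix:
    """Strict matrix product: raises if the size condition is violated."""
    m, n = len(A), len(A[0])
    rows_b, q = len(B), len(B[0])
    if n != rows_b:
        raise ValueError(f"size mismatch: ({m} x {n}) * ({rows_b} x {q})")
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(q)]
            for i in range(m)]


x = [[1], [2]]            # 2 x 1 column
xT = [[1, 2]]             # 1 x 2 row
A = [[2, 0], [0, 1]]      # 2 x 2

inner = p(xT, p(A, x))    # 1 x 1 matrix [[6]]
print(p(x, inner))        # (2 x 1) * (1 x 1) is a valid matrix product: [[6], [12]]

try:
    p(inner, x)           # (1 x 1) * (2 x 1) violates the size condition
except ValueError as err:
    print(err)
```

With this strict `p`, the first expression from the question is an ordinary chain of matrix products, while the second is rejected until the $1 \times 1$ matrix is explicitly converted to a scalar.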

Addressing your remark that "[a] scalar expression can appear in each position in a product", this is correct if you are handling an actual scalar. As human mathematicians we may implicitly identify a $1 \times 1$ matrix with a scalar, but at the formal level a computer can only handle each type of object separately, and has to be told how to convert from one to the other when possible.

There are two aspects to handle. First, if you should need to treat a $1 \times 1$ matrix as a scalar, you need a function to make that conversion.
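Such a conversion might look as follows. This is a sketch with hypothetical names (`as_scalar`, `scale`), again assuming matrices are lists of rows:

```python
def as_scalar(M):
    """Explicit identification [r] (1 x 1 matrix) -> r; fails for other sizes."""
    if len(M) != 1 or len(M[0]) != 1:
        raise ValueError("only a 1 x 1 matrix can be identified with a scalar")
    return M[0][0]


def scale(r, M):
    """Scalar multiplication r . M, kept separate from the matrix product."""
    return [[r * entry for entry in row] for row in M]


# (x^T A x) . x with x^T A x already computed as the 1 x 1 matrix [[6]]
print(scale(as_scalar([[6]]), [[1], [2]]))  # [[6], [12]]
```

Keeping `scale` distinct from the matrix product makes the difference between $*$ and $\cdot$ visible in the code.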

Second, if you are actually given as input a string of matrices and bona fide scalars, you could "normalize" the expression by gathering all scalars to the left, leaving the matrices grouped afterwards, their order untouched. In other words, this is using the fact that $rA=Ar$ for any real or complex scalar $r$ and any rectangular matrix $A$. Then the product of all the scalars is a scalar, the computed product of the matrices is a matrix: those are the two arguments to feed to a function that computes scalar multiplication.
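The normalization could be sketched like this, assuming the input is a Python list in which bona fide scalars are plain numbers and matrices are lists of rows; the names `evaluate`, `matmul` and `scale` are made up for the example:

```python
from functools import reduce
from numbers import Number


def matmul(A, B):
    """Strict matrix product with the usual size condition."""
    if len(A[0]) != len(B):
        raise ValueError("size mismatch in matrix product")
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def scale(r, M):
    """Scalar multiplication r . M."""
    return [[r * entry for entry in row] for row in M]


def evaluate(factors):
    """Gather scalars to the left, multiply matrices in order, then scale."""
    scalars = [f for f in factors if isinstance(f, Number)]
    matrices = [f for f in factors if not isinstance(f, Number)]
    r = reduce(lambda a, b: a * b, scalars, 1)   # product of all scalars
    if not matrices:
        return r                                  # only scalars were given
    M = reduce(matmul, matrices)                  # matrix order is preserved
    return scale(r, M)


# 2 * [1 2] * 3 * [1 1]^T  ->  6 . ([1 2] * [1 1]^T) = 6 . [3]
print(evaluate([2, [[1, 2]], 3, [[1], [1]]]))     # [[18]]
```

Note that a $1 \times 1$ matrix in the input is still treated as a matrix here; it only becomes a scalar through an explicit conversion.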
