I have a fundamental (and therefore probably embarrassing) question on the treatment of scalars in matrix products. I'm currently writing a small program to support me in tedious manipulations of symbolic matrix expressions. For matrix products, there is a size condition: the number of columns of each factor has to coincide with the number of rows of the next factor. However, this condition doesn't apply to scalar-valued matrix expressions (size $1 \times 1$). A scalar expression can appear in any position in a product, which has to be handled in the code when the size condition is checked. This also leads to strange additional constraints, for example when a product is "flattened", i.e. when its brackets are removed.
Assume that ${\mathbf A}$ is an $n \times n$ matrix, and ${\mathbf x}$ is an $n \times 1$ vector. The second factor in
$$\mathbf{x} ({\mathbf x}^T {\mathbf A} {\mathbf x}) $$
is a scalar. I can flatten the expression to
$$\mathbf{x} {\mathbf x}^T {\mathbf A} {\mathbf x}$$
since the size condition is still fulfilled:
$$(n \times 1) \times (1 \times n) \times (n \times n) \times (n \times 1).$$
However, if I exchange the two factors
$$({\mathbf x}^T {\mathbf A} {\mathbf x}) \mathbf{x}$$
(which is a valid expression) and flatten the product to
$${\mathbf x}^T {\mathbf A} {\mathbf x} \mathbf{x},$$
I get a violation of the size condition:
$$(1 \times n) \times (n \times n) \times (n \times {\bf 1}) \times ({\bf n} \times 1).$$
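The two size checks above can be reproduced with a minimal Python sketch. The name `chain_conforms` and the representation of a factor as a `(rows, columns)` tuple are assumptions made only for this illustration, and $n = 3$ is an arbitrary choice.

```python
from typing import List, Tuple

Shape = Tuple[int, int]  # (rows, columns); hypothetical representation


def chain_conforms(shapes: List[Shape]) -> bool:
    """True if every factor has as many columns as the next one has rows."""
    return all(a[1] == b[0] for a, b in zip(shapes, shapes[1:]))


n = 3  # illustrative value, any n > 1 behaves the same way
x, xT, A = (n, 1), (1, n), (n, n)

first = chain_conforms([x, xT, A, x])   # flattened x (x^T A x)
second = chain_conforms([xT, A, x, x])  # flattened (x^T A x) x
print(first, second)  # prints: True False
```

The strict check accepts the first flattened product and rejects the second, although both bracketed originals are valid.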
Currently I exclude scalar expressions from flattening, which makes the code unwieldy and makes me wonder whether I'm missing something fundamental.
Is there a better and consistent way to treat scalar expressions in matrix products? Perhaps one could describe scalars as scalar matrices (diagonal matrices with identical diagonal elements), but I'm not sure whether that is a circular argument, since shifting the position of the scalar matrix in the product may itself require factoring out scalars. Shifting also changes the size of the scalar matrix, so this is probably only interesting as an explanation and not practical in code.
Thanks for your help!
Edit for clarification in response to user1551: If a factor is a product in brackets (let's call it the "inner" product), flattening always works if the inner product is not a scalar, but it may fail if it is a scalar. This is the special case which worries me.
Edit: In the meantime I have modified my flattening code. Non-scalar factors are always flattened; scalar factors are only flattened if the partial product to the left of the scalar factor has one column and the partial product to the right has one row. Still, all of that is cumbersome.
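The flattening rule from this edit could be sketched as follows. This is not the actual program, only a minimal illustration under stated assumptions: a product is a Python list, a leaf factor is a `(rows, columns)` tuple, a bracketed inner product is a nested list, and an empty left or right side is treated as compatible (the rule as stated does not cover that case). The names `product_shape` and `flatten` are hypothetical.

```python
from typing import List, Optional, Tuple, Union

Shape = Tuple[int, int]      # (rows, columns)
Factor = Union[Shape, list]  # leaf shape, or nested list = bracketed product


def product_shape(factors: List[Factor]) -> Optional[Shape]:
    """Shape of a product; 1x1 factors count as scalars and are skipped.

    Returns None for an empty product, raises ValueError on a size conflict.
    """
    result: Optional[Shape] = None
    seen_factor = False
    for f in factors:
        s = product_shape(f) if isinstance(f, list) else f
        if s is None:
            continue
        seen_factor = True
        if s == (1, 1):          # scalar factor: fits in any position
            continue
        if result is None:
            result = s
        elif result[1] != s[0]:  # columns on the left must match rows on the right
            raise ValueError(f"size conflict: {result} times {s}")
        else:
            result = (result[0], s[1])
    if result is None and seen_factor:
        return (1, 1)            # product of scalars only
    return result


def flatten(factors: List[Factor]) -> List[Factor]:
    """Remove brackets where the rule allows it; keep them otherwise."""
    out: List[Factor] = []
    for i, f in enumerate(factors):
        if not isinstance(f, list):
            out.append(f)
            continue
        inner = flatten(f)
        if product_shape(inner) != (1, 1):
            out.extend(inner)    # non-scalar inner product: always flatten
            continue
        left = product_shape(factors[:i])
        right = product_shape(factors[i + 1:])
        # Assumption: an empty side (None) never blocks flattening.
        left_ok = left is None or left[1] == 1
        right_ok = right is None or right[0] == 1
        if left_ok and right_ok:
            out.extend(inner)
        else:
            out.append(inner)    # keep the brackets around the scalar
    return out


n = 3  # illustrative value
x, xT, A = (n, 1), (1, n), (n, n)
print(flatten([x, [xT, A, x]]))  # brackets removed
print(flatten([[xT, A, x], x]))  # brackets kept
```

With these assumptions the first example from the question is flattened, while the second keeps its brackets because the factor to the right of the scalar has $n$ rows rather than one.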
This is an attempt to answer my own question. I'd be grateful for comments.
To arrive at a unified mathematical treatment of matrix products which include scalars, one could introduce a scalar matrix for each scalar. A scalar matrix is defined as a (square) diagonal matrix with identical diagonal elements, in this case the value of the scalar factor. Here's my experimental notation:
$$\mathbf{S}_n\{a\}$$
is the scalar matrix of dimension $n$ for scalar $a$. Scalar matrices can change their position in a product, but have to adapt their dimension according to the surrounding factors.
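Written out, and assuming $\mathbf{B}$ denotes an arbitrary $m \times n$ matrix (a symbol introduced here only for illustration), the definition and the change of position would read
$$\mathbf{S}_n\{a\} = \operatorname{diag}(\underbrace{a, \dots, a}_{n}), \qquad \mathbf{S}_m\{a\}\, \mathbf{B} = \mathbf{B}\, \mathbf{S}_n\{a\}.$$
Both sides of the second identity multiply every entry of $\mathbf{B}$ by $a$; only the dimension of the scalar matrix changes, from $m$ on the left of $\mathbf{B}$ to $n$ on its right.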
Now my definition is that a scalar expression is only a true scalar if it is expressed as a scalar matrix. Any other expression is a matrix expression, even if its size is $1 \times 1$. I'm aware that this is vague; are there any better ideas?
The first example in my question could be written as
$$\mathbf{x} ({\mathbf x}^T {\mathbf A} {\mathbf x}) = \mathbf{x} \mathbf{S}_1\{{\mathbf x}^T {\mathbf A} {\mathbf x}\}$$
where the size $1$ is chosen since ${\mathbf x}$ is $n \times 1$. Removing the brackets around the scalar eliminates the scalar matrix and turns the scalar expression into a matrix. In this case there is no conflict since we had $\mathbf{S}_1$ and the matrix expression is also of size $1 \times 1$.
In my second example, we get
$$({\mathbf x}^T {\mathbf A} {\mathbf x}) \mathbf{x} = \mathbf{S}_n\{{\mathbf x}^T {\mathbf A} {\mathbf x}\} \mathbf{x}.$$
Here we need $\mathbf{S}_n$ to fit the dimension of the second factor ($n \times 1$). Leaving out the scalar matrix leads to a $1 \times 1$ matrix for the first factor which is in a size conflict with the second.
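How the dimension of the scalar matrix follows from its neighbours in both examples can be sketched in a few lines of Python. The function name `scalar_matrix_dim`, the `(rows, columns)` tuples, and the convention that `None` means "no factor on that side" are assumptions for this illustration only; returning 1 for an isolated scalar is likewise just a chosen default.

```python
from typing import Optional, Tuple

Shape = Tuple[int, int]  # (rows, columns); hypothetical representation


def scalar_matrix_dim(left: Optional[Shape], right: Optional[Shape]) -> int:
    """Dimension n of S_n{a} for a scalar placed between `left` and `right`."""
    if left is not None and right is not None:
        if left[1] != right[0]:
            raise ValueError("neighbouring factors do not conform")
        return left[1]
    if left is not None:
        return left[1]   # must match the columns of the left neighbour
    if right is not None:
        return right[0]  # must match the rows of the right neighbour
    return 1             # assumption: an isolated scalar is treated as 1x1


n = 3  # illustrative value
x = (n, 1)
print(scalar_matrix_dim(x, None))  # x S{...}: prints 1
print(scalar_matrix_dim(None, x))  # S{...} x: prints 3
```

A bracketed scalar can lose its brackets without conflict exactly when this dimension is 1, which is the case in the first example but not in the second.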
I'm not sure whether there is a circular argument involved in the definition of the scalar matrix. I avoided using a unit matrix with a scalar factor since then the scalar factor would appear explicitly.
Also, I don't think this helps with the treatment of scalars in matrix products when writing code; there, one would simply check the sizes of the surrounding factors when flattening a scalar matrix expression (removing the brackets). But do you think treating scalars as scalar matrices makes the problem clearer from the mathematical perspective?