Intuitive transition from matrices to tensor-concept


I would like to know how to build intuition for the concept of a tensor using the following reasoning:

If I conceive of a vector as an extension of the scalar concept, i.e. an $N \times 1$ "array of scalars" and similarly a matrix is a "vector of vectors", how can I apply a similar intuition to the idea of a tensor?

Is a scalar a matrix of matrices? A vector of matrices? Or something altogether different?


There are 3 answers below.

---

Formally, a tensor is a multilinear, real-valued function of vectors; this, I find, is the best guide to intuition. For it means that once you can identify some collection of objects as having the properties of a vector space, then, to find a tensor, all you need to do is identify real-valued operations on these vectors that are linear in each argument.

For instance, if the set of real numbers is considered as a vector space (you can verify this by checking that the reals satisfy the vector-space axioms over themselves as the underlying field), so that each real number is treated as a vector, then the multiplication of real numbers is a bilinear, real-valued operation over the reals: a tensor!

And, if you consider the set of all $N\times1$ column vectors as a vector space over the field of real numbers, then the multiplication of the transpose of one column vector by another column vector, giving a scalar, is yet another bilinear tensor (this is precisely the familiar dot product of vectors).
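To make this concrete, here is a minimal sketch in Python (using plain lists as vectors): the dot product as a real-valued function of two vectors, together with a numerical check that it is linear in its first argument. The particular vectors and coefficients are arbitrary choices for illustration.

```python
# The dot product as a bilinear, real-valued function of two vectors.
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

u = [1.0, 2.0, 3.0]
v = [4.0, 5.0, 6.0]
w = [7.0, 8.0, 9.0]

# Linearity in the first argument: dot(a*u + b*w, v) == a*dot(u, v) + b*dot(w, v)
a, b = 2.0, -1.0
lhs = dot([a * ui + b * wi for ui, wi in zip(u, w)], v)
rhs = a * dot(u, v) + b * dot(w, v)
print(lhs == rhs)  # True
print(dot(u, v))   # 32.0
```

The same check on the second argument would confirm bilinearity.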

In fact, when you consider quite natural definitions of common operations, like calculating volumes or lengths, tensors tend to "pop up" all over the place.

Now tensors are interesting, but of little practical use if you cannot compute things with them. It turns out that any tensor you define over a finite-dimensional vector space has a unique representation as an appropriate "matrix". The way to obtain this representation is to let the tensor operate on basis vectors. This is why it is quite usual in applications to refer to matrices as tensors.
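A sketch of that recipe: take an arbitrary bilinear form $f$ on $\mathbb R^2$ (the particular $f$ below is an invented example), feed it pairs of standard basis vectors to obtain the matrix $M_{ij} = f(e_i, e_j)$, and verify that the matrix reproduces the tensor via $f(u,v) = u^\top M v$.

```python
# An arbitrary bilinear form on R^2, chosen for illustration:
# f(u, v) = u1*v1 + 2*u1*v2 + 3*u2*v1
def f(u, v):
    return u[0] * v[0] + 2 * u[0] * v[1] + 3 * u[1] * v[0]

n = 2
# Standard basis e_0, e_1 of R^2.
basis = [[1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]

# Matrix representation: M[i][j] = f(e_i, e_j).
M = [[f(basis[i], basis[j]) for j in range(n)] for i in range(n)]
print(M)  # [[1.0, 2.0], [3.0, 0.0]]

# The matrix now evaluates the tensor: f(u, v) = u^T M v.
def eval_with_matrix(u, v):
    return sum(u[i] * M[i][j] * v[j] for i in range(n) for j in range(n))

u, v = [1.0, 2.0], [3.0, 4.0]
print(f(u, v) == eval_with_matrix(u, v))  # True
```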

---

Let's make a more detailed analysis of what we mean by vector, matrix and tensor.

To keep things simple, let's say we are working with vector spaces over $\mathbb R$, and with finite-dimensional vector spaces.

A vector is an element of a vector space. Note that a vector space $V$ does not need to consist of tuples $(x_1,x_2,\dots,x_n)$ with $x_i \in \mathbb R$. To be more precise: given a basis $v_1, \dots, v_n$ for $V$, we can pair up each $v \in V$ uniquely with an $n$-tuple $(x_1, \dots, x_n) \in \mathbb R^n$ using the basis: $$v = x_1v_1 + \dots + x_nv_n = \sum_{1 \leq i \leq n} x_iv_i.$$
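A quick sketch of this pairing, with an arbitrarily chosen (non-standard) basis of $\mathbb R^2$: the coordinates $(2, 3)$ determine the vector $2v_1 + 3v_2$.

```python
# An arbitrary basis of R^2 (not the standard one), chosen for illustration.
v1, v2 = [1.0, 1.0], [0.0, 1.0]

# The coordinate tuple (x1, x2) = (2, 3) pairs with the vector x1*v1 + x2*v2.
x = (2.0, 3.0)
v = [x[0] * a + x[1] * b for a, b in zip(v1, v2)]
print(v)  # [2.0, 5.0]
```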

A matrix, in the way I think you use the word, is a 2-dimensional array of numbers. If we have two vector spaces $V$ and $W$, with dimensions $n$ and $m$, respectively, we can pair up an element of the tensor space $V \otimes W$ with an $n \times m$ matrix.

I will explain how this pairing is done, but first maybe I should explain what $V \otimes W$ means. $V \otimes W$ can be interpreted in many different ways, but I will here take what I consider to be a quite concrete approach. For vectors $v \in V, w \in W$, we can define a mapping: $$(v,w) \mapsto v \otimes w \in V \otimes W$$ and this is a mapping $V \times W \to V \otimes W$.

We now define how we add tensors and how we multiply scalars with a tensor. We say that these two operations must satisfy: $$(v_1 + v_2) \otimes w = v_1 \otimes w + v_2 \otimes w$$ $$v \otimes (w_1 + w_2) = v \otimes w_1 + v \otimes w_2$$ and, for $c \in \mathbb R$, we have: $$c(v \otimes w) = (cv) \otimes w = v \otimes (cw).$$
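One concrete model of these rules: represent $v \otimes w$ as the outer product of $v$ and $w$ (a nested list of all products $v_i w_j$). A short sketch checking that this representation satisfies the distributivity and scaling rules above, for arbitrarily chosen vectors:

```python
# v ⊗ w modeled as the outer product: a matrix with entries v_i * w_j.
def outer(v, w):
    return [[vi * wj for wj in w] for vi in v]

def add(s, t):
    return [[a + b for a, b in zip(rs, rt)] for rs, rt in zip(s, t)]

v1, v2, w = [1.0, 2.0], [3.0, 4.0], [5.0, 6.0, 7.0]

# (v1 + v2) ⊗ w == v1 ⊗ w + v2 ⊗ w
lhs = outer([a + b for a, b in zip(v1, v2)], w)
rhs = add(outer(v1, w), outer(v2, w))
print(lhs == rhs)  # True

# c(v ⊗ w) == (cv) ⊗ w
c = 3.0
print(outer([c * vi for vi in v1], w)
      == [[c * e for e in row] for row in outer(v1, w)])  # True
```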

With these special rules, we get a special structure for the space $V \otimes W$, namely that of a vector space (yes, you read that right, tensor spaces are vector spaces).

If you want to, you can read the construction above as forming a quotient space of the free vector space generated by $V \times W$, where the equivalence classes are given by the rules above.

Note that not all tensors can be written in the form $v \otimes w$. Those that can are called pure tensors, or rank-one tensors (the word "rank" is used to describe quite different things when it comes to tensors, so watch out).

Now, you might ask: what does a basis of a tensor space look like? Good question, because that is what I will explain next. If $v_1, \dots, v_n$ is a basis for $V$ and $w_1, \dots, w_m$ is a basis for $W$, then a basis for $V \otimes W$ is given by the pure tensors $$v_i \otimes w_j, \qquad 1\leq i \leq n,\ 1 \leq j \leq m.$$ Since there are $nm$ tensors of this form, the space $V \otimes W$ is an $nm$-dimensional vector space.

What this means is that any $t \in V \otimes W$ can be written uniquely in the form $$t = \sum_{1 \leq i \leq n} \sum_{1 \leq j \leq m} x_{ij} (v_i \otimes w_j),$$ and there we have our pairing with matrices: we can put the coefficients $x_{ij}$ in an $n \times m$ matrix.
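The pairing can be sketched numerically: with the standard bases of $\mathbb R^2$ and $\mathbb R^3$ and the outer-product model of $\otimes$, summing $x_{ij}\,(v_i \otimes w_j)$ over an arbitrary coefficient matrix $x$ reproduces exactly that matrix.

```python
# v ⊗ w modeled as the outer product.
def outer(v, w):
    return [[vi * wj for wj in w] for vi in v]

n, m = 2, 3
V_basis = [[1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]
W_basis = [[1.0 if j == i else 0.0 for j in range(m)] for i in range(m)]

x = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]  # an arbitrary coefficient matrix

# Build t = sum_ij x_ij (v_i ⊗ w_j).
t = [[0.0] * m for _ in range(n)]
for i in range(n):
    for j in range(m):
        pure = outer(V_basis[i], W_basis[j])  # v_i ⊗ w_j
        for p in range(n):
            for q in range(m):
                t[p][q] += x[i][j] * pure[p][q]

print(t == x)  # True: the tensor's coefficient array is exactly x
```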

Now, to the real intent of your question: what about tensors of higher order? Let's say we have three vector spaces, $V$, $W$ and $U$, with dimensions $n$, $m$ and $l$, respectively, and bases $v_i$, $w_j$ and $u_k$. Then it may come as no surprise that we can express any element $t \in V \otimes W \otimes U$ as $$t = \sum_{1 \leq i \leq n} \sum_{1 \leq j \leq m} \sum_{1 \leq k \leq l} x_{ijk} (v_i \otimes w_j \otimes u_k),$$ and so we can pair the tensor up with a 3-dimensional $n \times m \times l$ array, or, if you want, a "vector of matrices" or "matrix of vectors".
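As a sketch, such a coefficient array $x_{ijk}$ is simply a triply nested list, indexed by three subscripts (the particular entries below are an arbitrary fill pattern):

```python
# A third-order tensor in V ⊗ W ⊗ U (dims 2, 3, 4) as a nested list
# x[i][j][k] — a "vector of matrices".
n, m, l = 2, 3, 4
x = [[[float(i * 12 + j * 4 + k) for k in range(l)]
      for j in range(m)]
     for i in range(n)]

print(len(x), len(x[0]), len(x[0][0]))  # 2 3 4
print(x[1][2][3])  # 23.0, the coefficient of v_1 ⊗ w_2 ⊗ u_3 (zero-indexed)
```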

Of course, when dealing with matrices and vectors, matrix-vector multiplication gives you a new vector. There is no equally natural way to multiply a third-order tensor by something and get the same type of object out. That is, you can't multiply a third-order tensor by a vector and get a vector: contracting over one index leaves a matrix. But you can contract it with a matrix (summing over two indices) and get a vector out, or vice versa (though I have never seen this actually used for anything; if a reader has, please let me know!).
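These two contractions can be sketched directly with nested lists (the tensor, vector, and matrix below are arbitrary illustrative choices): contracting the last index against a vector leaves a matrix, while contracting the last two indices against a matrix leaves a vector.

```python
# An arbitrary third-order tensor x[i][j][k] with entries i + j + k.
n, m, l = 2, 3, 4
x = [[[float(i + j + k) for k in range(l)] for j in range(m)] for i in range(n)]

u = [1.0, 2.0, 3.0, 4.0]           # a vector in U
M = [[1.0] * l for _ in range(m)]  # an arbitrary m x l matrix

# Tensor-vector contraction: mat[i][j] = sum_k x[i][j][k] * u[k]  -> a matrix
mat = [[sum(x[i][j][k] * u[k] for k in range(l)) for j in range(m)]
       for i in range(n)]

# Tensor-matrix contraction: vec[i] = sum_{j,k} x[i][j][k] * M[j][k]  -> a vector
vec = [sum(x[i][j][k] * M[j][k] for j in range(m) for k in range(l))
       for i in range(n)]

print(len(mat), len(mat[0]))  # 2 3  (a matrix)
print(vec)                    # [30.0, 42.0]  (a vector)
```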

We could also build fourth-order tensors (e.g. elements of $V \otimes W \otimes U \otimes X$) and contract them with matrices to get other matrices out; that might be interesting.

---

Tensors have "ranks" (in this usage, also called orders). A rank-0 tensor is just a number, a rank-1 tensor is a vector, a rank-2 tensor is a matrix (a vector of vectors), a rank-3 tensor is a vector of vectors of vectors, and so on.
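The ladder described above can be sketched with nested Python lists, where the rank is just the nesting depth:

```python
# Rank 0 through rank 3, as levels of nesting.
rank0 = 3.14                      # a scalar
rank1 = [1.0, 2.0]                # a vector
rank2 = [[1.0, 2.0], [3.0, 4.0]]  # a matrix: vector of vectors
rank3 = [rank2, rank2]            # vector of matrices

def rank(t):
    # Count how many levels of list nesting the array has.
    r = 0
    while isinstance(t, list):
        r += 1
        t = t[0]
    return r

print([rank(t) for t in (rank0, rank1, rank2, rank3)])  # [0, 1, 2, 3]
```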