What is contracting a tensor actually doing?


I'm learning about tensors and have a vague idea of what contracting a tensor means, but I'm still not sure exactly what it's doing. Maybe someone here can put it in more intuitive terms.

Say we have a $(k,l)$ tensor $T$. This is some abstract mathematical structure which takes $k$ covectors (dual vectors) and $l$ vectors as "arguments" and spits out a number.

Contracting this tensor should give me a $(k-1,l-1)$ tensor, right? Where (using the summation convention) we have something like:

$$(CT)^{a_2 \dots a_k}{}_{b_2 \dots b_l} = T^{\mu\, a_2 \dots a_k}{}_{\mu\, b_2 \dots b_l} = T^{1\, a_2 \dots a_k}{}_{1\, b_2 \dots b_l} + T^{2\, a_2 \dots a_k}{}_{2\, b_2 \dots b_l} + \dots + T^{n\, a_2 \dots a_k}{}_{n\, b_2 \dots b_l}$$

This is the first covector slot being paired off against the first vector slot.
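For the simplest case, a $(1,1)$ tensor, I believe this contraction is just the trace. A quick numerical sketch (NumPy, with a made-up tensor):

```python
import numpy as np

# Hypothetical example: a (1,1) tensor T on R^3, stored as a 3x3 array
# with one upper index i and one lower index j.
T = np.arange(9.0).reshape(3, 3)

# Contracting the upper index against the lower index sums over the
# paired index mu: C = T^mu_mu, which is just the trace of the matrix.
C = np.einsum("ii->", T)
print(C)  # same as np.trace(T)
```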

I guess the more I think about it, the less I really know what contracting the indices of a tensor really means.

Could anyone give me a push in the right direction?


BEST ANSWER

My favorite way to interpret the trace is as the average value of an associated quadratic form. Here's how that works.

Let $V$ be an $n$-dimensional vector space, and let $T$ be a tensor on $V$. First let's consider the case in which $T$ is a tensor of type $(1,1)$, which we can also interpret as a linear map from $V$ to itself. Choose an inner product $\left< \cdot,\cdot\right>$ on $V$, and define the associated quadratic form $Q\colon V\to\mathbb R$ by $$Q(x) = \left< x, Tx \right>.$$ Then a computation shows that the trace of $T$ is $n$ times the average value of $Q$ over the unit sphere in $V$.
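The claim that $\operatorname{tr} T$ is $n$ times the average of $Q$ over the unit sphere is easy to check numerically. A Monte Carlo sketch (using the standard inner product on $\mathbb R^n$ and an arbitrary random matrix as the $(1,1)$ tensor):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
T = rng.standard_normal((n, n))   # an arbitrary (1,1) tensor / linear map

# Sample points uniformly on the unit sphere S^{n-1}: normalize Gaussians.
x = rng.standard_normal((200_000, n))
x /= np.linalg.norm(x, axis=1, keepdims=True)

# Q(x) = <x, Tx>; the claim is trace(T) ≈ n * (average of Q over the sphere).
Q = np.einsum("bi,ij,bj->b", x, T, x)
print(n * Q.mean(), np.trace(T))  # the two numbers should nearly agree
```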

(Here's a sketch of how this computation is done: Choose an orthonormal basis for $V$ and express $x$ in terms of that basis as an $n$-tuple $(x^1,\dots,x^n)$, with $(x^1)^2 + \dots + (x^n)^2 = 1$. Then $$\int_{\mathbb S^{n-1}} Q(x)\,dA = \sum_{i,j}T_i^j\int_{\mathbb S^{n-1}} x^ix^j\,dA. $$ The integrals on the right with $i\ne j$ are all zero, while the ones with $i=j$ are all the same, as can be seen by renaming the variables; adding them all up yields the volume of the sphere, so each integral is $1/n$th of the volume.)
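The key facts used in that sketch, that the mixed integrals vanish and the diagonal ones each give $1/n$ of the sphere's volume, can also be verified by sampling. In terms of averages rather than integrals: $\mathbb E[x^i x^j] = \delta^{ij}/n$ for a uniform point on $\mathbb S^{n-1}$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
x = rng.standard_normal((200_000, n))
x /= np.linalg.norm(x, axis=1, keepdims=True)

# Second moments of a uniform point on S^{n-1}: the average of x^i x^j
# is 0 for i != j and 1/n for i == j (so each diagonal integral is
# 1/n of the sphere's volume).
M = np.einsum("bi,bj->ij", x, x) / len(x)
print(np.round(M, 2))  # should be close to identity / n
```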

It's interesting to note that, because the trace is independent of basis, this result doesn't depend on the inner product chosen, even though the quadratic form will change depending on the inner product.

The quadratic form may seem to capture only part of the information encoded in $T$. But note that once an inner product is chosen, there's a one-to-one correspondence between linear maps $T\colon V\to V$ and bilinear forms $B_T\colon V\times V\to\mathbb R$, given by $B_T(x,y) = \left<x,Ty\right>$. Each such bilinear form decomposes into a symmetric part and a skew-symmetric part: $B_T = B_T^{\text{sym}}+B_T^{\text{skew}}$. The trace of the skew part is zero, so the trace only "sees" the symmetric part; and the symmetric part can be reconstructed from the quadratic form by using the polarization identity $B_T(x,y) = \tfrac14(Q(x+y)-Q(x-y))$.
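Both halves of that remark, that the trace is blind to the skew part, and that polarization recovers the symmetric part from $Q$, can be checked directly. A sketch with the standard inner product, so that $B_T(x,y) = x^\top T y$ and the symmetric/skew parts are just $(T \pm T^\top)/2$:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
T = rng.standard_normal((n, n))

sym = (T + T.T) / 2    # matrix of the symmetric part of B_T
skew = (T - T.T) / 2   # matrix of the skew part

# The trace only "sees" the symmetric part: the skew part traces to zero.
print(np.trace(skew))
# ...and polarization recovers B_T^sym from the quadratic form Q(x) = <x, Tx>:
Q = lambda v: v @ T @ v
x, y = rng.standard_normal(n), rng.standard_normal(n)
print(np.isclose((Q(x + y) - Q(x - y)) / 4, x @ sym @ y))
```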

Now if $T$ is a tensor of type $(k,l)$, the contraction on any pair of indices yields a tensor of type $(k-1,l-1)$: its value on any covectors $\omega_1,\dots,\omega_{k-1}$ and vectors $x_1,\dots,x_{l-1}$ is just $n$ times the average value over the unit sphere of the quadratic form determined by the $(1,1)$-tensor $T(\omega_1,\dots,\omega_{k-1},\ \cdot\ , x_1,\dots,x_{l-1},\ \cdot\ )$, where the two dots mark the slots being contracted.
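The mechanics of "fix all the other slots, then trace the remaining $(1,1)$ piece" can be made concrete. A sketch with a made-up $(1,2)$ tensor $T^a{}_{bc}$, contracting $a$ against $b$:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 3
# A (1,2) tensor T^a_{bc}: one upper index a, two lower indices b, c.
T = rng.standard_normal((n, n, n))

# Contracting a against b gives a (0,1) tensor (a covector) S_c = T^mu_{mu c}.
S = np.einsum("aac->c", T)

# Equivalently: fixing the last slot c leaves a (1,1) tensor T^a_{b c},
# and S_c is its trace.
S2 = np.array([np.trace(T[:, :, c]) for c in range(n)])
print(np.allclose(S, S2))
```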

ANSWER

This was also a confusion of mine when I first learned this, but I think I can add something worthwhile for future readers. Since contraction is an operation on tensors, it happens pointwise, so we can reduce the question to linear algebra. Start with the $(1,1)$ case, where we have one vector and one covector. From this data there is one natural way to produce a scalar: evaluation of the covector on the vector. (A general $(1,1)$ tensor is a sum of such simple products, and the scalar extends linearly.) Written out in coordinates, this is exactly the trace of a matrix. For a general tensor you may also come across the notion of a *metric* contraction, for example of a $(0,2)$ tensor. This is just an artifact of first using the metric to change type, raising an index to get a $(1,1)$ tensor, and then evaluating.
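Both steps above can be sketched numerically (with a made-up symmetric positive-definite matrix standing in for the metric):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 3

# Evaluation as trace: for a simple (1,1) tensor (v ⊗ w)^i_j = v^i w_j,
# the trace equals the evaluation w(v) = w · v.
v, w = rng.standard_normal(n), rng.standard_normal(n)
print(np.isclose(np.trace(np.outer(v, w)), w @ v))

# Metric contraction of a (0,2) tensor B: first raise an index with the
# inverse metric, B^i_j = (g^{-1})^{ik} B_{kj}, then take the trace.
g = np.eye(n) + 0.1 * np.ones((n, n))  # a made-up metric (SPD)
B = rng.standard_normal((n, n))
raised = np.linalg.inv(g) @ B          # the (1,1) tensor B^i_j
print(np.isclose(np.trace(raised), np.einsum("ij,ij->", np.linalg.inv(g), B)))
```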