Equivalence between mathematical and physical definition of a tensor


I want to understand how the "physicists' definition" of a tensor and the "mathematicians' definition" are equivalent. I'm going to stick to the finite dimensional case, and note that I am not concerned with the difference between tensor fields and tensors.

Let $V$ be a finite dimensional vector space, $V^{*}$ its dual space, and $F$ the underlying field. A type $(n,m)$ tensor $T$ is a multilinear map $$T: V \times ... \times V \times V^{*} \times ... \times V^{*} \to F $$ where there are $n$ copies of $V$ and $m$ copies of $V^{*}$.

At this point, after pondering this definition for a while, you can see that we can represent any tensor as a linear function on a larger vector space whose basis vectors are all combinations of the basis vectors of the $n$ copies of $V$ and $m$ copies of $V^{*}$. In other words, we should be able to represent a tensor as a multidimensional array with $n+m$ axes, each of length $d$, where $d$ is the dimension of $V$ (so $d^{n+m}$ entries in total). However, the values of this array will depend on the basis chosen for $V$ (and for $V^{*}$, but let's assume for now that we will always choose the ``natural'' dual basis for $V^{*}$ induced by our choice of basis for $V$).
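To make this concrete, here is a small sketch (assuming NumPy, with a made-up bilinear map) of how such an array arises: evaluate the tensor on every combination of basis vectors.

```python
import numpy as np

d = 2

# A made-up bilinear map T: V x V -> R, i.e., a type (2,0) tensor in the
# convention above (two copies of V, zero copies of V*).
T = lambda v, w: v[0]*w[0] + 3*v[0]*w[1] - v[1]*w[1]

# Its array representation: evaluate T on every pair of standard basis vectors.
e = np.eye(d)
arr = np.array([[T(e[i], e[j]) for j in range(d)] for i in range(d)])

print(arr)  # [[ 1.  3.]
            #  [ 0. -1.]]  -- n + m = 2 axes, d^{n+m} = 4 entries
```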

The fact that tensors can be represented as multidimensional arrays, but the values of the array could depend on the basis might lead you to consider a ``physicists' definition of a type $(n, m)$ tensor'':

Let $V$ be a finite dimensional vector space with dimension $d$ and $F$ the underlying field. A type $(n,m)$ tensor $T$ associated to the vector space $V$ is a multidimensional array with $n+m$ indices, each ranging over $d$ values, which obeys a certain transformation law.

At this point, we don't know what the transformation law is. Instead, we want to pretend that we are discovering it, so we start by considering the simplest cases first and then work our way up in complexity. From here on we'll take $d=2, F=\mathbb{R}$ for simplicity, and we'll represent $v,w \in V$ as column vectors and $\phi, \psi \in V^{*}$ as row vectors.

Example 1: A type $(1,0)$ tensor $$T: V \to \mathbb{R}$$
We know that any linear map taking $v$ to a scalar can be represented as a row vector. So $$T(v) = \begin{bmatrix} L_1 & L_2 \end{bmatrix} \begin{bmatrix} v_1\\ v_2\\ \end{bmatrix}$$ for some $L_1, L_2 \in \mathbb{R}.$ Now what would the transformation law be? Well we can easily derive it: $$T(v) = Lv = L(R^{-1}R)v = (LR^{T})(Rv) = \hat{L} \hat{v}$$ where there was some change of basis: $$v \mapsto \hat{v}, \hat{v} = Rv, RR^{T} = I.$$

Conclusion: The transformation law is given by: $$v \mapsto Rv \implies L \mapsto LR^T.$$ (This case also tells us how the covectors transform in general.)
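As a quick numeric sanity check (a sketch assuming NumPy; the specific numbers are arbitrary), we can verify that the law $L \mapsto LR^T$ leaves $T(v)$ unchanged for an orthogonal $R$:

```python
import numpy as np

rng = np.random.default_rng(0)

L = rng.standard_normal((1, 2))                   # row vector representing T
v = rng.standard_normal((2, 1))                   # column vector in V
R, _ = np.linalg.qr(rng.standard_normal((2, 2)))  # random orthogonal R (R R^T = I)

v_hat = R @ v       # v -> R v
L_hat = L @ R.T     # L -> L R^T

# T(v) is basis-independent: L v equals L_hat v_hat
assert np.allclose(L @ v, L_hat @ v_hat)
```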

Example 2: A type $(0,1)$ tensor $$T: V^{*} \to \mathbb{R}$$
This case is similar. Any linear map taking $\phi$ to a scalar can be represented by a column vector. So $$T(\phi) = \begin{bmatrix} \phi_1 & \phi_2 \end{bmatrix} \begin{bmatrix} L_1\\ L_2\\ \end{bmatrix}$$ for some $L_1, L_2 \in \mathbb{R}.$ $$T(\phi) = \phi L = \phi R^T R L = (\phi R^{T})(RL) = \hat{\phi} \hat{L} $$ Transformation law: $$v \mapsto Rv, \phi \mapsto \phi R^{T} \implies L \mapsto RL.$$
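The same kind of numeric check (again a NumPy sketch with arbitrary numbers) confirms $L \mapsto RL$ for this case:

```python
import numpy as np

rng = np.random.default_rng(1)

L = rng.standard_normal((2, 1))                   # column vector representing T
phi = rng.standard_normal((1, 2))                 # row vector in V*
R, _ = np.linalg.qr(rng.standard_normal((2, 2)))  # random orthogonal R

phi_hat = phi @ R.T   # phi -> phi R^T
L_hat = R @ L         # L -> R L

# T(phi) is basis-independent: phi L equals phi_hat L_hat
assert np.allclose(phi @ L, phi_hat @ L_hat)
```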

Now this is where I start to get confused. I want to consider all tensors which are represented by two-dimensional arrays, i.e., $d \times d$ matrices. This includes the type $(2,0), (1,1),$ and $(0,2)$ tensors. I want to represent each of these using the normal rules for row/column vector and matrix multiplication.

Ostensibly it looks like we can only represent a type $(1,1)$ tensor: $$T(\phi, v) = \begin{bmatrix} \phi_1 & \phi_2 \end{bmatrix} \begin{bmatrix} L_{11} & L_{12} \\ L_{21} & L_{22} \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}$$ But we can actually easily represent a multilinear map from two copies of $V$ with the following: $$T(w, v) = \begin{bmatrix} w_1 \\ w_2 \end{bmatrix}^{T} \begin{bmatrix} L_{11} & L_{12} \\ L_{21} & L_{22} \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}.$$ This is a bilinear form (it becomes a ``quadratic form'' when $w = v$). And if we want to represent a multilinear map from two copies of $V^{*}$ then we can just write: $$T(\phi, \psi) = \begin{bmatrix} \phi_1 & \phi_2 \end{bmatrix} \begin{bmatrix} L_{11} & L_{12} \\ L_{21} & L_{22} \end{bmatrix} \begin{bmatrix} \psi_1 & \psi_2 \end{bmatrix}^{T} $$ We could also represent a type $(1,1)$ tensor like: $$T(w, \psi) =\begin{bmatrix} w_1 \\ w_2 \end{bmatrix}^{T} \begin{bmatrix} L_{11} & L_{12} \\ L_{21} & L_{22} \end{bmatrix} \begin{bmatrix} \psi_1 & \psi_2 \end{bmatrix}^{T}.$$

Now we derive our transformation laws. We insist that $v \in V$ transforms like $\hat{v} = Rv$ and $\phi \in V^{*}$ transforms like $\hat{\phi} = \phi R^T.$

$$\phi L v = \phi R^{T} R L R^{T} R v = (\phi R^{T}) (R L R^{T}) (R v) = \hat{\phi} \hat{L} \hat{v}$$ $$w^T L v = w^T R^{T} R L R^{T} R v = (Rw)^{T} (R L R^{T}) (R v) = \hat{w}^T \hat{L} \hat{v}$$ $$\phi L \psi^{T} = \phi R^{T} R L R^{T} R \psi^{T} = (\phi R^{T}) (R L R^{T}) (\psi R^{T})^{T} = \hat{\phi} \hat{L} \hat{\psi}^{T}$$ $$w^T L \psi^{T} = w^T R^{T} R L R^{T} R \psi^{T} = (Rw)^{T} (R L R^{T}) (\psi R^{T})^{T} = \hat{w}^T \hat{L} \hat{\psi}^{T}$$
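All four identities can be checked numerically; the following sketch (assuming NumPy, with arbitrary numbers and an orthogonal $R$ as in the derivation above) confirms each scalar is unchanged under $L \mapsto RLR^T$:

```python
import numpy as np

rng = np.random.default_rng(2)

d = 2
L = rng.standard_normal((d, d))
v, w = rng.standard_normal((d, 1)), rng.standard_normal((d, 1))
phi, psi = rng.standard_normal((1, d)), rng.standard_normal((1, d))
R, _ = np.linalg.qr(rng.standard_normal((d, d)))  # orthogonal: R R^T = I

L_hat = R @ L @ R.T
v_hat, w_hat = R @ v, R @ w
phi_hat, psi_hat = phi @ R.T, psi @ R.T

# Each scalar agrees before and after the change of basis.
assert np.allclose(phi @ L @ v, phi_hat @ L_hat @ v_hat)
assert np.allclose(w.T @ L @ v, w_hat.T @ L_hat @ v_hat)
assert np.allclose(phi @ L @ psi.T, phi_hat @ L_hat @ psi_hat.T)
assert np.allclose(w.T @ L @ psi.T, w_hat.T @ L_hat @ psi_hat.T)
```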

So I am getting the same transformation law for the type $(2,0), (1,1), $ and $(0,2)$ tensors! This is clearly not right, so what is going on here?

I will focus on the case of contravariant tensors, i.e., elements of $V^{\otimes n}$. Equivalently, these are multilinear maps $$T:(V^{*})^{\times n} \rightarrow \mathbb{F}.$$ In the physicist's notation, this would be a tensor represented with $n$ indices raised, i.e., $T^{i_1\cdots i_n}$. One important thing to note is that $T^{i_1\cdots i_n}$ represents the coefficients of the underlying mathematical tensor element, i.e., a mathematician would write, after fixing some basis $\{e_i\}$ for $V$, $$T = T^{i_1\cdots i_n}\,e_{i_1}\otimes \cdots \otimes e_{i_n},$$ where we use Einstein summation notation to implicitly sum over repeated raised and lowered indices. This is really all there is to it. A physicist's tensor is simply the collection of coefficients for a mathematician's tensor, usually with some implicit choice of basis (for example, in general relativity or differential geometry in general, the basis is fixed by the choice of the underlying chart). Covariant tensors, i.e., elements of $(V^*)^{\otimes m}$, would be correspondingly represented with the dual basis and lowered indices.

Now, you want to figure out the transformation rule for these tensors. All we have to do is compare the coefficients then. Suppose we have two choices of basis for $V$, say $\{e_i\}$ and $\{f_j\}$, with the change of basis given by $P$ such that $e_i = P_i^jf_j$. Then our tensor, with components $T^{ij}$ in the $\{e_i\}$ basis, can be equivalently written in the $\{f_j\}$ basis as $$T = T^{ij}e_i \otimes e_j = T^{ij}(P^a_if_a)\otimes (P^b_jf_b) = P_i^aP_j^bT^{ij}f_a \otimes f_b.$$ Thus if we denote the tensor components in the $\{f_j\}$ basis as $\tilde{T}^{ab}$, we have the transformation law $$\tilde{T}^{ab} = P^a_iP^b_jT^{ij},$$ which is the usual transformation law as introduced by physicists, where $P$ is more commonly denoted as some kind of Jacobian when considering tensor fields.
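In code, this component transformation is a one-liner with Einstein summation; the following NumPy sketch (arbitrary numbers) checks that $\tilde{T}^{ab} = P^a_i P^b_j T^{ij}$ agrees with the equivalent matrix expression $P T P^T$:

```python
import numpy as np

rng = np.random.default_rng(3)

d = 3
T = rng.standard_normal((d, d))   # components T^{ij} in the {e_i} basis
P = rng.standard_normal((d, d))   # stored so that P[a, i] = P^a_i

# tilde T^{ab} = P^a_i P^b_j T^{ij}  (sum over repeated indices)
T_tilde = np.einsum('ai,bj,ij->ab', P, P, T)

# For an order-2 tensor this is the same as the matrix product P T P^T.
assert np.allclose(T_tilde, P @ T @ P.T)
```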

Trying to use matrix notation for these calculations can sometimes be useful, but is more often confusing. As you've noticed, traditional matrix notation represents $(1,1)$ tensors, due to the natural isomorphism between the space of linear operators $L(V)$ and $V^*\otimes V$.

Now, you can always convert $(2,0)$ and $(0,2)$ tensors to $(1,1)$ tensors if you have some choice of isomorphism $V\cong V^*$, say an inner product, but this is dependent on the inner product you have and is often more work than it's worth (especially for non-Euclidean signatures). In your running example, when you write "But we can actually easily represent a multilinear map from two copies of $V$ with the following" you are implicitly using the standard Euclidean inner product to identify $V$ and $V^*$.
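As a sketch of that conversion (assuming NumPy; the bilinear form and inner product are made up), raising one index with $g^{-1}$ turns a bilinear form on $V$ into a linear operator, i.e., a $(1,1)$-type object, and with the Euclidean $g = I$ the two matrices coincide, which is exactly the hidden identification:

```python
import numpy as np

rng = np.random.default_rng(4)

d = 2
B = rng.standard_normal((d, d))   # bilinear form on V: B(w, v) = w^T B v
M = rng.standard_normal((d, d))
g = M @ M.T + d * np.eye(d)       # made-up symmetric positive-definite inner product

# "Raise an index": A = g^{-1} B is a linear operator V -> V satisfying
# B(w, v) = <w, A v>_g = w^T g A v for all w, v.
A = np.linalg.solve(g, B)

w, v = rng.standard_normal((d, 1)), rng.standard_normal((d, 1))
assert np.allclose(w.T @ B @ v, w.T @ g @ (A @ v))

# With the Euclidean inner product g = I, A and B are the same matrix,
# so the distinction becomes invisible.
assert np.allclose(np.linalg.solve(np.eye(d), B), B)
```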

I don't know what you are trying to do with the remaining examples, but they don't really make sense to me. They don't even map to the underlying field, so they cannot be correct. In short, matrix notation is a bit misleading because there is a hidden dual isomorphism that's unaccounted for. If someone gives you an order $2$ tensor in matrix notation, they are not supplying all the information necessary for you to know the transformation law, because you intrinsically cannot know whether it is $(1,1)$, $(0,2)$, or $(2,0)$. The reason this is often glossed over is because many texts focus on the case of $\mathbb{R}^n$ with the usual inner product, in which case all of these cases are identified.
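To see the ambiguity concretely, here is a hypothetical NumPy check using the question's convention $\hat{v} = Rv$, but with a general invertible $R$ (so $R^{-1}$ replaces $R^T$): the three candidate laws disagree for a generic $R$ and collapse to a single law exactly when $R$ is orthogonal.

```python
import numpy as np

rng = np.random.default_rng(5)

d = 2
L = rng.standard_normal((d, d))

def three_laws(R):
    Rinv = np.linalg.inv(R)
    return (
        R @ L @ Rinv,       # (1,1): keeps phi L v invariant
        Rinv.T @ L @ Rinv,  # (2,0)-style bilinear form: keeps w^T L v invariant
        R @ L @ R.T,        # (0,2)-style: keeps phi L psi^T invariant
    )

# Generic invertible R: the three transformed matrices differ.
a, b, c = three_laws(np.array([[2.0, 1.0], [0.0, 1.0]]))
assert not np.allclose(a, b) and not np.allclose(a, c)

# Orthogonal R: all three coincide, which is why the derivation in the
# question could not tell the tensor types apart.
R_orth, _ = np.linalg.qr(rng.standard_normal((d, d)))
a, b, c = three_laws(R_orth)
assert np.allclose(a, b) and np.allclose(a, c)
```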