Why is it standard convention to denote dual vectors as row vectors when using coordinates?


I found this claim in my lecture notes on real analysis in the chapter on the gradient.

When using coordinates, the standard convention is to denote vectors as columns, and covectors as rows.

The distinction is also essential in physics, as the two kinds of vectors transform differently under coordinate changes.

I think I understand what they mean by the second sentence. To see this, let $e_1,...,e_n$ and $e_1',...,e_n'$ be two bases for the vector space $V$. Each of these bases determines a basis for the dual vector space $V^*$. In particular,

$e^j(\alpha_1 e_1+...+\alpha_n e_n)=\alpha_j$, $j=1,...,n$, defines a basis $e^1,...,e^n$ for $V^*$ constructed using the basis $e_1,...,e_n$ for $V$.

Similarly,

$(e^i)'(\beta_1 e_1'+...+\beta_n e_n')=\beta_i$, $i=1,...,n$, defines a basis $(e^1)',...,(e^n)'$ for $V^*$ constructed using the basis $e_1',...,e_n'$ for $V$.

Now let

\begin{equation*} P = \begin{pmatrix} p_{1,1} & p_{1,2} & \cdots & p_{1,n} \\ p_{2,1} & p_{2,2} & \cdots & p_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ p_{n,1} & p_{n,2} & \cdots & p_{n,n} \end{pmatrix} \end{equation*}

be the $n \times n$ matrix for the change of basis from $e_1,...,e_n$ to $e_1',...,e_n'$, and let $\alpha$ and $\beta$ be the column vectors of coordinates of a vector $v \in V$ with respect to these two bases. Then

$P \alpha = \beta$ or $\beta_i=\sum \limits_{j=1}^{n}p_{ij} \cdot \alpha_j$.

Since $(e^i)'(v)=\beta_i$ and $e^j(v)=\alpha_j$, this means that

$(e^i)'(v)=\sum \limits_{j=1}^{n}p_{ij} e^j(v)$.

Now consider

$v^*=\delta_1 (e^1)'+...+\delta_n (e^n)'$.

Using the relation between the two bases of $V^*$, we find

$v^*=\delta_1 \sum \limits_{j=1}^{n}p_{1j} e^j+...+\delta_n \sum \limits_{j=1}^{n}p_{nj} e^j$, which can equivalently be written as

$v^*=(\sum \limits_{i=1}^{n} \delta_i p_{i1}) e^1+...+(\sum \limits_{i=1}^{n} \delta_i p_{in}) e^n=\gamma_1 e^1+...+\gamma_n e^n$.

If we write the coordinate vectors $\gamma$ and $\delta$ for $v^* \in V^*$ as row vectors, then

$\delta P=\gamma$ or $\delta=\gamma P^{-1}$, so the change of basis matrix for the dual space is the inverse of the change of basis matrix $P$ for the vector space itself.
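As a sanity check, the relations above can be tested numerically. The following is only a minimal sketch in plain Python: the $2 \times 2$ matrix `P` and the coordinate lists `alpha` and `gamma` are made-up values, and the helper names (`mat_vec`, `vec_mat`, `pair`, `inverse_2x2`) are hypothetical. It checks that $\delta P=\gamma$ and that the number $v^*(v)$ comes out the same in both coordinate systems, i.e. $\gamma\alpha=\delta\beta$.

```python
from fractions import Fraction

def mat_vec(M, x):
    """Matrix times column vector: (M x)_i = sum_j M[i][j] * x[j]."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def vec_mat(y, M):
    """Row vector times matrix: (y M)_j = sum_i y[i] * M[i][j]."""
    return [sum(y[i] * M[i][j] for i in range(len(y))) for j in range(len(M[0]))]

def pair(row, col):
    """Row times column: the scalar obtained by applying a covector to a vector."""
    return sum(r * c for r, c in zip(row, col))

def inverse_2x2(M):
    """Exact inverse of an invertible 2x2 matrix, using rational arithmetic."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

# Made-up example data (assumptions, not taken from the lecture notes).
P = [[2, 1], [1, 1]]   # change-of-basis matrix, invertible
alpha = [3, 5]         # coordinates of v in the old basis (a column)
gamma = [4, -1]        # coordinates of v* in the old dual basis (a row)

beta = mat_vec(P, alpha)                 # beta = P alpha
delta = vec_mat(gamma, inverse_2x2(P))   # delta = gamma P^{-1}

# delta P recovers gamma, and the pairing v*(v) does not depend on the basis.
assert vec_mat(delta, P) == gamma
assert pair(gamma, alpha) == pair(delta, beta)
```

With rows for covectors and columns for vectors, the invariance is just associativity: $\delta\beta=(\gamma P^{-1})(P\alpha)=\gamma\alpha$, with no transposes needed.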

However, I still cannot see why it is convenient to denote the coordinate vectors for elements of the dual space as row vectors. Can someone explain other reasons why this is useful? Of course, in this case we can avoid transposing the matrix $P$ before inverting (which we would have to do if we wanted to use a column vector of coordinates for the dual space as well), but that seems like a minor thing to me. I feel like it might have something to do with inner products.


BEST ANSWER

A real $m\times n$ matrix is naturally identified with a linear map from $\Bbb R^n \to \Bbb R^m$. We identify elements of $\Bbb R^n$ as column vectors ($n \times 1$ matrices) mostly because of the longstanding notational tradition of "operator on left, argument on right": $f(x)$ not $(x)f$. Because of how matrix multiplication is defined, matrices multiply column vectors on the left and row vectors on the right.

Now a dual vector is a linear functional on $\Bbb R^m$, which is by definition a linear map from $\Bbb R^m \to \Bbb R$, which means it is naturally associated with a $1 \times m$ matrix, i.e., a row vector.
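As a concrete illustration (the coefficients are made up for the example), take $m=3$ and the functional $f(x)=2x_1-x_2+3x_3$. Its matrix with respect to the standard bases is a $1 \times 3$ row, and evaluating $f$ is ordinary matrix multiplication:

\begin{equation*} \begin{pmatrix} 2 & -1 & 3 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = 2x_1 - x_2 + 3x_3 . \end{equation*}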

One could choose to represent $\Bbb R^n$ as row vectors instead (either redefining matrix multiplication, or just acknowledging that writing $vM$ really isn't that big of a deal). But if you do, you will find that dual vectors are naturally column vectors. It is the representation of linear maps as matrices that forces this.
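For comparison, here is a sketch of that opposite convention, with hypothetical coefficients $a_1,...,a_n$. If a vector $x \in \Bbb R^n$ is written as a $1 \times n$ row and maps act on the right, then a linear map $\Bbb R^n \to \Bbb R$ has to be an $n \times 1$ matrix, i.e., a column:

\begin{equation*} \begin{pmatrix} x_1 & \cdots & x_n \end{pmatrix} \begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix} = a_1 x_1 + \cdots + a_n x_n . \end{equation*}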

ANSWER
  1. It is consistent with the usual notation to represent dual vectors as rows: A dual vector acts linearly on a vector to give a scalar; it "maps vectors to scalars". In the usual matrix representation, vectors are represented by columns and the linear operator is represented by a matrix on the left.

Rows of the matrix act as dual vectors: each row acts linearly on a column (vector) to give a scalar.

  2. It is also convenient: it is much easier to type a row vector than a column vector. A row vector only takes up a line (if it's not too long).
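To illustrate the first point, here is a minimal sketch in plain Python (the matrix `M`, the vector `v`, and the helper `apply_row` are made up for the example). Each row of the matrix is applied to the column vector as a dual vector, and the resulting scalars are exactly the entries of the matrix-vector product.

```python
def apply_row(row, col):
    """Apply a row (dual vector) to a column (vector): sum of entrywise products."""
    return sum(r * c for r, c in zip(row, col))

# Made-up 2x3 matrix and a vector in R^3 (assumptions for illustration only).
M = [[1, 2, 0],
     [0, -1, 3]]
v = [4, 1, 2]

# Each row of M acts as a linear functional on v; stacking the results gives M v.
Mv = [apply_row(row, v) for row in M]
print(Mv)
```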