Why is the 'change-of-basis matrix' called such?


"Let $P$ be the change-of-basis matrix from a basis $S$ to a basis $S'$ in a vector space $V$. Then, for any vector $v \in V$, we have $$P[v]_{S'}=[v]_{S} \text{ and hence, } P^{-1}[v]_{S} = [v]_{S'}$$

Namely, if we multiply the coordinates of $v$ in the original basis $S$ by $P^{-1}$, we get the coordinates of $v$ in the new basis $S'$." - Schaum's Outlines: Linear Algebra. 4th Ed.

I am having a lot of difficulty keeping these matrices straight. Could someone please help me understand the reasoning behind (what appears to me as) the counter-intuitive naming of $P$ as the change of basis matrix from $S$ to $S'$? It seems like $P^{-1}$ is the matrix which actually changes a coordinate vector in terms of the 'old' basis $S$ to a coordinate vector in terms of the 'new' basis $S'$...

Added:

"Consider a basis $S = \{u_1,u_2,...,u_n\}$ of a vector space $V$ over a field $K$. For any vector $v\in V$, suppose $v = a_1u_1 +a_2u_2+...+a_nu_n$

Then the coordinate vector of $v$ relative to the basis $S$, which we assume to be a column vector (unless otherwise stated or implied), is denoted and defined by $[v]_S = [a_1,a_2,...,a_n]^{T}$. "

"Let $S = \{ u_1,u_2,...,u_n\}$ be a basis of a vector space $V$, and let $S'=\{v_1,v_2,...,v_n\}$ be another basis. (For reference, we will call $S$ the 'old' basis and $S'$ the 'new' basis.) Because $S$ is a basis, each vector in the 'new' basis $S'$ can be written uniquely as a linear combination of the vectors in S; say,

$\begin{array}{c} v_1 = a_{11}u_1 + a_{12}u_2 + \cdots +a_{1n}u_n \\ v_2 = a_{21}u_1 + a_{22}u_2 + \cdots +a_{2n}u_n \\ \cdots \cdots \cdots \\ v_n = a_{n1}u_1 + a_{n2}u_2 + \cdots +a_{nn}u_n \end{array}$

Let $P$ be the transpose of the above matrix of coefficients; that is, let $P = [p_{ij}]$, where $p_{ij} = a_{ji}$. Then $P$ is called the \textit{change-of-basis matrix} from the 'old' basis $S$ to the 'new' basis $S'$." - Schaum's Outline: Linear Algebra, 4th Ed.

I am trying to understand the above definitions with this example:

Basis vectors of $\mathbb{R}^{2}: S= \{u_1,u_2\}=\{(1,-2),(3,-4)\}$ and $S' = \{v_1,v_2\}= \{(1,3), (3,8)\}$ the change of basis matrix from $S$ to $S'$ is $P = \left( \begin{array}{cc} -\frac{13}{2} & -18 \\ \frac{5}{2} & 7 \end{array} \right)$.
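As a sanity check on this example (a numpy sketch of my own; `U` and `V` are names I am introducing for the matrices whose columns are the old and new basis vectors in standard coordinates), $P$ can be computed by solving $UP = V$, since column $j$ of $P$ holds $[v_j]_S$:

```python
import numpy as np

# Columns are the basis vectors, written in standard coordinates.
U = np.array([[1.0,  3.0],
              [-2.0, -4.0]])   # old basis S: u1, u2
V = np.array([[1.0,  3.0],
              [3.0,  8.0]])    # new basis S': v1, v2

# Column j of P is [v_j]_S, so P satisfies U @ P = V.
P = np.linalg.solve(U, V)
print(P)   # first column (-6.5, 2.5), second column (-18, 7)
```

This agrees with the matrix above, since $-13/2 = -6.5$ and $5/2 = 2.5$.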

My current understanding is the following: normally, vectors such as $u_1, u_2$ are written with the standard basis understood; that is, $u_1 = (1,-2) = e_1 - 2e_2$, so $[u_1]_E = (1,-2)$. On the other hand, $[u_1]_S = (1,0)$, and I guess this would be true in general... But I am not really understanding what effect, if any, $P$ is supposed to have on the basis vectors themselves (I think I understand its effect on the coordinates relative to a basis). I suppose I could calculate a matrix $P'$ with the effect $P'u_i = v_i$ for $i = 1, 2, \dots, n$, but would this be anything?


BEST ANSWER

The situation here is closely related to the following situation: say you have some real function $f(x)$ and you want to shift its graph to the right by a positive constant $a$. Then the correct thing to do to the function is to shift $x$ over to the left; that is, the new function is $f(x - a)$. In essence you have shifted the graph to the right by shifting the coordinate axes to the left.

In this situation, if you have a vector $v$ expressed in some basis $e_1, \dots, e_n$, and you want to express it in a new basis $Pe_1, \dots, Pe_n$ (this is why $P$ is called the change-of-basis matrix), then you multiply the numerical vector $v$ by $P^{-1}$. You should carefully work through some numerical examples to convince yourself that this is correct. Consider, for example, the simple case where $P$ is multiplication by a scalar.
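Here is a minimal numerical sketch of that scalar case (numpy; my own illustration, not from the answer): with $P = 2I$ the new basis vectors are twice as long, so the components of a fixed vector are halved, i.e. multiplied by $P^{-1}$.

```python
import numpy as np

P = 2 * np.eye(2)              # new basis: 2*e1, 2*e2
v_old = np.array([4.0, 6.0])   # components of v in the basis e1, e2

# The same vector, expressed in the stretched basis: components shrink.
v_new = np.linalg.inv(P) @ v_old
print(v_new)   # [2. 3.], since 4*e1 + 6*e2 = 2*(2*e1) + 3*(2*e2)

# The vector itself is unchanged: reconstruct it from each basis.
E = np.eye(2)                  # old basis vectors as columns
assert np.allclose(E @ v_old, (P @ E) @ v_new)
```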

The lesson here is that one must carefully distinguish between vectors and the components used to express a vector in a particular basis. Basis vectors transform covariantly, but the components of a vector transform contravariantly.

ANSWER

Everybody studying change of basis should work out some simple examples like the following. Consider this basis of $\mathbb{R}^2$:

$$ v_1 = (1,1) \qquad \text{and} \qquad v_2 = (1,-1) \ . $$

Or, since we want to stress the distinction between bases and coordinates, we could write this as

$$ v_1 = (1,1)_e \qquad \text{and} \qquad v_2 = (1,-1)_e \ , $$

since these are coordinates in the standard basis

$$ e_1 = (1,0) \qquad \text{and} \qquad e_2 = (0,1) \ . $$

The change of basis matrix from $v$ to $e$ is

$$ P = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \ . $$

Now, take the vector

$$ u = 2v_1 - 3v_2 \ . $$

Its coordinates in the $v$ basis are:

$$ u = (2,-3)_v \ . $$

If you want to obtain its coordinates in the $e$ (standard) basis, you can do it by hand:

$$ u = 2v_1 - 3v_2 = 2(1,1)_e -3(1,-1)_e = (2-3, 2+3)_e = (-1, 5)_e \ . $$

Now, you realise that these are exactly the same operations that you do when performing this matrix multiplication:

$$ P \begin{pmatrix} 2 \\ -3 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 2 \\ -3 \end{pmatrix} = \begin{pmatrix} 2 - 3 \\ 2 + 3 \end{pmatrix} = \begin{pmatrix} -1 \\ 5 \end{pmatrix} \ . $$
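That agreement can be checked directly (a short numpy sketch, not part of the original answer):

```python
import numpy as np

P = np.array([[1.0,  1.0],
              [1.0, -1.0]])    # columns: v1, v2 in the standard basis
u_v = np.array([2.0, -3.0])    # coordinates of u in the v basis

u_e = P @ u_v                  # coordinates of u in the standard basis
print(u_e)                     # [-1.  5.]

# P^{-1} converts back from e-coordinates to v-coordinates.
assert np.allclose(np.linalg.solve(P, u_e), u_v)
```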

Exercise. Maybe now you can redo the proof of the change-of-basis theorem yourself: take two arbitrary bases $v$ and $e$ of any vector space, related by

$$ v_i = a^1_i e_1 + \cdots + a^n_i e_n \ , \qquad i = 1, \dots , n \ . $$

Write down the change of basis matrix from $v$ to $e$ (that is, put the coordinates of the $v$ vectors as columns, like in the previous example):

$$ P = \begin{pmatrix} a^1_1 & \dots & a^1_n \\ \vdots & \ddots & \vdots \\ a^n_1 & \dots & a^n_n \end{pmatrix} \ , $$

take any vector

$$ u = b^1v_1 + \cdots + b^nv_n \ , $$

and write down its coordinates in the $v$ basis. Finally, find out its coordinates in the $e$ basis (by hand and with the help of the matrix $P$).
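The exercise can also be carried out numerically, say in $\mathbb{R}^n$ with $e$ the standard basis (a numpy sketch under that assumption; the random basis is my own choice):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3

# A random invertible matrix whose columns are the basis vectors v_i
# written in the standard basis e -- exactly the matrix P above.
P = rng.standard_normal((n, n))
assert abs(np.linalg.det(P)) > 1e-8   # columns form a basis

b = rng.standard_normal(n)            # coordinates of u in the v basis

# "By hand": u = b^1 v_1 + ... + b^n v_n, summed column by column.
u_by_hand = sum(b[i] * P[:, i] for i in range(n))

# With the matrix: the e-coordinates of u are just P @ b.
u_by_matrix = P @ b
assert np.allclose(u_by_hand, u_by_matrix)
```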