How is the matrix of a linear operator affected by a change of bases?


First, we define the matrix of a linear mapping.

Let $X$ and $Y$ be finite-dimensional vector spaces over the same field $F$; let $n = \dim X$ and $m = \dim Y$; let $E = \left( e_1, \ldots, e_n \right)$ be an ordered basis for $X$, and $B = \left( b_1, \ldots, b_m \right)$ be an ordered basis for $Y$; and let $T \colon X \to Y$ be a linear mapping.

For any element $x \in X$, there is a unique $n$-tuple $\left( \alpha_1, \ldots, \alpha_n \right)$ of scalars (i.e. elements of the field $F$) such that $$ x = \sum_{j=1}^n \alpha_j e_j. \tag{1}$$ The column vector $$ \left[x \right]_E \colon = \left[ \matrix{\alpha_1 \\ \vdots \\ \alpha_n } \right] \tag{1a} $$ is the coordinate vector of $x$ relative to the ordered basis $E$ of the vector space $X$.

Moreover, as $T(x) \in Y$, so there exists a unique ordered $m$-tuple $\left( \beta_1, \ldots, \beta_m \right)$ of scalars such that $$ T(x) = \sum_{i=1}^m \beta_i b_i. \tag{2}$$ Thus the coordinate vector of $T(x)$ relative to the ordered basis $B$ of the vector space $Y$ is given by $$ \left[ T(x) \right]_B = \left[ \matrix{\beta_1 \\ \vdots \\ \beta_m } \right]. \tag{2a} $$ As $T$ is linear, so by (1), we obtain $$ T(x) = T \left( \sum_{j=1}^n \alpha_j e_j \right) = \sum_{j=1}^n \alpha_j T \left( e_j \right). \tag{3}$$ But as $T \left( e_1 \right)$, $\ldots$, $T\left( e_n \right)$ are also elements of $Y$, so there exist unique ordered $m$-tuples $$\left( \gamma_{11}, \ldots, \gamma_{m1} \right), \ldots, \left( \gamma_{1n}, \ldots, \gamma_{mn} \right)$$ of scalars such that $$ \begin{align} T\left( e_1 \right) &= \sum_{i=1}^m \gamma_{i1} b_i, \\ \cdots &= \cdots, \\ T\left( e_n \right) &= \sum_{i=1}^m \gamma_{in} b_i. \end{align} \tag{4} $$ Thus the coordinate vectors of $T \left( e_1 \right)$, $\ldots$, $T\left( e_n \right)$ relative to the ordered basis $B$ are given by $$ \begin{align} \left[ T \left( e_1 \right) \right]_B &= \left[ \matrix{ \gamma_{11} \\ \vdots \\ \gamma_{m1} } \right], \\ \cdots &= \cdots, \\ \left[ T \left( e_n \right) \right]_B &= \left[ \matrix{ \gamma_{1n} \\ \vdots \\ \gamma_{mn} } \right]. 
\end{align} $$ Then the matrix of $T$ relative to the ordered basis $E$ of $X$ and the ordered basis $B$ of $Y$ is given by $$ \left[ T \right]_{EB} = \left[ \matrix{ \gamma_{11} & \ldots & \gamma_{1n} \\ \vdots & \ddots & \vdots \\ \gamma_{m1} & \ldots & \gamma_{mn} } \right].$$ Now from (3) and (4), we find that $$ \begin{align} T(x) &= \sum_{j=1}^n \alpha_j T \left( e_j \right) \\ &= \sum_{j=1}^n \left( \alpha_j \sum_{i=1}^m \gamma_{ij} b_i \right) \\ &= \sum_{j=1}^n \left( \sum_{i=1}^m \alpha_j \gamma_{ij} b_i \right) \\ &= \sum_{j=1}^n \left( \sum_{i=1}^m \gamma_{ij} \alpha_j b_i \right) \\ &= \sum_{j=1}^n \sum_{i=1}^m \left( \gamma_{ij} \alpha_j b_i \right) \\ &= \sum_{i=1}^m \sum_{j=1}^n \left( \gamma_{ij} \alpha_j b_i \right) \\ &= \sum_{i=1}^m \left( \sum_{j=1}^n \gamma_{ij} \alpha_j b_i \right) \\ &= \sum_{i=1}^m \left( \sum_{j=1}^n \gamma_{ij} \alpha_j \right) b_i. \end{align} $$ That is, $$ T(x) = \sum_{i=1}^m \left( \sum_{j=1}^n \gamma_{ij} \alpha_j \right) b_i. \tag{5} $$ So the coordinate vector of $T(x)$ relative to the basis $B$ of $Y$ is given by $$ \left[ T(x) \right]_B = \left[ \matrix{ \sum_{j=1}^n \gamma_{1j} \alpha_j \\ \vdots \\ \sum_{j=1}^n \gamma_{mj} \alpha_j } \right]. \tag{5a} $$ Since the coordinate vector of $T(x)$ relative to the basis $B$ of $Y$ is unique, therefore from (2a) and (5a) we obtain $$\left[ \matrix{\beta_1 \\ \vdots \\ \beta_m } \right] = \left[ T(x) \right]_B = \left[ \matrix{ \sum_{j=1}^n \gamma_{1j} \alpha_j \\ \vdots \\ \sum_{j=1}^n \gamma_{mj} \alpha_j } \right] = \left[ \matrix{ \gamma_{11} & \ldots & \gamma_{1n} \\ \vdots & \ddots & \vdots \\ \gamma_{m1} & \ldots & \gamma_{mn} } \right] \left[ \matrix{\alpha_1 \\ \vdots \\ \alpha_n } \right]. $$ That is, [Refer to (1a), (2a), and (5a) above.] 
$$ \left[ T(x) \right]_B = \left[ T \right]_{EB} \left[ x \right]_E, \tag{A} $$ where $$\left[ T \right]_{EB} \colon= \left[ \matrix{ \gamma_{11} & \ldots & \gamma_{1n} \\ \vdots & \ddots & \vdots \\ \gamma_{m1} & \ldots & \gamma_{mn} } \right] \tag{*} $$ is said to be the matrix of the linear mapping $T \colon X \to Y$ relative to the ordered basis $E$ of $X$ and the ordered basis $B$ of $Y$.
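As a quick numerical sanity check of equation (A), here is a small Python sketch on a made-up example (not part of the question): $X = Y = \mathbb{R}^2$, $T(x_1, x_2) = (x_1 + x_2, x_1 - x_2)$, $E = ((1,0),(1,1))$ a non-standard basis for $X$, and $B$ the standard basis for $Y$. The columns of $[T]_{EB}$ are the $B$-coordinates of $T(e_1), T(e_2)$, as in (4) and (*).

```python
# Sanity check of (A): [T(x)]_B = [T]_EB [x]_E, on a made-up 2-D example.

def coords_2d(v, basis):
    """Solve v = a1*basis[0] + a2*basis[1] by Cramer's rule."""
    (p, r), (q, s) = basis          # basis vectors as columns of (p q; r s)
    det = p * s - q * r
    return ((v[0] * s - q * v[1]) / det,
            (p * v[1] - v[0] * r) / det)

def T(v):
    return (v[0] + v[1], v[0] - v[1])

E = ((1, 0), (1, 1))                # non-standard basis for X
B = ((1, 0), (0, 1))                # standard basis for Y

# Columns of [T]_EB are the B-coordinates of T(e1), T(e2) -- cf. (4), (*).
col1 = coords_2d(T(E[0]), B)        # (1, 1)
col2 = coords_2d(T(E[1]), B)        # (2, 0)
T_EB = [[col1[0], col2[0]],
        [col1[1], col2[1]]]

x = (3, 5)
x_E = coords_2d(x, E)               # (-2, 5)

# The matrix-vector product [T]_EB [x]_E ...
lhs = (T_EB[0][0] * x_E[0] + T_EB[0][1] * x_E[1],
       T_EB[1][0] * x_E[0] + T_EB[1][1] * x_E[1])
# ... should equal the B-coordinates of T(x).
rhs = coords_2d(T(x), B)

print(lhs, rhs)                     # both (8.0, -2.0)
```

Here $x = -2\,e_1 + 5\,e_2 = (3,5)$ and $T(x) = (8,-2)$, so both sides come out to $(8, -2)$.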

Is my analysis correct so far? If so, then is my presentation in conformity with the usual convention in the literature?

I also have the following question.

Now let $E^\prime = \left( e_1^\prime, \ldots, e_n^\prime \right)$ be another ordered basis for the vector space $X$, and let $B^\prime = \left( b_1^\prime, \ldots, b_m^\prime \right)$ be another ordered basis for $Y$.

Let $\left[ T \right]_{E^\prime B^\prime}$ denote the matrix of $T$ relative to the ordered basis $E^\prime$ of $X$ and the ordered basis $B^\prime$ of $Y$.

Then what is the relation, if any, between $\left[ T \right]_{E^\prime B^\prime}$ and $\left[ T \right]_{EB}$?

My effort:

As $e_1^\prime, \ldots, e_n^\prime \in X$ and as $E = \left( e_1, \ldots, e_n \right)$ is an ordered basis for $X$, so we can find unique $n$-tuples $$ \left( \lambda_{11}, \ldots, \lambda_{n1} \right), \ldots, \left( \lambda_{1n}, \ldots, \lambda_{nn} \right)$$ of scalars such that $$ \begin{align} e_1^\prime &= \sum_{k=1}^n \lambda_{k1} e_k, \\ \cdots &= \cdots, \\ e_n^\prime &= \sum_{k=1}^n \lambda_{kn} e_k. \end{align} \tag{6} $$ Thus the coordinate vectors of $e_1^\prime, \ldots, e_n^\prime$ relative to the basis $E$ of $X$ are given by $$ \begin{align} \left[ e_1^\prime \right]_{E} &= \left[ \matrix{ \lambda_{11} \\ \vdots \\ \lambda_{n1} } \right], \\ \cdots &= \cdots, \\ \left[ e_n^\prime \right]_{E} &= \left[ \matrix{ \lambda_{1n} \\ \vdots \\ \lambda_{nn} } \right]. \end{align} \tag{6a} $$ Now as $x \in X$ and as $\left( e_1^\prime, \ldots, e_n^\prime \right)$ is an ordered basis for $X$, so there exists a unique ordered $n$-tuple $\left( \alpha_1^\prime, \ldots, \alpha_n^\prime \right)$ of scalars such that $$ x = \sum_{j=1}^n \alpha_j^\prime e_j^\prime. \tag{7}$$ so that the coordinate vector of $x$ relative to the ordered basis $E^\prime$ of $X$ is given by $$ \left[ x \right]_{E^\prime} = \left[ \matrix{ \alpha_1^\prime \\ \vdots \\ \alpha_n^\prime } \right]. \tag{7a} $$ Now, using (6) in (7), we obtain $$ \begin{align} x &= \sum_{j=1}^n \alpha_j^\prime e_j^\prime \\ &= \sum_{j=1}^n \left( \alpha_j^\prime \sum_{k=1}^n \lambda_{kj} e_k \right) \\ &= \sum_{j=1}^n \left( \sum_{k=1}^n \alpha_j^\prime \lambda_{kj} e_k \right) \\ &= \sum_{j=1}^n \left( \sum_{k=1}^n \lambda_{kj} \alpha_j^\prime e_k \right) \\ &= \sum_{k=1}^n \left( \sum_{j=1}^n \lambda_{kj} \alpha_j^\prime e_k \right) \\ &= \sum_{k=1}^n \left( \sum_{j=1}^n \lambda_{kj} \alpha_j^\prime \right) e_k. 
\end{align} $$ That is, $$ x = \sum_{k=1}^n \left( \sum_{j=1}^n \lambda_{kj} \alpha_j^\prime \right) e_k, \tag{8} $$ so that the coordinate vector of $x$ relative to the ordered basis $E$ is given by $$ \left[ x \right]_E = \left[ \matrix{ \sum_{j=1}^n \lambda_{1j} \alpha_j^\prime \\ \vdots \\ \sum_{j=1}^n \lambda_{nj} \alpha_j^\prime } \right]. \tag{8a} $$ Then from (1a) and (8a), we obtain $$ \left[ \matrix{\alpha_1 \\ \vdots \\ \alpha_n } \right] = \left[x \right]_E = \left[ \matrix{ \sum_{j=1}^n \lambda_{1j} \alpha_j^\prime \\ \vdots \\ \sum_{j=1}^n \lambda_{nj} \alpha_j^\prime } \right] = \left[ \matrix{ \lambda_{11} & \ldots & \lambda_{1n} \\ \vdots & \ddots & \vdots \\ \lambda_{n1} & \ldots & \lambda_{nn} } \right] \left[ \matrix{\alpha_1^\prime \\ \vdots \\ \alpha_n^\prime } \right]. $$ That is, $$ \left[ x \right]_E = M_{E^\prime E} \left[ x \right]_{E^\prime}, \tag{B}$$ where $$ M_{E^\prime E} \colon= \left[ \matrix{ \lambda_{11} & \ldots & \lambda_{1n} \\ \vdots & \ddots & \vdots \\ \lambda_{n1} & \ldots & \lambda_{nn} } \right] \tag{**}$$ is the change-of-basis matrix from the ordered basis $E^\prime$ to the ordered basis $E$ of $X$.
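Equation (B) can likewise be spot-checked numerically. In this made-up example (my own, not from the post), $E$ is the standard basis of $\mathbb{R}^2$ and $E^\prime = ((1,1),(1,-1))$; the columns of $M_{E^\prime E}$ are the $E$-coordinates of $e_1^\prime, e_2^\prime$, as in (6a) and (**).

```python
# Numerical check of (B): [x]_E = M_{E'E} [x]_{E'}, on a made-up example.

def coords_2d(v, basis):
    """Solve v = a1*basis[0] + a2*basis[1] by Cramer's rule."""
    (p, r), (q, s) = basis
    det = p * s - q * r
    return ((v[0] * s - q * v[1]) / det,
            (p * v[1] - v[0] * r) / det)

E  = ((1, 0), (0, 1))               # standard basis of R^2
Ep = ((1, 1), (1, -1))              # another basis E'

# Columns of M_{E'E} are the E-coordinates of e'_1, e'_2 -- cf. (6a), (**).
c1 = coords_2d(Ep[0], E)
c2 = coords_2d(Ep[1], E)
M = [[c1[0], c2[0]],
     [c1[1], c2[1]]]

x = (4, 2)
x_Ep = coords_2d(x, Ep)             # (3, 1): x = 3*e'_1 + 1*e'_2
x_E  = coords_2d(x, E)              # (4, 2)

product = (M[0][0] * x_Ep[0] + M[0][1] * x_Ep[1],
           M[1][0] * x_Ep[0] + M[1][1] * x_Ep[1])
print(product, x_E)                 # both (4.0, 2.0)
```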

Now as $b_1, \ldots, b_m \in Y$ and as $B^\prime = \left( b_1^\prime, \ldots, b_m^\prime \right)$ is an ordered basis for $Y$, so there exist unique $m$-tuples $$\left( \mu_{11}, \ldots, \mu_{m1} \right), \ldots, \left( \mu_{1m}, \ldots, \mu_{mm} \right) $$ of scalars such that $$ \begin{align} b_1 &= \sum_{r=1}^m \mu_{r1} b_r^\prime, \\ \cdots &= \cdots, \\ b_m &= \sum_{r=1}^m \mu_{rm} b_r^\prime. \end{align} \tag{9} $$ Thus the coordinate vectors of $b_1, \ldots, b_m$ relative to the ordered basis $B^\prime$ of $Y$ are given by $$ \begin{align} \left[ b_1 \right]_{B^\prime} &= \left[ \matrix{ \mu_{11} \\ \vdots \\ \mu_{m1} } \right], \\ \cdots &= \cdots, \\ \left[ b_m \right]_{B^\prime} &= \left[ \matrix{ \mu_{1m} \\ \vdots \\ \mu_{m m} } \right]. \end{align} \tag{9a} $$ Now as $T(x) \in Y$ and as $B^\prime$ is an ordered basis for $Y$, so there exists a unique ordered $m$-tuple $\left( \beta_1^\prime, \ldots, \beta_m^\prime \right)$ of scalars such that $$ T(x) = \sum_{r=1}^m \beta_r^\prime b_r^\prime. \tag{10} $$ So the coordinate vector of $T(x)$ relative to the ordered basis $B^\prime$ of $Y$ is given by $$ \left[ T(x) \right]_{B^\prime} = \left[ \matrix{\beta_1^\prime \\ \vdots \\ \beta_m^\prime } \right]. \tag{10a} $$ Now from (2) and (9), we see that $$ \begin{align} T(x) &= \sum_{i=1}^m \beta_i b_i \\ &= \sum_{i=1}^m \left( \beta_i \sum_{r=1}^m \mu_{ri} b_r^\prime \right) \\ &= \sum_{i=1}^m \left( \sum_{r=1}^m \beta_i \mu_{ri} b_r^\prime \right) \\ &= \sum_{i=1}^m \left( \sum_{r=1}^m \mu_{ri} \beta_i b_r^\prime \right) \\ &= \sum_{r=1}^m \left( \sum_{i=1}^m \mu_{ri} \beta_i b_r^\prime \right) \\ &= \sum_{r=1}^m \left( \sum_{i=1}^m \mu_{ri} \beta_i \right) b_r^\prime. \end{align} $$ That is, $$ T(x) = \sum_{r=1}^m \left( \sum_{i=1}^m \mu_{ri} \beta_i \right) b_r^\prime. 
\tag{11} $$ Thus the coordinate vector of $T(x)$ relative to the ordered basis $B^\prime$ of $Y$ is given by $$ \left[ T(x) \right]_{B^\prime} = \left[ \matrix{ \sum_{i=1}^m \mu_{1i} \beta_i \\ \vdots \\ \sum_{i=1}^m \mu_{mi} \beta_i } \right] = \left[ \matrix{\mu_{11} & \ldots & \mu_{1m} \\ \vdots & \ddots & \vdots \\ \mu_{m1} & \ldots & \mu_{mm} } \right] \left[ \matrix{ \beta_1 \\ \vdots \\ \beta_m } \right] . \tag{11a} $$ From (10a) and (11a) we get $$ \left[ T(x) \right]_{B^\prime} = M_{BB^\prime} \left[ T(x) \right]_{B}, \tag{C} $$ where $$ M_{BB^\prime} \colon= \left[ \matrix{\mu_{11} & \ldots & \mu_{1m} \\ \vdots & \ddots & \vdots \\ \mu_{m1} & \ldots & \mu_{mm} } \right] \tag{***} $$ is the change-of-basis matrix from the ordered basis $B$ of $Y$ to the ordered basis $B^\prime$.
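The analogous check for (C) on the $Y$ side, again with made-up data: $B$ is the standard basis of $\mathbb{R}^2$, $B^\prime = ((1,0),(1,1))$, and $y$ stands in for $T(x)$. The columns of $M_{BB^\prime}$ are the $B^\prime$-coordinates of $b_1, b_2$, as in (9a) and (***).

```python
# Spot-check of (C): [y]_{B'} = M_{BB'} [y]_B for a vector y in Y.

def coords_2d(v, basis):
    """Solve v = a1*basis[0] + a2*basis[1] by Cramer's rule."""
    (p, r), (q, s) = basis
    det = p * s - q * r
    return ((v[0] * s - q * v[1]) / det,
            (p * v[1] - v[0] * r) / det)

B  = ((1, 0), (0, 1))               # standard basis of Y = R^2
Bp = ((1, 0), (1, 1))               # another basis B'

# Columns of M_{BB'} are the B'-coordinates of b_1, b_2 -- cf. (9a), (***).
c1, c2 = coords_2d(B[0], Bp), coords_2d(B[1], Bp)
M_BBp = [[c1[0], c2[0]],
         [c1[1], c2[1]]]

y = (8, -2)                         # a stand-in for T(x)
y_B  = coords_2d(y, B)
y_Bp = coords_2d(y, Bp)             # (10, -2): y = 10*b'_1 - 2*b'_2

product = (M_BBp[0][0] * y_B[0] + M_BBp[0][1] * y_B[1],
           M_BBp[1][0] * y_B[0] + M_BBp[1][1] * y_B[1])
print(product, y_Bp)                # both (10.0, -2.0)
```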

Now just as in (A) above, we can also write $$ \left[ T(x) \right]_{B^\prime} = \left[ T \right]_{E^\prime B^\prime} \left[ x \right]_{E^\prime}, \tag{D} $$ where $ \left[ T \right]_{E^\prime B^\prime} $ denotes the matrix of $T$ relative to the ordered basis $E^\prime$ of $X$ and the ordered basis $B^\prime$ of $Y$.

Now using (B) and (C) in (D), we get $$ M_{BB^\prime} \left[ T(x) \right]_{B} = \left[ T \right]_{E^\prime B^\prime} M_{E^\prime E}^{-1} \left[ x \right]_E, $$ provided $M_{E^\prime E}^{-1}$ exists.

Now intuitively I know that this inverse exists, because the columns of $M_{E^\prime E}$ are the coordinate vectors of the basis vectors $e_1^\prime, \ldots, e_n^\prime$ and are therefore linearly independent. Is it really so? If so, then how does one prove this fact rigorously?

Now using (A) in the last equation, we obtain $$ M_{BB^\prime} \left[ T \right]_{EB} \left[ x \right]_E = \left[ T \right]_{E^\prime B^\prime} M_{E^\prime E}^{-1} \left[ x \right]_E,$$ which implies that $$ \left[ T \right]_{EB} \left[ x \right]_E = M_{BB^\prime}^{-1} \left[ T \right]_{E^\prime B^\prime} M_{E^\prime E}^{-1} \left[ x \right]_E, $$ provided of course that $M_{BB^\prime}^{-1}$ exists, which I believe it does, for the same reason as $M_{E^\prime E}^{-1}$.

Now from the last equation we obtain $$ \left( \left[ T \right]_{EB} - M_{BB^\prime}^{-1} \left[ T \right]_{E^\prime B^\prime} M_{E^\prime E}^{-1} \right) \left[ x \right]_E = \mathbf{0}_{F^m} $$ for all points $x \in X$, where $\mathbf{0}_{F^m}$ denotes the zero vector in $F^m$.

In the last equation, we can successively put $e_1$, $\ldots$, $e_n$ in place of $x$; since $\left[ e_j \right]_E$ is the $j$-th standard unit column vector, this forces each column of the matrix in parentheses to vanish, and thus we obtain $$ \left[ T \right]_{EB} = M_{BB^\prime}^{-1} \left[ T \right]_{E^\prime B^\prime} M_{E^\prime E}^{-1}, $$ which implies that $$ \left[ T \right]_{E^\prime B^\prime} = M_{BB^\prime} \left[ T \right]_{EB} M_{E^\prime E}. \tag{E} $$
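The full relation can also be tested numerically on a made-up example of my own: $T(x_1,x_2) = (x_1+x_2,\, x_1-x_2)$ on $\mathbb{R}^2$, with $E, B$ standard, $E^\prime = ((1,1),(1,-1))$, and $B^\prime = ((1,0),(1,1))$. The sketch computes $[T]_{E^\prime B^\prime}$ directly from its columns and compares it with the triple product $M_{BB^\prime}\,[T]_{EB}\,M_{E^\prime E}$.

```python
# Numerical spot-check of (E): [T]_{E'B'} = M_{BB'} [T]_{EB} M_{E'E}.

def coords_2d(v, basis):
    """Solve v = a1*basis[0] + a2*basis[1] by Cramer's rule."""
    (p, r), (q, s) = basis
    det = p * s - q * r
    return ((v[0] * s - q * v[1]) / det,
            (p * v[1] - v[0] * r) / det)

def cols_to_matrix(c1, c2):
    return [[c1[0], c2[0]], [c1[1], c2[1]]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def T(v):
    return (v[0] + v[1], v[0] - v[1])

E  = ((1, 0), (0, 1));  Ep = ((1, 1), (1, -1))
B  = ((1, 0), (0, 1));  Bp = ((1, 0), (1, 1))

# Each operator matrix has the image coordinates of the basis as columns.
T_EB   = cols_to_matrix(coords_2d(T(E[0]), B),   coords_2d(T(E[1]), B))
T_EpBp = cols_to_matrix(coords_2d(T(Ep[0]), Bp), coords_2d(T(Ep[1]), Bp))
M_EpE  = cols_to_matrix(coords_2d(Ep[0], E),     coords_2d(Ep[1], E))
M_BBp  = cols_to_matrix(coords_2d(B[0], Bp),     coords_2d(B[1], Bp))

lhs = T_EpBp
rhs = matmul(M_BBp, matmul(T_EB, M_EpE))
print(lhs)   # [[2.0, -2.0], [0.0, 2.0]]
print(rhs)   # the same matrix
```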

Is (E) the required relation?

Is my calculation correct so far? If so, have I been able to conform to the standard notation? Have I managed to remain consistent enough in the use of symbols? If so, then how to proceed from here?

If not, then where have I erred? Where have I failed to maintain consistency?


There are 2 answers below.

On BEST ANSWER

The original post is fine until you arrive at equation (D). From there I prefer another way, and before I show it, I will repeat the important equations:

\begin{align} [T(x)]_B&=[T]_{EB}\cdot [x]_E\tag{A}\\ [x]_E&=M_{E'E}\cdot [x]_{E'}\tag{B}\\ [T(x)]_{B'}&=M_{BB'}\cdot [T(x)]_B\tag{C}\\ [T(x)]_{B'}&=[T]_{E'B'}\cdot [x]_{E'}\tag{D} \end{align}

You asked what the relation between $[T]_{EB}$ and $[T]_{E'B'}$ is. Here is my suggestion:

Start with equation $(\mathrm{C})$ and insert equation $(\mathrm{A})$ first and equation $(\mathrm{B})$ second to get \begin{align} [T(x)]_{B'}&=M_{BB'}\cdot [T(x)]_B\\ &\!\stackrel{(\mathrm{A})}{=}M_{BB'}\cdot [T]_{EB}\cdot [x]_E\\ &\!\overset{(\mathrm{B})}{=}M_{BB'}\cdot [T]_{EB}\cdot M_{E'E}\cdot [x]_{E'} \end{align} The result is $$[T(x)]_{B'}=M_{BB'}\cdot [T]_{EB}\cdot M_{E'E}\cdot [x]_{E'}$$ Now compare this equation with equation $(\mathrm{D})$ and you will get the final result $$[T]_{E'B'}=M_{BB'}\cdot [T]_{EB}\cdot M_{E'E}$$ which is the same as equation $(\mathrm{E})$ in your calculation. But in my calculation you don't have to prove that $M_{E'E}^{-1}$ exists. I would cautiously say that this is the better way. What is your opinion?

Another answer:

Suppose you have two bases $E$ and $E'$. Transforming from $E$ to $E'$ and then back to $E$ is a change of basis represented by the identity matrix $I$. So if $M_{E,E'}$ is the change-of-basis matrix from $E$ to $E'$ and $T_{E',E}$ is the change-of-basis matrix from $E'$ to $E$, then $I = T_{E',E} M_{E,E'}$ (and likewise $I = M_{E,E'} T_{E',E}$), so $M$ and $T$ are invertible and $T = M^{-1}$.
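This claim can be illustrated on a concrete (made-up) pair of bases of $\mathbb{R}^2$: composing the change of basis $E \to E'$ with $E' \to E$ gives the identity matrix, so the two change-of-basis matrices are mutually inverse.

```python
# Check that the change-of-basis matrices E -> E' and E' -> E are
# mutually inverse, on an example pair of bases of R^2.

def coords_2d(v, basis):
    """Solve v = a1*basis[0] + a2*basis[1] by Cramer's rule."""
    (p, r), (q, s) = basis
    det = p * s - q * r
    return ((v[0] * s - q * v[1]) / det,
            (p * v[1] - v[0] * r) / det)

def change_of_basis(old, new):
    """Columns are the new-basis coordinates of the old basis vectors."""
    c1, c2 = coords_2d(old[0], new), coords_2d(old[1], new)
    return [[c1[0], c2[0]], [c1[1], c2[1]]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

E  = ((1, 0), (0, 1))
Ep = ((1, 1), (1, -1))

M = change_of_basis(E, Ep)    # E  -> E'
T = change_of_basis(Ep, E)    # E' -> E

print(matmul(T, M))           # [[1.0, 0.0], [0.0, 1.0]]
print(matmul(M, T))           # identity again: inverse on both sides
```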