PMA Rudin: understanding the definition of matrices, definition 9.9


I have a background in linear algebra but am still confused by the definition.

Suppose $\{\mathbf{x_1}, \cdots, \mathbf{x_n}\}$ and $\{\mathbf{y_1}, \cdots, \mathbf{y_m}\}$ are bases of vector spaces $X$ and $Y$, respectively. Then every $A \in L(X, Y)$ determines a set of numbers $a_{ij}$ such that

(3) $A \mathbf{x_j}=\sum_{i=1}^m a_{ij}\mathbf{y_i}$ $(1\leq j \leq n)$.

It is convenient to visualize these numbers in a rectangular array of $m$ rows and $n$ columns, called an $m$ by $n$ matrix: $[A]=\begin{bmatrix}a_{11} & a_{12}& \cdots & a_{1n} \\ a_{21} & a_{22}& \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2}& \cdots & a_{mn} \end{bmatrix}$

I think I have a counterexample to (3): take $\mathbf{A}=\mathbf{I}$, $\mathbf{x_i}=\mathbf{e_i}$, $\mathbf{y_i}=-\mathbf{e_i}$.

My question is how shall I understand the definition?


BEST ANSWER

This is no different from the standard method to convert between a linear map and a matrix, given fixed bases of the domain and codomain. In your case (take $m = n = 3$ for example), you have \begin{align*} \mathbf{A}\mathbf{x_1} &= \mathbf{I}\mathbf{e_1} = \mathbf{e_1} \\ &= (-1)(-\mathbf{e_1}) + 0(-\mathbf{e_2}) + 0(-\mathbf{e_3}) \\ &= (-1)\mathbf{y_1} + 0\mathbf{y_2} + 0\mathbf{y_3}, \end{align*} which makes $a_{11} = -1$, $a_{21} = 0$, and $a_{31} = 0$, from the given definition. This defines the first column to be $$\begin{pmatrix}-1 \\ 0 \\ 0\end{pmatrix}.$$ Similar computation reveals that $$[\mathbf{A}] = \begin{pmatrix}-1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix},$$ i.e. the negative identity matrix, as Berci predicted in the comments.

This procedure always works: $\mathbf{A}\mathbf{x_j}$ is an element of $Y$, and thus can always be expressed as a unique linear combination of the basis $\mathbf{y_1}, \ldots, \mathbf{y_m}$. So the $a_{ij}$s always exist and are unique (for a fixed linear transformation and fixed bases).
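This expansion can be carried out numerically. Here is a minimal sketch in NumPy (the function name `matrix_of` and the variable names are my own): column $j$ of $[A]$ is found by solving for the coordinates of $A\mathbf{x_j}$ in the basis $\{\mathbf{y_i}\}$.

```python
import numpy as np

def matrix_of(A, basis_X, basis_Y):
    """Return [A]: column j holds the coordinates of A @ x_j in the basis {y_i}."""
    # Stack the y_i as columns; solving Y c = A x_j gives the coordinate vector c.
    Y = np.column_stack(basis_Y)
    cols = [np.linalg.solve(Y, A @ x) for x in basis_X]
    return np.column_stack(cols)

# The counterexample from the question: A = I, x_i = e_i, y_i = -e_i.
I3 = np.eye(3)
basis_X = [I3[:, j] for j in range(3)]
basis_Y = [-I3[:, j] for j in range(3)]
print(matrix_of(I3, basis_X, basis_Y))   # the negative identity matrix, not I
```

This reproduces the computation above: the basis on the codomain side flips every sign, so $[\mathbf{I}] = -\mathbf{I}$ relative to these bases.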

ANSWER

Most of the time, for a linear transformation $\alpha:X\to X$ (i.e. when $X=Y$), we use the same basis in the domain as in the codomain.
With that constraint, the matrix of the identity function $X\to X$ is always the identity matrix.

However, if we take two different bases $x_1,\dots, x_n$ and $y_1,\dots, y_n$, then the construction will not produce the identity matrix but a change-of-basis matrix, whose columns are the coordinates of the $x_i$ relative to the other basis $y_1,\dots, y_n$.
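As a small illustration (the particular bases are my own choice, not from the thread): take $x_i = e_i$ and $y_1 = e_1 + e_2$, $y_2 = e_2$ in $\mathbb{R}^2$. The matrix of the identity map is then the change-of-basis matrix, computed by writing each $x_j$ in the basis $\{y_i\}$.

```python
import numpy as np

# Identity map on R^2, with different bases on each side (hypothetical example).
Y = np.column_stack([[1, 1], [0, 1]])  # y_1 = e_1 + e_2, y_2 = e_2, as columns
X = np.eye(2)                          # x_i = e_i
# Column j of [A] = coordinates of x_j in the basis {y_i}: solve Y c = x_j.
P = np.linalg.solve(Y, X)
print(P)   # [[1, 0], [-1, 1]] -- a change-of-basis matrix, not the identity
```

Indeed $e_1 = y_1 - y_2$ and $e_2 = y_2$, which is exactly what the columns of `P` record.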

ANSWER

I had a background in linear algebra before studying mathematical analysis and found this definition of matrices quite confusing. Thanks to @Robert Shore, I noticed the difference between the linear transformation $A$ and the matrix $[A]$ (be careful about this in Rudin's notation).

I think the other answers are not clear enough for those who think of a matrix as a system of linear equations or a set of vectors.


Then every $A\in L(X,Y)$ determines a set of numbers $a_{ij}$ such that

The trick here is that the pairing of a linear transformation $A$ with its matrix $[A]$ is not arbitrary. Given the bases $\{\mathbf{x_1}, \mathbf{x_2}, \cdots, \mathbf{x_n}\}$ and $\{\mathbf{y_1}, \mathbf{y_2}, \cdots, \mathbf{y_m}\}$, once their orders are fixed, the correspondence between $A$ and $[A]$ is thereby fixed. See the answer by @Theo Bendit for my example. Even if the orders of the bases are changed, the correspondence is still constrained. In my example, you can set $\mathbf{y_i}=\mathbf{e_{3-i}}$ and try to find $[A]$.
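The suggested variation can be checked numerically. A sketch, reading $\mathbf{y_i}=\mathbf{e_{3-i}}$ with $n = 2$ so that the index $3-i$ stays in range (my assumption; the thread does not fix $n$ here):

```python
import numpy as np

# y_1 = e_2, y_2 = e_1 (columns of Y), A = I, x_j = e_j.
Y = np.column_stack([[0, 1], [1, 0]])
A = np.eye(2)
# Column j of [A] = coordinates of A e_j in the reordered basis {y_1, y_2}.
M = np.linalg.solve(Y, A)
print(M)   # the permutation matrix [[0, 1], [1, 0]], not the identity
```

Reordering the codomain basis permutes the rows of $[A]$, so the identity map is represented by a permutation matrix.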

The "every $A$" in this context means "every reasonable $A$".

I hope this helps those coming from other backgrounds.

ANSWER

The definition is simply saying that for fixed vector spaces $X$ and $Y$ (over the same field $F$) and fixed bases $\{ \mathbf{x_i} \}$ and $\{ \mathbf{y_j} \}$ there is a one-to-one correspondence between the linear maps $L(X,Y)$ from $X$ to $Y$ and the $ m \times n$ matrices over $F$.

Given a linear map $A$, its matrix representation $[A]$ is simply the matrix whose column $i$ is the coordinate vector of the image of the basis vector $\mathbf{x_i}$ relative to the basis $\{ \mathbf{y_j} \}$.