So, I am taking a real analysis course, and we are currently on Chapter 9 of Baby Rudin, "Functions of Several Variables."
I do not have a strong linear algebra background, so I am having some difficulty with this chapter. I want a more intuitive understanding of the notation used in Chapter 9 regarding vectors and matrices.
If we consider $\mathbb{R}^n$ with standard basis $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$, then we can write a vector $\mathbf{x}\in\mathbb{R}^n$ using a set of scalars $\{c_1, \dots, c_n\}$, so that
$\mathbf{x} = \sum_{i=1}^n c_i\mathbf{e}_i$. Okay, sure. Why this notation, though? I always thought of a vector as just a list of components, i.e. $\mathbf{x} = (c_1, \dots, c_n)$. Why are they being summed up here? Moreover, when Rudin goes on to define a matrix, say $A$, he writes
$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} & \dots & a_{1n} \\ a_{21} & a_{22} & a_{23} & \dots & a_{2n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & a_{m3} & \dots & a_{mn} \end{bmatrix}$$
However, isn't a matrix just a collection of vectors? Is there a "closed" summation notation for a matrix? Is there a proper, more succinct way to refer to a matrix? Why does he write vectors as summations and matrices as matrices?
And if $A: \mathbb{R}^n \rightarrow \mathbb{R}^m$, and $\{\mathbf{y}_1, \dots, \mathbf{y}_m\}$ is the standard basis for $\mathbb{R}^m$, then why does he write
$A\mathbf{x} = \sum_{i=1}^m \Big( \sum_{j=1}^n a_{ij}c_j \Big) \mathbf{y}_i$? Do we really just have a set of scalars in $\mathbb{R}^m$?
Vectors should not be thought of primarily as "lists". Vectors are elements of vector spaces, i.e. sets of things that behave in a certain way – intuitively, like the arrow-vectors from high school: they can be added together, and they can be scaled by constants. You can find a precise definition of a vector space in any linear algebra textbook.
So, once you have picked a basis $(v_1, \dots, v_n)$ of an $n$-dimensional real vector space $V$, you can show that every $v \in V$ can be expanded as a linear combination of basis vectors in a unique way: $$v = a_1v_1 + \cdots + a_nv_n $$ that is, the coefficients $a_1,\dots,a_n$ are uniquely determined by $v$. This means that, once a basis $(v_i)$ has been chosen, there is a bijection $\Phi$ between $V$ and $\mathbb R^n$, namely the one sending $v$ to $(a_1,\dots,a_n)$. This already answers your first question: the tuple $(c_1,\dots,c_n)$ and the sum $\sum_{i=1}^n c_i\mathbf e_i$ are two ways of writing the same vector; the summation just makes explicit how the list of components reconstructs the vector from the basis.
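This bijection $\Phi$ can be checked numerically. The sketch below (using `numpy`, with an illustrative basis of $\mathbb R^2$ that I have chosen, not one from the text) recovers the unique coefficients of a vector in a given basis by solving a linear system:

```python
import numpy as np

# A hypothetical basis of R^2 -- any two linearly independent vectors work.
v1 = np.array([1.0, 1.0])
v2 = np.array([1.0, -1.0])
B = np.column_stack([v1, v2])  # basis vectors as the columns of a matrix

v = np.array([3.0, 1.0])

# The coordinates (a1, a2) are the unique solution of B @ a = v.
a = np.linalg.solve(B, v)

# Check the expansion v = a1*v1 + a2*v2: this is Phi^{-1} applied to a.
assert np.allclose(a[0] * v1 + a[1] * v2, v)
```

Here `a` is exactly the tuple $\Phi(v) = (a_1, a_2)$; the summation notation and the tuple carry the same information.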
Now let me define what a matrix is. Technically, you could say that a real matrix $A$ of order $n\times m$ is a map from the subset $\{(i,j) \in \mathbb N^2\ |\ 1 \leq i \leq n, 1 \leq j \leq m\}$ of $\mathbb N^2$ to the real numbers. Intuitively, though, we do not think of a matrix as a map: we prefer to describe it directly by the values it attains, i.e. as an array of numbers
$$A = \begin{bmatrix} a_{11} & \cdots & a_{1m} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nm} \end{bmatrix} $$
The set of all real $n \times m$ matrices is denoted $M_{n \times m}(\mathbb R)$ and (surprise!) it can be shown that it is a vector space with the natural definitions of matrix addition and scalar multiplication. Actually, it turns out that you can even define a matrix product ($\neq$ scalar multiplication!) between matrices of compatible dimensions, $\cdot : M_{n \times m}(\mathbb R) \times M_{m \times p}(\mathbb R) \to M_{n\times p}(\mathbb R)$, such that $A\cdot B = C$ with $c_{ij} = \sum_{k=1}^m a_{ik}b_{kj}$.
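The entry formula $c_{ij} = \sum_{k=1}^m a_{ik}b_{kj}$ can be implemented directly. A minimal sketch (using `numpy` arrays; the function name `matmul` is mine) that computes the product entrywise and compares it against numpy's built-in product:

```python
import numpy as np

def matmul(A, B):
    """Product C = A.B via c_ij = sum_k a_ik * b_kj, for A (n x m) and B (m x p)."""
    n, m = A.shape
    m2, p = B.shape
    assert m == m2, "inner dimensions must agree"
    C = np.zeros((n, p))
    for i in range(n):
        for j in range(p):
            C[i, j] = sum(A[i, k] * B[k, j] for k in range(m))
    return C

A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # 3 x 2
B = np.array([[1.0, 0.0, 2.0], [0.0, 1.0, 3.0]])    # 2 x 3
assert np.allclose(matmul(A, B), A @ B)             # agrees with numpy's A @ B
```

Note how the result is $3 \times 3$: only the inner dimensions ($m$) have to match, exactly as in the signature $M_{n\times m} \times M_{m\times p} \to M_{n\times p}$.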
Back to vectors. Suppose $V$ and $W$ are finite-dimensional real vector spaces of dimensions $n$ and $m$ respectively. A linear map between $V$ and $W$ is a function $f$ such that, for all $v,v' \in V$ and $a,a' \in \mathbb R$, $f(av + a'v') = af(v) + a'f(v')$. The space of all linear functions from $V$ to $W$ is denoted $\mathrm{Hom}(V;W)$. Now let $(v_j)$ be a basis of $V$ and $(w_i)$ a basis of $W$, and let $f \in \mathrm{Hom}(V;W)$. Since every $f(v_j)$ is a member of $W$, we may expand it through the basis $(w_i)$ as $$f(v_j) = \sum_{i=1}^m a_{ij} w_i $$ So, in some sense, once bases have been chosen in the domain and codomain of $f$, there is a correspondence between $f$ itself and an $m \times n$ matrix $(a_{ij})$. It turns out that this association has many nice properties, prominently that if $f,g$ are composable linear maps and $A,B$ their associated matrices, the matrix associated with $f \circ g$ is $A \cdot B$.
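That last property – composition of maps corresponds to the matrix product – is easy to verify numerically. A sketch with `numpy` (the two linear maps `f` and `g` are illustrative examples of mine, and `matrix_of` builds the matrix column by column from the images of the canonical basis vectors, i.e. column $j$ holds the coordinates of the image of $e_j$):

```python
import numpy as np

# Hypothetical linear maps g : R^2 -> R^3 and f : R^3 -> R^2.
def g(v):
    x, y = v
    return np.array([x, x + y, y])

def f(v):
    x, y, z = v
    return np.array([2*x + z, y - z])

def matrix_of(h, n, m):
    """m x n matrix of a linear map h : R^n -> R^m in the canonical bases."""
    E = np.eye(n)
    return np.column_stack([h(E[:, j]) for j in range(n)])

A = matrix_of(f, 3, 2)                       # 2 x 3
B = matrix_of(g, 2, 3)                       # 3 x 2
AB = matrix_of(lambda v: f(g(v)), 2, 2)      # matrix of the composition f∘g
assert np.allclose(A @ B, AB)                # composition <-> matrix product
```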
Indeed, if $v \in V$ and $w = f(v) \in W$, we may apply the bijection $\Phi$ described above to associate $v$ to the $n$-tuple $(b_1,\dots,b_n) \in \mathbb R^n$, and $w$ to the $m$-tuple $(c_1,\dots,c_m) \in \mathbb R^m$, and then "tip" the tuples onto their sides (make them "column vectors", i.e. matrices with one column). Having done this, it is easy to resort to the definition of the matrix product and see that, if $(a_{ij})$ is the $m \times n$ matrix associated with $f$, we have $$\begin{bmatrix} c_1 \\ \vdots \\ c_m \end{bmatrix} = \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix} $$
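In other words, applying $f$ to a vector is the same as multiplying its coordinate column by the matrix of $f$ – which also answers your question about $A\mathbf x$: the inner sum $\sum_j a_{ij}c_j$ produces the $i$-th scalar coordinate $c_i'$ of the image, and the outer sum reassembles those scalars against the basis of $\mathbb R^m$. A short `numpy` check (again with an illustrative map of my own choosing):

```python
import numpy as np

# A hypothetical linear map f : R^3 -> R^2.
def f(v):
    x, y, z = v
    return np.array([2*x + y, y - 3*z])

# In the canonical bases, column j of A holds the coordinates of f(e_j).
E = np.eye(3)
A = np.column_stack([f(E[:, j]) for j in range(3)])  # a 2 x 3 matrix

b = np.array([1.0, 2.0, 3.0])
assert np.allclose(A @ b, f(b))  # applying f = multiplying by A
```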
You may apply this discussion to the case $V = \mathbb R^n$ and $W = \mathbb R^m$, which is the one relevant to analysis. Just notice that in these spaces there is an obvious choice of basis (the so-called standard or canonical basis), namely the tuples $\mathbf e_i$ that contain all zeros except a $1$ in the $i$-th slot. In most cases, at least in analysis, matrices are associated with linear functions under the understanding that the canonical bases have been chosen in both the domain and the codomain.