So, I am taking a real analysis course, and we are currently on Chapter 9 of Baby Rudin, "Functions of Several Variables."
I do not have a strong linear algebra background, so I am having some difficulty with this chapter. I want a more intuitive understanding of the notation used in Chapter 9 regarding vectors and matrices.
If we consider $\mathbb{R}^n$ with standard basis $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$, then we can write a vector $\mathbf{x}\in\mathbb{R}^n$ using a set of scalars $\{c_1, \dots, c_n\}$, so that
$\mathbf{x} = \sum_{i=1}^n c_i\mathbf{e}_i$. Okay, sure. Why this notation, though? I always thought of a vector as just a list of components, i.e. $\mathbf{x} = (c_1, \dots, c_n)$. Why are they being summed up here? Moreover, when Rudin goes on to define a matrix, say $A$, he writes
$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} & \dots & a_{1n} \\ a_{21} & a_{22} & a_{23} & \dots & a_{2n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & a_{m3} & \dots & a_{mn} \end{bmatrix}$$
However, isn't a matrix just a collection of vectors? Is there a "closed" summation notation for a matrix? Is there a proper, more succinct way to refer to a matrix? Why does he write vectors as summations and matrices as matrices?
And if $A: \mathbb{R}^n \rightarrow \mathbb{R}^m$, and $\{\mathbf{y}_1, \dots, \mathbf{y}_m\}$ is the standard basis for $\mathbb{R}^m$, then why does he write
$A\mathbf{x} = \sum_{i=1}^m \Big( \sum_{j=1}^n a_{ij}c_j \Big) \mathbf{y}_i$? Do we really just have a set of scalars in $\mathbb{R}^m$?
Vectors should not be thought of primarily as "lists". Vectors are elements of vector spaces, i.e. sets of things that behave in a certain way – intuitively, like the arrow-vectors from high school: they can be added together, and they can be scaled by constants. You can find a precise definition of a vector space in any linear algebra textbook.
So, once you have picked a basis $(v_1, \dots, v_n)$ of an $n$-dimensional real vector space $V$, you can show that every $v \in V$ can be expanded as a linear combination of basis vectors in a unique way: $$v = a_1v_1 + \cdots + a_nv_n $$ that is, the coefficients $a_1,\dots,a_n$ are uniquely determined by $v$. This means that, once a basis $(v_i)$ has been chosen, there is a bijection $\Phi$ between $V$ and $\mathbb R^n$, namely the one sending $v$ to $(a_1,\dots,a_n)$. This already answers your first question: the tuple $(c_1,\dots,c_n)$ and the sum $\sum_{i=1}^n c_i\mathbf e_i$ are two ways of writing the same vector; the summation just makes explicit how the list of components reconstructs the vector from the basis.
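This bijection $\Phi$ can be checked numerically. The sketch below (using `numpy`, with an illustrative basis of $\mathbb R^2$ that I have chosen, not one from the text) recovers the unique coefficients of a vector in a given basis by solving a linear system:

```python
import numpy as np

# A hypothetical basis of R^2 -- any two linearly independent vectors work.
v1 = np.array([1.0, 1.0])
v2 = np.array([1.0, -1.0])
B = np.column_stack([v1, v2])  # basis vectors as the columns of a matrix

v = np.array([3.0, 1.0])

# The coordinates (a1, a2) are the unique solution of B @ a = v.
a = np.linalg.solve(B, v)

# Check the expansion v = a1*v1 + a2*v2: this is Phi^{-1} applied to a.
assert np.allclose(a[0] * v1 + a[1] * v2, v)
```

Here `a` is exactly the tuple $\Phi(v) = (a_1, a_2)$; the summation notation and the tuple carry the same information.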
Now let me define what a matrix is. Technically, you could say that a real matrix $A$ of order $n\times m$ is a map from the subset $\{(i,j) \in \mathbb N^2\ |\ 1 \leq i \leq n, 1 \leq j \leq m\}$ of $\mathbb N^2$ to the real numbers. Intuitively, though, we do not think of a matrix as a map: we prefer to describe it directly by the values it attains, i.e. as an array of numbers
$$A = \begin{bmatrix} a_{11} & \cdots & a_{1m} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nm} \end{bmatrix} $$
The set of all real $n \times m$ matrices is denoted $M_{n \times m}(\mathbb R)$ and (surprise!) it can be shown that it is a vector space with the natural definitions of matrix addition and scalar multiplication. Actually, it turns out that you can even define a matrix product ($\neq$ scalar multiplication!) between matrices of compatible dimensions, $\cdot : M_{n \times m}(\mathbb R) \times M_{m \times p}(\mathbb R) \to M_{n\times p}(\mathbb R)$, such that $A\cdot B = C$ with $c_{ij} = \sum_{k=1}^m a_{ik}b_{kj}$.
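The entry formula $c_{ij} = \sum_{k=1}^m a_{ik}b_{kj}$ can be implemented directly. A minimal sketch (using `numpy` arrays; the function name `matmul` is mine) that computes the product entrywise and compares it against numpy's built-in product:

```python
import numpy as np

def matmul(A, B):
    """Product C = A.B via c_ij = sum_k a_ik * b_kj, for A (n x m) and B (m x p)."""
    n, m = A.shape
    m2, p = B.shape
    assert m == m2, "inner dimensions must agree"
    C = np.zeros((n, p))
    for i in range(n):
        for j in range(p):
            C[i, j] = sum(A[i, k] * B[k, j] for k in range(m))
    return C

A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # 3 x 2
B = np.array([[1.0, 0.0, 2.0], [0.0, 1.0, 3.0]])    # 2 x 3
assert np.allclose(matmul(A, B), A @ B)             # agrees with numpy's A @ B
```

Note how the result is $3 \times 3$: only the inner dimensions ($m$) have to match, exactly as in the signature $M_{n\times m} \times M_{m\times p} \to M_{n\times p}$.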
Back to vectors. Suppose $V$ and $W$ are finite-dimensional real vector spaces of dimensions $n$ and $m$ respectively. A linear map between $V$ and $W$ is a function $f$ such that, for all $v,v' \in V$ and $a,a' \in \mathbb R$, $f(av + a'v') = af(v) + a'f(v')$. The space of all linear functions from $V$ to $W$ is denoted $\mathrm{Hom}(V;W)$. Now let $(v_j)$ be a basis of $V$ and $(w_i)$ a basis of $W$, and let $f \in \mathrm{Hom}(V;W)$. Since every $f(v_j)$ is a member of $W$, we may expand it through the basis $(w_i)$ as $$f(v_j) = \sum_{i=1}^m a_{ij} w_i $$ So, in some sense, once bases have been chosen in the domain and codomain of $f$, there is a correspondence between $f$ itself and an $m \times n$ matrix $(a_{ij})$. It turns out that this association has many nice properties, prominently that if $f,g$ are composable linear maps and $A,B$ their associated matrices, the matrix associated with $f \circ g$ is $A \cdot B$.
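That last property – composition of maps corresponds to the matrix product – is easy to verify numerically. A sketch with `numpy` (the two linear maps `f` and `g` are illustrative examples of mine, and `matrix_of` builds the matrix column by column from the images of the canonical basis vectors, i.e. column $j$ holds the coordinates of the image of $e_j$):

```python
import numpy as np

# Hypothetical linear maps g : R^2 -> R^3 and f : R^3 -> R^2.
def g(v):
    x, y = v
    return np.array([x, x + y, y])

def f(v):
    x, y, z = v
    return np.array([2*x + z, y - z])

def matrix_of(h, n, m):
    """m x n matrix of a linear map h : R^n -> R^m in the canonical bases."""
    E = np.eye(n)
    return np.column_stack([h(E[:, j]) for j in range(n)])

A = matrix_of(f, 3, 2)                       # 2 x 3
B = matrix_of(g, 2, 3)                       # 3 x 2
AB = matrix_of(lambda v: f(g(v)), 2, 2)      # matrix of the composition f∘g
assert np.allclose(A @ B, AB)                # composition <-> matrix product
```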
Indeed, if $v \in V$ and $w = f(v) \in W$, we may apply the bijection $\Phi$ described above to associate $v$ to the $n$-tuple $(b_1,\dots,b_n) \in \mathbb R^n$, and $w$ to the $m$-tuple $(c_1,\dots,c_m) \in \mathbb R^m$, and then "tip" the tuples onto their sides (make them "column vectors", i.e. matrices with one column). Having done this, it is easy to resort to the definition of the matrix product and see that, if $(a_{ij})$ is the $m \times n$ matrix associated with $f$, we have $$\begin{bmatrix} c_1 \\ \vdots \\ c_m \end{bmatrix} = \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix} $$
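In other words, applying $f$ to a vector is the same as multiplying its coordinate column by the matrix of $f$ – which also answers your question about $A\mathbf x$: the inner sum $\sum_j a_{ij}c_j$ produces the $i$-th scalar coordinate $c_i'$ of the image, and the outer sum reassembles those scalars against the basis of $\mathbb R^m$. A short `numpy` check (again with an illustrative map of my own choosing):

```python
import numpy as np

# A hypothetical linear map f : R^3 -> R^2.
def f(v):
    x, y, z = v
    return np.array([2*x + y, y - 3*z])

# In the canonical bases, column j of A holds the coordinates of f(e_j).
E = np.eye(3)
A = np.column_stack([f(E[:, j]) for j in range(3)])  # a 2 x 3 matrix

b = np.array([1.0, 2.0, 3.0])
assert np.allclose(A @ b, f(b))  # applying f = multiplying by A
```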
You may apply this discussion to the case $V = \mathbb R^n$ and $W = \mathbb R^m$, which is the one relevant to analysis. Just notice that in these spaces there is an obvious choice of basis (the so-called standard or canonical basis), namely the tuples $\mathbf e_i$ that contain all zeros except a $1$ in the $i$-th slot. In most cases, at least in analysis, matrices are associated with linear functions under the understanding that the canonical bases have been chosen in both the domain and the codomain.