How can I fill in some gaps in my understanding of the following proof (uniqueness of RREF)?


I've been teaching myself linear algebra, and have come to the point where I'm studying the theorem stating that the reduced row echelon form (RREF) of a matrix is always unique.

I found this proof, but I'm not sure if my understanding is entirely complete.

I'll start with rank, taken from the proof:

" First, having a matrix with two reduced row echelon forms means there are two reduced row echelon form matrices $E$ and $E'$ and an inversible matrix $P$ such that $E'=PE$. The goal is then to prove $E=E'$ (of size $m\times n$).

Now, first remark they must have the same rank $r$, which we can assume at least one or the result is obvious, so we have $r\geqslant1$."

My understanding is that a matrix's rank is essentially the maximal number of its rows (or columns) that are linearly independent.

With a rank of 1, we have a matrix with exactly one linearly independent (i.e. nonzero) row; we'll call it $R_1$. Any other row $R_i$, for $1 \le i \le m$, is then a scalar multiple $\alpha R_1$ for some constant $\alpha$ (possibly zero). Here $E$ and $E^{\prime}$ are $m \times n$ matrices, and $P$, being invertible, is $m \times m$.
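To convince myself, here is a small numpy check (my own illustration, not part of the proof) of a rank-1 matrix whose rows are all scalar multiples of one row:

```python
import numpy as np

# Every row is a scalar multiple of R1, so the rank is 1.
R1 = np.array([1.0, 2.0, 3.0])
A = np.vstack([R1, 2 * R1, 0 * R1, -5 * R1])

print(np.linalg.matrix_rank(A))  # 1
```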

Moving on,

" Denote by $p_1,\dots,p_r$ the indices of the pivot columns in $E$, and likewise $p_1',\dots,p_r'$ for $E'$, and complete by $p_{r+1}=p_{r+1}'=n$ for convenience. "

My assumption is that, since $r$ denotes the rank of our matrices, the pivot-column indices run from $p_1$ up to $p_r$. Furthermore, the statement $p_{r+1} = p^{\prime}_{r+1} = n$ appears to imply that the very last column of $E$ is actually linearly independent; otherwise, we would have $n = p_{r} = p^{\prime}_{r}$.
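Here is how I picture those indices, as a short sympy sketch (my own; sympy's `rref` returns 0-indexed pivot columns, so I shift by one to match the proof's notation and append the sentinel):

```python
import sympy as sp

E = sp.Matrix([
    [0, 1, 4, 0],
    [0, 0, 0, 1],
])  # already in RREF

_, pivots = E.rref()          # 0-indexed pivot columns: (1, 3)
p = [j + 1 for j in pivots]   # 1-indexed p_1, ..., p_r as in the proof
r, n = len(p), E.cols
p.append(n)                   # the sentinel p_{r+1} = n "for convenience"

print(p)  # [2, 4, 4]
```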

The author then continues with the following method:

" Define for $j$ such that $0\leqslant j\leqslant r$ the assertion $H_j$ by the following three conditions:

1. $p_{j+1}=p_{j+1}'$

2. the columns 1 to $p_{j+1}-1$ are equal in $E$ and $E'$

3. the first $j$ columns of $P$ are like those of the identity matrix

It is clear that $H_r$ gives $E=E'$. "
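To pin the three conditions down for myself, here is a literal transcription into code (my own sketch, using 1-indexed pivot lists with the sentinel appended as above, so `p[j]` is the mathematical $p_{j+1}$):

```python
import sympy as sp

def H(j, E, Ep, P, p, pp):
    """Check the three conditions of H_j for 0 <= j <= r."""
    c1 = p[j] == pp[j]                         # 1. p_{j+1} = p'_{j+1}
    c2 = E[:, :p[j] - 1] == Ep[:, :p[j] - 1]   # 2. columns 1..p_{j+1}-1 agree
    c3 = P[:, :j] == sp.eye(P.rows)[:, :j]     # 3. first j columns of P like I
    return c1 and c2 and c3
```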

Defining $H_r$: according to the assertion, the first condition evaluates to

  1. $p_{r+1} = p^{\prime}_{r+1} = n$

while the second essentially states that $E = E^{\prime}$ (the columns $1$ to $p_{r+1} - 1$ agree).

The third, on the other hand, sets the first $r$ columns of $P$ to those of the $m \times m$ identity matrix (note that $P$, being invertible, is square).

If we apply this to the statement $E^{\prime} = PE$, it's not hard to verify its correctness for the column $p_r = p^{\prime}_r$. Indeed, consider for a moment that $\vec{u}$ is the column $p_r$ of $E$, viewed as a vector in $\mathbb{R}^m$. Since the nonzero rows of an RREF matrix come first, $\vec{u}$ has zeros below row $r$, so $P\vec{u}$ involves only the first $r$ columns of $P$, which are those of the identity. Then

$$\vec{u}^{\prime} = P\vec{u} = \vec{u}$$

because $P$ acts as the identity on such vectors.
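A quick numeric sanity check of that step (my own, with made-up sizes): even if only the first $r$ columns of $P$ match the identity, $P\vec{u} = \vec{u}$ still holds for any $\vec{u}$ that vanishes below row $r$:

```python
import numpy as np

m, r = 4, 2
P = np.eye(m)
P[:, r:] = np.random.rand(m, m - r)   # only the first r columns match I

u = np.array([3.0, -1.0, 0.0, 0.0])   # zero below row r, like an RREF column
print(np.allclose(P @ u, u))          # True: P acts as the identity on u
```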

Ok, so we've established that.

This is where things really begin to trip me up, though:

"

Let us start with $H_0$. If we multiply the relation $E'=PE$ with the matrix whose $p_1-1$ columns are $k_1,\dots,k_{p_1-1}$, we are in the part of $E$ where all columns are zero, and hence multiplying by $P$, we still get zero.

"

What I'm gathering from this about RREF is that, given a matrix $M$ in RREF, there is no guarantee that the first pivot of $M$ will be at $[M]_{11}$.

This actually seems to somewhat contradict my understanding of the nature of a system of linear equations, because a system must have a first variable of some kind, right? In other words, if we have the following system:

$$ \begin{matrix} a_{11}x_1 + \cdots + a_{1n}x_n = b_1 \\ \vdots \\ a_{m1}x_1 + \cdots + a_{mn}x_n = b_m \\ \end{matrix} $$

where $1 \le m, n$, then for every variable $x_j$, $1 \le j \le n$, there should be at least one equation containing a term $a_{ij}x_j$ with $a_{ij} \ne 0$. In other words, each variable must appear with a non-zero coefficient somewhere in the system.

If this is the case, it implies that the following RREF matrix is impossible:

$$ \begin{bmatrix} 0 & a_{12} & 0 & 0 & \cdots & a_{1n}\\ 0 & 0 & a_{23} & 0 & \cdots & a_{2n}\\ 0 & 0 & 0 & a_{34} & \cdots & a_{3n}\\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & 0 & \cdots & a_{mn} \\ \end{bmatrix} $$

because $x_1$ cannot be completely zeroed out.

The last quote insinuates exactly this with its definition of $H_0$: in particular, that there can be a set of all-zero columns right before the first pivot column, indexed by $k_1,\cdots, k_{p_1-1}$.
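And indeed, a quick sympy check (assuming I am reading `Matrix.rref` correctly) produces an RREF whose first column is entirely zero:

```python
import sympy as sp

A = sp.Matrix([
    [0, 2, 4],
    [0, 1, 2],
])
E, pivots = A.rref()
print(E)       # Matrix([[0, 1, 2], [0, 0, 0]])
print(pivots)  # (1,): the first pivot sits in the second column
```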

This is the main road block; everything else is for the most part clear.

Questions

1) Why does $j$ begin at 0 when referring to $p$?

2) How does setting $p_{r+1} = p^{\prime}_{r+1} = n$ add convenience?

3) It seems that there are situations where undefined values emerge. For example, $p_{1} - 1$ appears in the case $j = 0$; how exactly is this guaranteed to make sense? Isn't the first pivot column of a matrix derived from a system of linear equations always the very first column, which would force $p_1 - 1 = 0$? Or are there other valid reasons for this?

Answer

I think what makes that proof hard to follow is that it uses matrix algebra exclusively. This is a lot simpler if you use the following fact (I am basically following a proof outlined in Cullen, Theorem 2.18):

If $A=PB$ and $P$ is invertible, then the row spaces of $A$ and $B$ are the same. (You can try proving this yourself; the proof is also in the same textbook.)
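A quick sanity check of this fact (my sketch, not Cullen's proof): two matrices have the same row space exactly when each has rank equal to the rank of the two stacked together:

```python
import random
import sympy as sp

random.seed(1)
B = sp.Matrix(3, 4, lambda i, j: random.randint(-3, 3))
while True:  # draw a random invertible P
    P = sp.Matrix(3, 3, lambda i, j: random.randint(-3, 3))
    if P.det() != 0:
        break
A = P * B

stacked = A.col_join(B)  # rows of A stacked on rows of B
print(A.rank() == B.rank() == stacked.rank())  # True: same row space
```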

So now you get that the non-zero rows of any RREF (say $E$) of $A$ span the row space of $A$. In fact the non-zero rows of $E$ form a basis for the row space of $A$, since they are linearly independent. This comes down to the fact that any two vectors whose first non-zero entries are in different component positions are linearly independent (again, you can prove it yourself if you want to).
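A small sympy check of this (my own, not from Cullen): the non-zero rows of $E$ lie in the row space of $A$, and there are $\operatorname{rank}(A)$ of them, so they form a basis:

```python
import sympy as sp

A = sp.Matrix([
    [1, 2, 3],
    [2, 4, 6],
    [1, 0, 1],
])
E, _ = A.rref()

# A row of E lies in the row space of A iff stacking it onto A
# does not increase the rank.
for i in range(E.rows):
    row = E.row(i)
    if any(row):
        assert A.col_join(row).rank() == A.rank()

nonzero_rows = sum(1 for i in range(E.rows) if any(E.row(i)))
print(nonzero_rows == A.rank())  # True
```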

What this gives us is that if the pivot columns are columns $k_1, k_2, \ldots, k_r$ (I write $r$ for the rank, since $n$ already denotes the number of columns), then we cannot have a non-zero vector $\gamma$ in the row space of $A$ which has all zeros in components $k_1, k_2, \ldots, k_r$: any such $\gamma$ is a linear combination of the non-zero rows of $E$, and its entry in position $k_i$ is exactly the $i$-th coefficient of that combination. We can further assume that $k_1 < k_2 < \ldots < k_r$, as a direct consequence of the definition of RREF.

Suppose $E'$ is another RREF of $A$. Then $E'$ must have the same pivot columns as $E$; otherwise you would have a non-zero vector in the row space (a row of $E$ or of $E'$) with all zeros in the other matrix's pivot components. Let $\rho_1$ be the first non-zero row of $E$ and $\gamma_1$ the first non-zero row of $E'$ (both have their leading $1$ in position $k_1$, and zeros in positions $k_2, \ldots, k_r$). Then $\rho_1-\gamma_1$ has all zeros in components $k_1, k_2, \ldots, k_r$, and since both rows are in the row space of $A$, so is $\rho_1-\gamma_1$ (the row space is a subspace, hence closed under vector addition and scalar multiplication). So in fact $\rho_1-\gamma_1=0$. We can repeat this argument for each of the remaining rows to show that all of the non-zero rows of $E$ and $E'$ are the same.
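The uniqueness claim itself is easy to poke at numerically; a small sketch (my own): multiplying $A$ on the left by a random invertible $P$, which amounts to performing a sequence of row operations, does not change the RREF:

```python
import random
import sympy as sp

random.seed(0)
A = sp.Matrix(3, 4, lambda i, j: random.randint(-3, 3))

# A random invertible P stands in for an arbitrary sequence of row operations.
while True:
    P = sp.Matrix(3, 3, lambda i, j: random.randint(-3, 3))
    if P.det() != 0:
        break

print((P * A).rref() == A.rref())  # True: same RREF, same pivot columns
```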