Why isn't removing zero rows an elementary operation?


My prof taught us that during Gaussian Elimination, we can perform three elementary operations to transform the matrix:

1) Multiply a row by a non-zero constant
2) Add a multiple of one row to another row
3) Interchange two rows

In addition to those, why isn't removing zero rows an elementary operation? It doesn't affect the system in any way. (Define a zero row to be a row whose entries are all zero, i.e. one with no leading entry.)

For example, isn't $\begin{bmatrix}a & b & k\\c & d & m\end{bmatrix} \rightarrow \begin{bmatrix}a & b & k\\c & d & m\\0 & 0 & 0\end{bmatrix}$ (and its reverse) harmless, since both matrices represent the same system?
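As a sanity check, here is a small numpy sketch (with made-up entries) confirming that appending a zero row, together with a matching zero on the right-hand side, leaves the solution unchanged:

```python
import numpy as np

# A 2x2 invertible system Ax = b (entries made up for illustration)
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([5.0, 10.0])
x = np.linalg.solve(A, b)

# Append a zero row to A and a matching zero to b; the padded
# system has the same solution (found here via least squares,
# since the padded matrix is no longer square).
A_padded = np.vstack([A, np.zeros((1, 2))])
b_padded = np.append(b, 0.0)
x_padded, *_ = np.linalg.lstsq(A_padded, b_padded, rcond=None)

assert np.allclose(x, x_padded)
```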

There are 5 answers below.

---

Because we want the dimensions of the matrix to remain fixed. I can think of two reasons for this.

  1. So that performing an elementary row operation will correspond to multiplication on the left by an elementary matrix.

  2. Changing the dimensions of the matrix by adding or removing zero rows does not help us solve the system, so why allow it? The (other) row operations do not change the size of the matrix, so let's keep things simple and fix it once and for all.
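Reason 1 can be made concrete with a small numpy sketch (the matrix entries are made up): each elementary row operation is left multiplication by the identity matrix with that same operation applied to it, and every such elementary matrix is square, so the shape never changes.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
I = np.eye(3)

# Scale row 0 by 5: E1 is the identity with its row 0 scaled by 5.
E1 = I.copy()
E1[0, 0] = 5.0
assert np.allclose(E1 @ A, np.vstack([5 * A[0], A[1], A[2]]))

# Add 2 * row 0 to row 2: E2 is the identity with that operation applied.
E2 = I.copy()
E2[2, 0] = 2.0
assert np.allclose(E2 @ A, np.vstack([A[0], A[1], A[2] + 2 * A[0]]))

# Swap rows 0 and 1: E3 is a permutation of the identity's rows.
E3 = I[[1, 0, 2]]
assert np.allclose(E3 @ A, A[[1, 0, 2]])
```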

---

Aside: which row operations are called elementary is purely a matter of convention: we can pick any operations we want to call elementary. Those three are the ones we have chosen.


The problem is that row operations are formal things: e.g. there is a row operation called "multiply the first row by 5", but there is no row operation called "multiply the first row by 5 if its first nonzero entry is 0.2".

Along those lines, "remove a row" is a row operation, but "remove a zero row" is not.

"Add a zero row" is a row operation, though.


Those zero rows are useful -- or, more accurately, you need to remember what row operations you performed to make that zero row. If $R$ is the matrix that accumulates all of your row operations, then when trying to solve

$$Ax = b$$

you can inspect

$$ RAx = Rb$$

and know that there are no solutions if $Rb$ has a nonzero entry in the position where $RA$ has a zero row.

It's also useful for finding left null vectors to your matrix: if $RA$ is in row echelon form, then the rows of $R$ that correspond to zero rows of $RA$ are a basis for the left nullspace of $A$.

If you do the row operation to eliminate the zero row, you've forgotten all of that useful information.
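The bookkeeping above can be sketched in numpy (the matrix is a made-up rank-deficient example): reduce the augmented block $[A \mid I]$ so the identity part accumulates $R$, then read inconsistency and left null vectors off the zero rows of $RA$.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])          # rank 1: second row is 2 * first
b = np.array([3.0, 7.0])            # inconsistent: 7 != 2 * 3

# Accumulate the row operations in R by reducing [A | I].
M = np.hstack([A, np.eye(2)])
M[1] -= 2.0 * M[0]                  # subtract 2 * row 0 from row 1
RA, R = M[:, :2], M[:, 2:]

# RA has a zero row at index 1...
assert np.allclose(RA[1], 0)
# ...so Ax = b has no solution because (Rb)[1] is nonzero:
assert not np.isclose((R @ b)[1], 0)
# ...and row 1 of R is a left null vector of A:
assert np.allclose(R[1] @ A, 0)
```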

---

Whether addition or removal of all-zero rows should be allowed depends on the precise application one has in mind for row operations.

If row operations are intended to model the action of the general linear group by left multiplication, then adding or removing rows should be forbidden, because that action never changes the number of rows.

However, if the purpose is to define a normal form for subspaces defined by systems of linear equations, then removal of zero rows is needed to obtain such a normal form: equations with all coefficients zero have no effect on the solution space, and there must be some means to bring equivalent systems with different numbers of equations to a common normal form. The normal form would be reduced row echelon form without any zero rows. Addition of zero rows is then also needed to ensure one can get back from the normal form to any system equivalent to it.
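A small numpy sketch of the normal-form idea (the systems are made up): two equivalent systems with different numbers of equations reach the same matrix once zero rows are removed.

```python
import numpy as np

def drop_zero_rows(M):
    """Remove rows that are entirely zero."""
    return M[~np.all(np.isclose(M, 0), axis=1)]

# Two equivalent systems (augmented matrices) with different
# numbers of equations: x + 2y = 3, and the same equation twice.
S1 = np.array([[1.0, 2.0, 3.0]])
S2 = np.array([[1.0, 2.0, 3.0],
               [2.0, 4.0, 6.0]])

# Reduce S2: subtract 2 * row 0 from row 1, producing a zero row.
S2[1] -= 2.0 * S2[0]

# After removing zero rows, both systems reach the same normal form.
assert np.array_equal(drop_zero_rows(S1), drop_zero_rows(S2))
```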

---

Here is a slightly more long-range answer: a matrix corresponds to a linear operator $T:V\to W$ where $V$ and $W$ are vector spaces with some chosen bases. The elementary row operations (or elementary column operations) then correspond to changing the basis of $W$ or of $V$ to give an equivalent matrix: one which represents the same linear operator but with the bases changed around. Under this correspondence you can get all the possible matrices corresponding to the linear operator $T$ by doing elementary row and column operations.

Adding or removing a row of zeroes will not give you a matrix corresponding to the linear operator $T$.

---

Doing an elementary row operation on the matrix $A$ can be seen as multiplying $A$ by a suitable invertible matrix, call it $E_1$. So the whole process leads to writing $$ U=E_{k}\dots E_{2}E_{1}A $$ where the $E_j$ $(j=1,2,\dots,k)$ correspond to the elementary row operations we performed and $U$ is a row reduced matrix (for instance, the reduced row echelon form). Since each $E_j$ is invertible, we can write $$ A=FU $$ where $F=E_1^{-1}E_2^{-1}\dots E_{k}^{-1}$ and, if one performs all row swaps at the start, the matrix $F$ can be written down without any further computation.
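A minimal numpy sketch of this factorization (a made-up $2\times 2$ example with a single elimination step); note that $E_1^{-1}$ is obtained by just flipping the sign of the multiplier, which is the sense in which $F$ requires no extra computation:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [4.0, 5.0]])

# One elimination step: subtract 2 * row 0 from row 1.
E1 = np.array([[ 1.0, 0.0],
               [-2.0, 1.0]])
U = E1 @ A                # upper triangular
F = np.linalg.inv(E1)     # equals E1 with the multiplier's sign flipped

assert np.allclose(F, [[1.0, 0.0], [2.0, 1.0]])
assert np.allclose(A, F @ U)
```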

Of course no operation of this kind changes the shape of the matrices we successively get and the final matrix $U$ will have zero or more “zero rows” at the bottom.

At this point we can remove those “zero rows” from $U$; if there are $l$ of them, we can also remove the $l$ rightmost columns of $F$. Calling $F_0$ and $U_0$ the matrices so obtained, we still have $$ A=F_0U_0 $$ and both $F_0$ and $U_0$ have rank equal to the rank of $A$. This is called a full rank decomposition, because those matrices have the maximum rank they can possibly have, given their shapes: say $A$ is $m\times n$ and its rank is $r$; then $F_0$ is $m\times r$ $(r\le m)$ and $U_0$ is $r\times n$ ($r\le n$).

In particular $F_0^HF_0$ ($H$ denotes the hermitian transpose, the simple transpose when the matrices are over the reals) is $r\times r$ invertible and $$ (F_0^HF_0)^{-1}F_0^H $$ not only is a left inverse of $F_0$ but is its Moore-Penrose pseudoinverse $F_0^+$. Similarly, $U_0U_0^H$ is invertible and $U_0^H(U_0U_0^H)^{-1}=U_0^+$. Moreover the Moore-Penrose pseudoinverse of $A$ is $$ A^+=U_0^+F_0^+ $$ Indeed, any time we find a full rank decomposition $A=BC$, we can write $A^+=C^+B^+$ (and the pseudoinverses of $B$ and $C$ are computed as above).
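A numpy sketch verifying the formula $A^+=U_0^+F_0^+$ on a made-up rank-1 full rank decomposition, checked against numpy's built-in pseudoinverse (real matrices, so $H$ is just the transpose):

```python
import numpy as np

# Made-up rank-1 example: A = F0 @ U0 is a full rank decomposition.
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])
F0 = np.array([[1.0], [2.0], [3.0]])   # 3x1, rank 1
U0 = np.array([[1.0, 2.0]])            # 1x2, rank 1
assert np.allclose(A, F0 @ U0)

# Pseudoinverses of the full-rank factors, as in the formulas above.
F0_pinv = np.linalg.inv(F0.T @ F0) @ F0.T   # left inverse of F0
U0_pinv = U0.T @ np.linalg.inv(U0 @ U0.T)   # right inverse of U0

# A+ = U0+ F0+ agrees with numpy's Moore-Penrose pseudoinverse.
assert np.allclose(U0_pinv @ F0_pinv, np.linalg.pinv(A))
```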

The Moore-Penrose pseudoinverse is useful for computing the least squares solutions of $Ax=b$, and the full rank decomposition gives an efficient algorithm for finding it.
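A quick numpy check (on a made-up overdetermined system) that the pseudoinverse indeed yields the least squares solution:

```python
import numpy as np

# An inconsistent 3x2 system: no exact solution exists.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 2.0, 0.0])

# The pseudoinverse solution matches numpy's least squares solver.
x_pinv = np.linalg.pinv(A) @ b
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
assert np.allclose(x_pinv, x_lstsq)
```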