Why doesn't Gaussian elimination change the solution set?


Of course, Gaussian elimination is safe to use, as evidenced by the countless systems I've solved with it while practising my (admittedly very basic) linear algebra. But when Jim Hefferon's free textbook on linear algebra asked why

  1. scaling rows by a non-zero constant
  2. adding rows to each other

do not change the solution set, I found myself unable to give a correct mathematical proof.

In his answer manual, Hefferon himself "proves" the safety of both operations by showing that each operation can be reversed without adding or losing solutions.

For example, if a row is scaled by a nonzero constant C, this operation can be reversed by dividing both sides by C, without losing or creating solutions.

Does this qualify as a mathematical proof? To me, proving an operation through its usage doesn't seem to prove its correctness, because it assumes the solution set wasn't changed between performing and reversing the operation. If this does not qualify as a proof, then what would be a proof that no solutions are lost when performing Gaussian elimination?
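The reversibility Hefferon appeals to can at least be checked mechanically on a single equation. This is only an illustrative sketch in plain Python; the coefficients and the constant `C` are made up for the example:

```python
# Represent an equation as its coefficients plus the right-hand side.
# Hypothetical equation: 2x - 3y + z = 6, scaled by C = 4 and then divided by C.
C = 4.0
equation = [2.0, -3.0, 1.0, 6.0]

scaled = [C * coeff for coeff in equation]        # scale both sides by C
reversed_back = [coeff / C for coeff in scaled]   # reverse: divide both sides by C

# The reversal recovers the original equation exactly.
assert reversed_back == equation
```

This only demonstrates that the operation is reversible, not the stronger claim about the solution set, which is exactly the gap the question is asking about.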


There are 3 best solutions below


Almost. Maybe the best way to see this is as follows: if $C$ is an invertible matrix, then the set $S_1$ of vectors $x$ such that $Ax=b$ is the same as the set $S_2$ of vectors $x$ such that $CAx=Cb$. This follows because for each $x$ with $Ax=b$ we immediately find $CAx=Cb$, and vice versa, for each $x$ with $CAx=Cb$, multiplying by $C^{-1}$ (which exists by the assumption of invertibility) gives $Ax=C^{-1}CAx=C^{-1}Cb=b$. Therefore we have both $S_1\subseteq S_2$ and $S_2\subseteq S_1$.

Now to apply this to Gaussian elimination, observe that the individual steps (scaling a row, adding a row to another row, swapping rows) can each be achieved by multiplying by a suitable simple matrix $C$ with an easily found inverse.
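The elementary-matrix view can be checked numerically. Here is a sketch assuming NumPy; the system $Ax=b$ below is made up for illustration, and $C$ is the elementary matrix that adds $\tfrac{3}{2}$ times row 1 to row 2:

```python
import numpy as np

# A made-up 3x3 system Ax = b (the numbers are purely illustrative).
A = np.array([[ 2.0,  1.0, -1.0],
              [-3.0, -1.0,  2.0],
              [-2.0,  1.0,  2.0]])
b = np.array([8.0, -11.0, -3.0])

# Elementary matrix C: identity, except it adds 3/2 of row 1 to row 2.
C = np.eye(3)
C[1, 0] = 1.5

x = np.linalg.solve(A, b)          # solution of the original system
y = np.linalg.solve(C @ A, C @ b)  # solution after the row operation

# C is invertible, and both systems have the same (unique) solution.
assert abs(np.linalg.det(C)) > 0
assert np.allclose(x, y)
```

Each of the three row operations corresponds to such a $C$, and composing them shows the whole elimination preserves the solution set.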


It is a little hard to write all the equations down but I'll try to explain the process.

There are three fundamental operations we perform on a linear system.

  • Multiplying a row by a scalar.
  • Interchanging two rows.
  • Adding a scalar multiple of a row to another row.

The thing to notice is that after performing any of these three operations, the resulting system consists of equations that are linear combinations of the original ones.

For example, say we have the system

\begin{array}{c} a_1x+b_1y+c_1z=d_1 \\ a_2x+b_2y+c_2z=d_2 \\ a_3x+b_3y+c_3z=d_3 \end{array}

Say we multiply the first row by $q$ and add it to the second. The resultant system is,

\begin{array}{c} a_1x+b_1y+c_1z=d_1 \\ q(a_1x+b_1y+c_1z) + a_2x+b_2y+c_2z=q \cdot d_1 + d_2 \\ a_3x+b_3y+c_3z=d_3 \end{array}

Now say $(x, y, z)^T$ was a solution to the original system. Then, it is also a solution to the second one. Let us try to convince ourselves of this. The first and last rows are not a problem. The second equation of the resultant system is also satisfied because $a_2x+b_2y+c_2z= d_2$ and $(a_1x+b_1y+c_1z) = d_1 \implies q (a_1x+b_1y+c_1z) = q \times d_1 $.

The other two linear operations are also similarly disposed of.

Now look at what we have proven. We have proven that "any solution to the original system is a solution to the system resulting from one of the three linear operations".

But we require a little more. We want the solutions of the new system to be exactly those of the first one. This is established by the fact that each of the linear operations mentioned has a corresponding inverse operation which is itself one of the three linear operations. For example, the inverse of the operation we performed above is multiplying the first row of the second system by $-q$ and adding it to the second row. Now think of the original system as resulting from the second one through the performance of a linear operation. Hence, from what we proved above, any solution to the second system is also a solution to the first.

So any solution to the original system is a solution to the resultant system and any solution to the resultant system is a solution to the original one. Hence the solutions to the first system are exactly those to the second.

This is exactly what we require.
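The forward-and-backward argument can be verified concretely. This is a plain-Python sketch with made-up coefficients; the helper `add_multiple` and the rows are illustrative, not from the answer:

```python
# A row of an augmented system [a, b, c | d] as a plain list.
def add_multiple(target, source, q):
    """Return target + q*source, entrywise: the third row operation."""
    return [t + q * s for t, s in zip(target, source)]

row1 = [1.0, 2.0, -1.0, 4.0]   # hypothetical first equation
row2 = [3.0, -1.0, 2.0, 5.0]   # hypothetical second equation
q = 7.0

forward = add_multiple(row2, row1, q)    # perform the operation on row 2
back = add_multiple(forward, row1, -q)   # inverse operation: use -q instead

# The inverse operation recovers the original equation exactly.
assert back == row2
```

Since the inverse is again one of the three row operations, the "any solution carries over" argument applies in both directions, which is the two-way inclusion the answer describes.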

There is a nice explanation of all this in the first twenty or so pages of "Linear Algebra" by Hoffman and Kunze. It is a must-read.


Instead of thinking about the linear-algebra (i.e., matrix/vector) justification, I like to look at the underlying algebra problem (i.e., the system of linear equations). In the world of algebraic equations, Gaussian elimination (GE) on the linear-algebra structures corresponds to the rules you learn when first trying to solve an equation. Namely,

  1. adding the same quantity to both sides of the equation does not change the solution.
  2. multiplying both sides of the equation by a constant different from zero does not change the solution.

If you take these rules for granted, then you must believe GE is simply a compact way to write down the operations that you do on the system of linear equations, using the language of matrices. In particular, adding two rows is fine, since the quantities you add on the left hand side and right hand side are equal (since they satisfy another one of the equations of the system).
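The "both sides are equal" point can be illustrated with a tiny made-up system; the two equations and the solution below are hypothetical, chosen only so the check is easy to follow:

```python
# Hypothetical system: 2x + y = 5 and x - y = 1, satisfied by (x, y) = (2, 1).
x, y = 2.0, 1.0
assert 2*x + y == 5
assert x - y == 1

# Adding the second equation to the first adds equal quantities to both sides,
# so the summed equation 3x + 0y = 6 is still satisfied by the same (x, y).
assert (2*x + y) + (x - y) == 5 + 1
```

The same reasoning covers scaling: multiplying both sides of a satisfied equation by the same non-zero constant keeps it satisfied.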

Matrices and vectors are an interesting world. However, when it comes to their use in linear systems, I like to think of them simply as a nice compact way to write down something that would otherwise take a lot of space on paper, allowing you to keep a good view of the big picture without getting lost in the details of each single component.