In learning linear algebra, an overarching question that I have is why there is a very specific set of rules that dictates when a system of equations is solved. The specifications according to stattrek.com:
The first non-zero element in each row, called the leading entry, is 1. Each leading entry is in a column to the right of the leading entry in the previous row. Rows with all zero elements, if any, are below rows having a non-zero element.
just seem oddly contrived to me. This may be more of a question about how it was discovered, but then again I'm not sure. My best guess is that this just happens to be the pattern matrices follow when you've introduced the most zeros into the matrix (you've "eliminated" as many terms from the corresponding system as mathematically possible). I try to understand math as deeply as possible and this question has been bugging me for a while.
Basically, yes.
However, there are many such forms. Sometimes you want to say something that depends on exactly which form you chose. That means you not only need a form with as many zeros as possible, you want one specific such form, and you want anyone reading your argument to agree with you on exactly what form that is.
Enter the row-reduced echelon form. It is one of the many "as many zeros as possible" forms, but the additional restrictions (which, I agree, are a bit lengthy and messy to read) restrict the form you're talking about to a single alternative.
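To make the "single alternative" point concrete, here is a small worked example. The system $x + 2y = 5$, $3x + 4y = 11$ is made up for illustration and is not from the question. Depending on which row you start from, row reduction can give two different matrices that both satisfy the three rules quoted in the question:

$$
\begin{pmatrix} 1 & 2 & 5 \\ 0 & 1 & 2 \end{pmatrix}
\qquad\text{and}\qquad
\begin{pmatrix} 1 & \tfrac{4}{3} & \tfrac{11}{3} \\ 0 & 1 & 2 \end{pmatrix}
$$

The reduced form adds one more requirement, namely that each leading 1 is the only non-zero entry in its column. Clearing the entry above the second leading 1 sends both matrices to the same result:

$$
\begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 2 \end{pmatrix}
$$

So whichever route you take, you and your reader end up with the same matrix, and here it says $x = 1$, $y = 2$.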
It also happens to be, among all such alternatives, perhaps the most useful for solving linear equations, as solutions can basically be read off directly, from top to bottom. No need to "scan through" the matrix to see which row would be most convenient to start working on next. It's always the row just below the one you've just dealt with.
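If it helps to see this mechanically, below is a minimal sketch of the reduction in Python, using exact fractions to avoid rounding. The function name `rref` and the example system ($x + 2y = 5$, $3x + 4y = 11$) are my own choices for illustration; this is one straightforward way to do Gauss-Jordan elimination, not a tuned implementation.

```python
from fractions import Fraction


def rref(matrix):
    """Return the reduced row echelon form of `matrix` (a list of rows)."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    pivot_row = 0
    for col in range(n_cols):
        if pivot_row == n_rows:
            break
        # Look for a row at or below pivot_row with a non-zero entry in this column.
        src = next((r for r in range(pivot_row, n_rows) if rows[r][col] != 0), None)
        if src is None:
            continue  # no leading entry in this column
        rows[pivot_row], rows[src] = rows[src], rows[pivot_row]
        # Scale the row so that its leading entry becomes 1.
        lead = rows[pivot_row][col]
        rows[pivot_row] = [x / lead for x in rows[pivot_row]]
        # Clear every other entry in this column, above and below the leading 1.
        for r in range(n_rows):
            if r != pivot_row and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    return rows


# Augmented matrix of the illustrative system x + 2y = 5, 3x + 4y = 11.
augmented = [[1, 2, 5], [3, 4, 11]]
for row in rref(augmented):
    print([str(entry) for entry in row])
```

The two rows of the result are `1 0 1` and `0 1 2`, which read directly as $x = 1$ and $y = 2$. Feeding in a row-equivalent matrix such as `[[6, 8, 22], [1, 2, 5]]` (rows swapped, one of them doubled) gives the same output, which is the uniqueness at work.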