Is My Description of Systems of Equations Correct?


I am trying to summarize the concept of systems of equations for a formal written piece aimed at students. For this, I attempt to give a formal and complete description of the case where the system has exactly one solution (the unique-solution case), together with a proof that to solve such a system we need as many equations as variables.

My question is, is the description I provide valid and complete? Also, as part of that requirement, is the proof that I give also valid and complete?

What I have so far follows, which starts with an example for motivation.

Any single valid equation in one variable determines exactly one unknown. So if we were asked the following question, we would have no way of answering it.

If a car can drive a number of miles equal to four times the amount of fuel it is carrying (in gallons) minus three times the number of minutes it has spent idling at traffic lights, and we know that a particular car drove one hundred miles after a refuel, how much fuel did the car start with, and how many minutes did it spend idling since it refueled?

The reason is that we are trying to determine two unknowns with only one equation.
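To make this concrete, here is a small numeric sketch (the specific fuel/minute pairs are illustrative, not from the problem): many different pairs satisfy the single equation, so the question has no unique answer.

```python
# Hypothetical model: miles driven
#   = 4 * (gallons of fuel) - 3 * (minutes spent idling) = 100.
# Several distinct (fuel, minutes) pairs satisfy this one equation,
# so one equation cannot pin down two unknowns.
candidates = [(25, 0), (28, 4), (31, 8)]
for fuel, minutes in candidates:
    assert 4 * fuel - 3 * minutes == 100  # every pair fits the data
```

Each candidate is a genuinely different answer to "how much fuel and how many minutes?", all equally consistent with the single piece of information we were given.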

The above scenario is precisely why a formal method was developed for determining several unknown quantities at once. As we just discussed, to represent the queried information accurately, the number of equations must be adjusted to account for an increase in the number of variables being used.

How do we know exactly how many equations are required to correctly derive the needed information? How do we use the various equations to this end? And do the equations need to satisfy certain properties, individually and collectively?

When we attempt to determine what quantities are represented by the given equations and variables, we must keep in mind the following.

  1. All of the variables being used must be related to each of the other variables.
  2. The equations given must be "distinct" and consistent, and their number must match the number of variables.

Why are these two rules necessary?

For the first, if the variables we are solving for aren't related, then the situation is not much different from solving several unrelated equations independently.

As for the second, while the example with the car we covered above should provide a fairly intuitive understanding of why that is so, a more formal treatment may be summarized as follows.

Consider that a single equation in a single variable describes the relationship between that variable and the constants present in the equation. If we add another variable but leave the number of equations unchanged, the equation holds only conditionally, for specific simultaneous choices of both variables. If we then add another equation relating the two variables which is distinct (not a multiple of the original) and consistent (it neither contradicts the first nor is self-contradictory), we have a valid system of two equations which fully describes the relation between both variables. In other words, for two related real variables $x$ and $y,$ we have

$$\begin{align} a_1x + a_2y &= c_1 \tag{1}\\ a_3x + a_4y &= c_2 \end{align}$$

where $c_1$ and $c_2$ are real numbers and $a_1, a_2, a_3, a_4$ are real numbers such that the system $(1)$ is a valid system; furthermore, there is no non-zero real constant $m$ for which $a_1 = ma_3,$ $a_2 = ma_4,$ and $c_1 = mc_2$ hold simultaneously.

Without loss of generality, we can assume that at least $a_1$ and $a_4$ are non-zero. Thus the system may be rearranged as follows.

$$\begin{align} x &= (c_1 - a_2y)/a_1 \tag{2}\\ y &= (c_2 - a_3x)/a_4\end{align}$$

In this way we obtain a complete solution to the system: either both $a_2$ and $a_3$ are zero, in which case $(2)$ already gives both values directly, or we can substitute the expression for one variable into the other equation (the one it was not derived from), solve for the remaining variable, and then back-substitute to find the first.
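A minimal numeric sketch of this substitution procedure, with illustrative coefficients (not taken from the text):

```python
# Solve a1*x + a2*y = c1 and a3*x + a4*y = c2 by the substitution
# described above.  Coefficient values here are purely illustrative.
a1, a2, c1 = 2.0, 1.0, 5.0   # 2x +  y = 5
a3, a4, c2 = 1.0, 3.0, 10.0  #  x + 3y = 10

# Substitute x = (c1 - a2*y)/a1 into the second equation:
#   a3*(c1 - a2*y)/a1 + a4*y = c2, then solve for y.
y = (c2 - a3 * c1 / a1) / (a4 - a3 * a2 / a1)
# Back-substitute into the first rearranged equation to get x.
x = (c1 - a2 * y) / a1
```

Here the division for $y$ is safe exactly because the rows are not proportional; if they were, the denominator $a_4 - a_3 a_2 / a_1$ would vanish.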

Let us now assume, by the principle of induction, that this property holds for a system of $n-1$ equations in $n-1$ variables, where $n$ is an integer greater than one. This tells us that as many coefficients are non-zero as the system's validity requires; that is, the variables appear in the system in the same manner as in $(2),$ allowing us to solve for the $n-1$ variables of the $(n-1)$-equation system.

If we now add one more variable across the system, together with a corresponding valid equation, and make the justified assumption that the coefficient of the new variable is non-zero in at least the new equation, we may solve the new equation for this variable in terms of all the others. Since, by the induction hypothesis, we already have the complete solution to the $(n-1)$-equation system, we know the value of every other variable and can compute the value of the new variable from the rearranged new equation. Thus, a completely solvable valid system of $n-1$ equations assures a complete solution to the valid system of $n$ equations in $n$ variables.
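The induction step can be sketched numerically. Suppose (illustrative values) the two-variable system is already solved with $x = 1,$ $y = 3,$ and we append a third variable $z$ with a new equation whose $z$-coefficient is non-zero:

```python
# Induction step sketch: the smaller system is assumed solved already.
x, y = 1.0, 3.0  # known solution of the (n-1)-variable system

# New equation (coefficients illustrative): 1*x + 2*y + 4*z = 15,
# with the new variable's coefficient (4) non-zero as assumed.
b1, b2, b3, d = 1.0, 2.0, 4.0, 15.0

# Rearrange for z and plug in the already-known values of x and y.
z = (d - b1 * x - b2 * y) / b3
```

The division is legitimate precisely because of the assumption that the new variable's coefficient $b_3$ is non-zero in the new equation.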

I am truly grateful for any help on this. Thank you for reading through what I have written and providing feedback.

Sincerely,

ThisIsNotAnId

On BEST ANSWER

I think examples are good for motivation, but I personally might leave the formal justification and “proof writeup” for after we've learned concepts such as linear independence, rank, and elementary row operations (and you can mention this at the start, as a foreshadowing/preview of things to come). After all, that's kind of what those concepts/definitions are for and why they are the standard: they capture the ideas we want to capture.

For example, to justify that the $n \times n$ system $Ax = b$ has a unique solution if and only if the rows of $A$ are linearly independent, we can do it as follows:

Suppose the rows are linearly independent. Note that elementary row operations don’t change the set of solutions, so we may set out to reduce the matrix in an attempt to find the solutions of our system. Furthermore, they don’t change the row space, so the RREF of $A$ must be the identity matrix (otherwise we get a zero row, which implies fewer than $n$ linearly independent rows in the row space, contradicting the assumption of linear independence). In other words, we get a unique solution.

Conversely, if the rows are linearly dependent, then after row-reduction some row must be zero (otherwise we get $n$ linearly independent row vectors, contradicting the assumption of linear dependence). In particular, this means there is at least one free variable, so either our system has no solution or infinitely many solutions. In particular, it cannot have a unique solution.
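Both directions of this argument can be checked with sympy's row reduction (matrices here are illustrative examples, not from the question):

```python
from sympy import Matrix

# Independent rows: the RREF of A is the identity, so Ax = b has a
# unique solution for every b.
A = Matrix([[2, 1],
            [1, 3]])
R, pivots = A.rref()
assert R == Matrix.eye(2)

# Dependent rows (second row is twice the first): the RREF has a zero
# row, leaving a free variable, so no unique solution is possible.
B = Matrix([[2, 1],
            [4, 2]])
S, piv = B.rref()
assert S.row(1) == Matrix([[0, 0]])
```

`Matrix.rref()` returns the reduced row echelon form together with the pivot columns, so the number of pivots directly exhibits the rank the answer reasons about.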

That said, there might be a nice way to explain all this in simple layman's terms, though I don't know how to do it beyond saying something like "the equations have to be independent in a sense; no one of them can be written as a linear combination of the others. Otherwise, we will have some degree of freedom (an underdetermined system), meaning there cannot be a unique solution."
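The "degree of freedom" phrasing can be demonstrated directly: when one equation is a linear combination of another, a solver returns a one-parameter family rather than a point (the system below is an illustrative example).

```python
from sympy import Eq, solve, symbols

x, y = symbols('x y')

# The second equation is exactly twice the first, so it adds no new
# information.  sympy expresses x in terms of the free variable y
# instead of returning a single solution point.
sol = solve([Eq(x + y, 2), Eq(2*x + 2*y, 4)], [x, y], dict=True)
```

The result leaves $y$ free, which is exactly the underdetermined behavior described above: infinitely many solutions, hence no unique one.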