Fundamental questions about CALCULUS and Linear Algebra


So this term I'm taking three mathematics courses in my Computer Science degree: CALCULUS 1M (dealing especially with the notion of a limit, sets, etc.), Linear Algebra, and Discrete Math.

I have accumulated so far some essential questions:

  1. First, in CALCULUS 1M we encounter the notion of a 'field': an object with two operations, from which we have to infer other properties. This notion is still vague to me; I haven't grasped the intrinsic difference between a field and a set. I would be glad to get a good explanation of it.

Secondly, the proofs: they have to be very rigorous, to such an extent that many times I really don't know how to start writing an answer. Where do I begin? Is it legal to start from a given statement or not? It feels very ambiguous. I don't know if you can give me concrete help with this, but I will leave it here.

Some technical misunderstanding: I haven't found a good explanation of what an order relation is, or what an ordered field is. I have learned, like a parrot, that an ordered field has certain properties: consistency with multiplication and addition, transitivity... but I don't understand deeply why those specific properties were chosen. What is behind this?

Concerning Linear Algebra: we have learned about matrices and the operations on them, but again like robots. No one really understands what a matrix is or why those operations work on it the way they do. Generally, we have no intuition about it: what does it look like? Why can linear systems of equations actually be solved this way? Why do those elementary row operations work?

Discrete math: later on...

Thank you.


There are 3 best solutions below

On BEST ANSWER

Here's a wordy explanation which is a bit lighter on the advanced mathematics compared to other answers:


A set is roughly a collection of elements. Examples of sets are $\{A, B, F, X\}$, $\{1,-4,\frac{1}{2}, 0, 12\}$, $\{\triangle, \circ, \square \}$, as well as some pre-defined sets like $\mathbb{N}$, $\mathbb{Z}$, and $\mathbb{R}$. Note that the elements of a set have no inherent order, no inherent operations that transform one or more elements into another element, and no inherent relationship to one another. However, most of the time, sets contain the same "kind" of objects (letters, numbers, shapes, animals, etc.); this is not a hard requirement though.

Some sets are a little more structured than this though. We might want to consider some kind of number system, where we can add and subtract, multiply and divide, and combine these operations in a meaningful way so that we always get an answer out. Fields are one kind of structured set, motivated by the numbers and operations we use in daily life. There are many "rules" or axioms which need to be satisfied for a set with some operations to be called a field, but basically, a field is just a set of numbers which act like real numbers (or complex numbers) with addition and multiplication. Things like associativity, commutativity, identity, and inverse are required for each of addition and multiplication, and distributivity is required for their combination. With all this structure, you have a lot of room to do algebra on these kinds of numbers, which is great. In order to do calculus (limits), you need some additional structure (completeness), which real and complex numbers have, but not every field does.
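To make this concrete, here is a small sketch in Python using the standard-library `Fraction` type (exact rational arithmetic; the rationals form a field) to spot-check a few of the axioms with concrete values:

```python
from fractions import Fraction

# The rationals (here: Python's exact Fraction type) form a field:
# spot-check a few of the axioms with concrete values.
a, b, c = Fraction(3, 4), Fraction(-2, 5), Fraction(7, 2)

assert (a + b) + c == a + (b + c)      # associativity of +
assert a * b == b * a                  # commutativity of *
assert a + Fraction(0) == a            # additive identity
assert a * (1 / a) == 1                # multiplicative inverse (a != 0)
assert a * (b + c) == a * b + a * c    # distributivity
```

Of course, a handful of checks is not a proof; the axioms assert these identities for *all* elements.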


In a 100% rigorous system, the only legal statements without proof are axioms. Whatever axioms you have in the system you are trying to prove something under, you can use whenever you want, as they necessarily hold. In addition to axioms, any statements which have been previously proven with the currently legal statements are legal. That is, you must be able to prove a statement with only axioms and the statements you have already proven before you can use that statement to prove any other statements. This is the general process of building a theoretical framework. Very quickly, you stop referencing axioms and start referencing the theorems you have proven, but you need to be careful about taking statements proven by other people. If you see a statement proven by someone else, you can use it (proofs and theorems are not copyrighted), but you need to be sure that the proof they give is valid within your existing framework.


An ordered field can be characterized by the existence of a relation which satisfies certain conditions/axioms, but intuitively, it is exactly what you know as order. If I give you two numbers $a$ and $b$, you can say for sure that $a \leq b$ or $b \leq a$, and both of those statements are true if and only if $a = b$. You also know that if $a \leq b$ and $b \leq c$, then $a \leq c$. Lastly, and most obviously, any number $a$ satisfies $a \leq a$ (because $a=a$). These intuitive ideas of order don't necessarily exist in all sets, but ones that do have them are called ordered. If the set is a field (and the order is compatible with the field operations), then it is an ordered field. The real numbers are an ordered field, but the complex numbers are not (what does it mean to say $3+i \leq 2+2i$?).


Linear algebra, in a sense, starts from linear systems. If you have a system of equations,

$$ 3x+5y-2z = 12 \\ 2x+2y+6z = 3 \\ -x+z = -9 $$

we might want to write it in a shorthand way, where we don't repeat things like $x,y,z$ and the $=$ sign up to 3 times each (or more for higher dimensions). We first look at all the coefficients, which are already laid out in a rectangular shape, so let's just make a box of coefficients

$$ A = \begin{bmatrix} 3 & 5 & -2 \\ 2 & 2 & 6 \\ -1 & 0 & 1 \end{bmatrix} $$

On the right hand side, we have a column of numbers, so let's make a column-shaped box

$$ \mathbf b = \begin{bmatrix} 12 \\ 3 \\ -9 \end{bmatrix} $$

Since the output of the system is a column, we should make the input of the system a column too, that way we can use the output of this system as the input to another one if we want

$$ \mathbf x = \begin{bmatrix} x \\ y \\ z \end{bmatrix} $$

Then, simply define the multiplication of a box and a column to be the way in which we "unwrap" this shorthand: each element in a given row of $A$ is the coefficient of the corresponding column of $\mathbf x$. This means our system can now be written as

$$ A\mathbf x = \mathbf b $$

In this form, we can look at $A$ as a way of transforming a column $\mathbf x$ into another column $\mathbf b$; we say $L(\mathbf x) = A\mathbf x$ is a linear transformation.
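As an illustration (a sketch assuming NumPy), the system above can be written and solved in exactly this $A\mathbf{x} = \mathbf{b}$ form:

```python
import numpy as np

# The 3x3 system from above, written as A x = b and solved numerically.
A = np.array([[ 3, 5, -2],
              [ 2, 2,  6],
              [-1, 0,  1]], dtype=float)
b = np.array([12, 3, -9], dtype=float)

x = np.linalg.solve(A, b)      # finds x with A @ x == b (A is invertible)
assert np.allclose(A @ x, b)   # unwrap the shorthand: all three equations hold
```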

So how do we multiply two matrices? Well, since $A\mathbf x$ is the left side of a linear system with coefficients from $A$, $BA\mathbf x$ should be the left side of a linear system with coefficients from $BA$. $A\mathbf x$ is a column, so we can easily calculate $BA\mathbf x$ using our definition of a box times a column. Then, we can simply take the coefficients of the resulting linear system as the components of $BA$. In this way, the standard matrix multiplication definition is formed. The limitations on the size of matrices to be multiplied and the size of their product should be deducible from this definition.
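A quick numerical sketch of this (assuming NumPy, with arbitrary random matrices): multiplying the matrices first or applying them one at a time to a column gives the same result, which is exactly what the definition of the product is built to guarantee.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-5, 5, size=(3, 3)).astype(float)
B = rng.integers(-5, 5, size=(3, 3)).astype(float)
x = rng.integers(-5, 5, size=3).astype(float)

# Matrix multiplication is defined exactly so that applying A and then B
# is the same as applying the single matrix B @ A.
assert np.allclose(B @ (A @ x), (B @ A) @ x)
```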

Elementary row operations are simply the kinds of operations you were allowed to do on a linear system, but written in the new shorthand. Swapping rows is like swapping equations, absolutely nothing changes, but the new order might be a more convenient arrangement for you. Multiplying an equation (row) by some nonzero constant is obviously allowed, as if $a = b$, then $ca = cb$. For the same reason, we can add two equations (rows); if $a=b$ and $c=d$, then $a+c = b+d$. When adding two equations, it might be confusing why we end up with the same number of equations; simply, the added equation doesn't offer any new information, so we replace one of the summand equations with the new sum equation. Since each of these operations individually doesn't change the solution set of the system, any combination of these will also preserve the system. Thus, we can freely apply these operations in any order and as many times as we need to reduce the system to one which is solved.
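A short sketch (assuming NumPy) showing that a solution survives one of each elementary row operation applied to the augmented matrix $[A \mid \mathbf b]$:

```python
import numpy as np

# Augmented matrix [A | b] for the system above; each row is one equation.
M = np.array([[ 3, 5, -2, 12],
              [ 2, 2,  6,  3],
              [-1, 0,  1, -9]], dtype=float)

x = np.linalg.solve(M[:, :3], M[:, 3])  # solution of the original system

# Apply one of each elementary row operation.
M[[0, 2]] = M[[2, 0]]          # swap rows 1 and 3
M[1] *= 0.5                    # scale row 2 by a nonzero constant
M[2] += 4 * M[0]               # add a multiple of row 1 to row 3

# The same x still satisfies every (transformed) equation.
assert np.allclose(M[:, :3] @ x, M[:, 3])
```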

It turns out that the shorthand notation for linear systems has the kind of structure needed to do some algebra, so we study columns (formally vectors), and boxes (matrices) with their associated linear transformations in general. This leads to a large number of techniques for analyzing different kinds of linear systems (algebraic, differential, etc.) which may not have been developed by taking the system of equations at face value.

On

You've really asked several questions, so I apologise if I miss some.

A set is just an object of which a given object either is or isn't a "member", or "element". If you want a more formal definition than that, they're entities implicitly defined by their satisfying certain axioms; for the most popular choice thereof, look up the axioms of Zermelo-Fraenkel set theory (which is abbreviated to ZF, or ZFC if the axiom of choice is added).

A field is a set endowed with operations satisfying certain axioms. (Every time someone tells you an insert-name-here is a set endowed with certain operations, there are various ways to think about that; for example, set theory can define ordered tuples, and we can define operations as functions which in turn can be sets etc., and the tuple can then list the set followed by those operations. However, for our purposes none of that matters.) Specifically, we require two operations $+$ and $\times$ on a set $F$ such that:

  • $F$ is an Abelian group under $+$, of identity denoted $0$;
  • $F\backslash\{0\}$ is an Abelian group under $\times$, of identity denoted $1$;
  • $a\times (b+c)=(a\times b)+(a\times c)$ for $a,\,b,\,c\in F$.

For a nontrivial field, $0\ne 1$. Of course, you now have to look up the group theory axioms, but you should see now that fields are a precisely definable concept very different from mere sets.

Staring at the above axioms, you may notice they're just some of the most elementary-school-obvious facts about the usual arithmetic on real numbers. The axioms of an ordered field are also based on trying to capture the intuition we all have for ordering real numbers, and how that ordering responds to arithmetic.

The motive for matrices is that a linear map from vectors to vectors is completely specified by its action on each element of a basis (since every vector space has a basis, at least if you accept the axiom of choice), and the result of such an action is summarised by saying how the image space's basis elements are combined in the result. So we end up with a rectangular array of numbers encoding the map's degrees of freedom. The usual formula for the product of two matrices $M,\,N$ is just the one that implies $(M\times N)\times v=M\times (N\times v)$ for a vector $v$.

On
  1. A set is just a collection of things. $\{banana, apple, pear, orange\}$ is a set (axiomatic set theory gets complicated with this, but you really don't need to worry about it).

A field is a set, which I'll call $F$, with binary operations which we'll call $+: F \times F \to F$ and $\bullet: F \times F \to F$ such that [sorry, massive list of conditions incoming - it amounts to "has all of the nice properties that the real numbers have", roughly, though there are some weird things that can happen in fields]:

  1. Addition Axioms:
    1. Associativity: for every $a, b, c \in F$, we have $(a+b)+c = a+(b+c)$.
    2. Commutativity: for every $a, b \in F$, we have $a + b = b + a$.
    3. Identity: there is some element of $F$, which we're going to call $0$, such that, for every $a \in F$, we have $a + 0 = a$.
    4. Inverses: for every $a \in F$, there is some element of $F$, which we're going to call $-a$, such that $a + (-a) = 0$.
  2. Multiplication Axioms:
    1. Associativity: for every $a, b, c \in F$, we have $(a\bullet b)\bullet c = a \bullet (b \bullet c)$.
    2. Commutativity: for every $a, b \in F$, we have $ a \bullet b = b \bullet a$.
    3. Identity: there is some element of $F$, which we're going to call $1$, such that, for every $a \in F$, $1\bullet a = a$.
    4. Inverses: for every $a \in F$ other than $0$, there is some element of $F$, which we're going to call $a^{-1}$, such that $aa^{-1} = 1$.
  3. Other Axioms:
    1. Distributivity: for every $a, b, c \in F$, we have $a \bullet (b + c) = (a \bullet b) + (a \bullet c)$.
    2. Non-Triviality: with $1$ and $0$ defined as above, $1 \neq 0$.

[You might have seen a slightly different list: they're all equivalent]
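The whole list can be checked mechanically for a small finite field. Here is a brute-force sketch in Python verifying the axioms for $\mathbb{Z}/5\mathbb{Z}$ (the integers $\{0,1,2,3,4\}$ with addition and multiplication mod 5):

```python
# Brute-force check that Z/5 (integers mod 5) satisfies the field axioms.
p = 5
F = range(p)
add = lambda a, b: (a + b) % p
mul = lambda a, b: (a * b) % p

for a in F:
    assert add(a, 0) == a                      # additive identity
    assert any(add(a, x) == 0 for x in F)      # additive inverse
    assert mul(a, 1) == a                      # multiplicative identity
    if a != 0:
        assert any(mul(a, x) == 1 for x in F)  # multiplicative inverse
    for b in F:
        assert add(a, b) == add(b, a)          # commutativity of +
        assert mul(a, b) == mul(b, a)          # commutativity of *
        for c in F:
            assert add(add(a, b), c) == add(a, add(b, c))           # associativity
            assert mul(a, add(b, c)) == add(mul(a, b), mul(a, c))   # distributivity
```

This works for any prime modulus $p$; for a composite modulus the multiplicative-inverse check fails, which is why $\mathbb{Z}/6\mathbb{Z}$, say, is not a field.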

Now, it should be fairly obvious that there's a whole lot more going on with fields than sets: there's no obvious way of defining addition and multiplication on $\{banana, apple, pear, orange\}$ (and, in fact, it turns out that there's no way to make that into a field at all). Even if we did have some sensible definitions of addition and multiplication, there's no guarantee at all that they'd satisfy all of these things (for example, the set of $n$ by $n$ matrices (with entries in some field $F$) with matrix addition and matrix multiplication does not give a field: in particular, not all matrices have multiplicative inverses).
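For instance (a sketch assuming NumPy), here is a nonzero $2\times 2$ matrix with no multiplicative inverse, which already rules the $n$ by $n$ matrices out as a field:

```python
import numpy as np

# The 2x2 matrices do NOT form a field: this nonzero matrix is singular,
# so it has no multiplicative inverse.
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])   # second row is twice the first

assert np.isclose(np.linalg.det(A), 0.0)
try:
    np.linalg.inv(A)
    inverted = True
except np.linalg.LinAlgError:
    inverted = False
assert not inverted
```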

Secondly, the proofs: they have to be very rigorous, to such an extent that many times I really don't know how to start writing an answer. Where do I begin?

Start with the axioms above (or whatever slightly different list of axioms you've got in your course) and things that you've already proven, and go from there.

Is it legal to start from a given statement or not? It feels very ambiguous. I don't know if you can give me concrete help with this, but I will leave it here.

It's not at all ambiguous: if you've proven the statement, or it's an axiom, you can start from it. If you haven't proven it, and it isn't an axiom, then you can't start from it, because any proof you build on that statement is built on shaky foundations.

I haven't found a good explanation of what an order relation is, or what an ordered field is.

For order relations: I'll assume you know what a relation is (it's just something that takes two things in a set and tells you whether they're related or not: $=$ is the simplest and least interesting example). An order relation is a relation that behaves like $\leq$ does for real numbers [NB: there are two different definitions here. I'm going to take the one that I prefer and hope that yours matches: if this is wrong, then when you say "order relation", you mean what I'm going to call a "total order relation" further down]: an order relation on some set $A$ is a relation that is reflexive ($a \leq a$ for all $a\in A$), transitive (if $a \leq b$ and $b \leq c$, then $a \leq c$), and antisymmetric (if $a \leq b$ and $b \leq a$, then $a = b$).

A total order relation is then an order relation which also has the property that for every $a, b \in A$, either $a \leq b$ or $b \leq a$ (so there's no situation where we're left with two things and just can't compare them).

Now, you can put all sorts of wacky total order relations on a field, but the $\leq$ relation on the real numbers has some extra properties relating to the field operations (these aren't in the above definition, because we can define total orders on any set we like, not just on fields), so we define an ordered field to be a field $F$ with a total order relation that has all of those extra nice properties: specifically, that for every $a, b, c \in F$, if $a \leq b$, then $a + c \leq b + c$, and for every $a, b \in F$, if $0 \leq a$ and $0 \leq b$, then $0 \leq ab$. Again, these are all nice properties that we have on the real numbers that we'd like to have everywhere.
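To see the failure for the complex numbers concretely, here is a Python sketch using a lexicographic order on $\mathbb{C}$ (a made-up but perfectly valid total order: compare real parts, then imaginary parts). It violates the multiplication axiom, since $0 \leq i$ and yet $i \cdot i = -1$ is not $\geq 0$:

```python
# A total order on the complex numbers certainly exists (lexicographic:
# compare real parts, then imaginary parts), but no total order on C can
# respect multiplication: here 0 <= i, yet i * i = -1 < 0, breaking the
# "0 <= a and 0 <= b implies 0 <= ab" axiom of an ordered field.
def leq(a, b):  # lexicographic order on complex numbers
    return (a.real, a.imag) <= (b.real, b.imag)

i = complex(0, 1)
assert leq(0, i)          # 0 <= i under this order
assert not leq(0, i * i)  # but i*i = -1 is not >= 0
```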

Concerning Linear Algebra: we have learned about matrices and the operations on them, but again like robots. No one really understands what a matrix is or why those operations work on it the way they do. Generally, we have no intuition about it: what does it look like? Why can linear systems of equations actually be solved this way? Why do those elementary row operations work?

Given finite-dimensional vector spaces $V$ and $W$, fix bases $B = \{b_1,\ldots,b_n\}$ for $V$ and $C = \{c_1,\ldots,c_m\}$ for $W$. Then any linear map $\varphi: V \to W$ is uniquely determined by where it sends the $b_i$, and it sends each $b_i$ to some sum $\sum\limits_j a_{ij} c_j$, with the $a_{ij}$ elements of the base field. Now, if you write those $a_{ij}$ in an $m$ by $n$ rectangular array (the coefficients of $\varphi(b_i)$ forming the $i$th column) and stick brackets around them, you'll have a matrix, which I'll call $M_B^C(\varphi)$. This doesn't look very useful so far, except that you can notice that $$M_B^C(\varphi)\left(\begin{array}{c}0\\\vdots\\0\\1\\0\\\vdots\\0\end{array}\right) = \left(\begin{array}{c}a_{i1}\\\vdots\\a_{im}\end{array}\right)$$ (with the $1$ in the $i$th position), so we get exactly the coordinate column of $\varphi(b_i)$ with respect to $C$. Since everything in sight is linear, if we write elements $v$ of $V$ as column vectors $c_B(v)$ in terms of $B$, and similarly in $W$, then we have $c_C(\varphi(v)) = M_B^C(\varphi)c_B(v)$. Thus, we can study linear maps (which come up everywhere) by instead studying matrices (which are a lot easier to work with for actual calculations).
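As a sketch (assuming NumPy; the map `phi` is a made-up example, and the standard bases are used for $B$ and $C$), one can build the matrix column by column from the images of the basis vectors and check that $c_C(\varphi(v)) = M_B^C(\varphi)\,c_B(v)$:

```python
import numpy as np

# An example (hypothetical) linear map R^3 -> R^2.
def phi(v):
    x, y, z = v
    return np.array([2 * x + y, y - 3 * z], dtype=float)

# The i-th column of M is the coordinate column of phi(e_i).
basis = np.eye(3)
M = np.column_stack([phi(e) for e in basis])  # a 2x3 matrix

v = np.array([1.0, -2.0, 4.0])
assert np.allclose(M @ v, phi(v))  # coordinates of phi(v) = M times coordinates of v
```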

As far as where the matrix operations come from: $A + B$ is pretty obvious: it's the matrix of the linear map that you get by adding together the linear maps corresponding to $A$ and $B$. The product $AB$ looks weirder, but if you work through the numbers, you'll see that it's the matrix of the linear map that you get by composing the linear maps corresponding to $A$ and $B$. Similarly, the determinant of a matrix is the signed $n$-volume of the image of the unit cube under the linear map associated to the matrix. Row operations correspond to changing what basis we're using (on one side: column operations do the other side). They're all picked explicitly to make matrices better for letting us study linear maps, which are actually interesting. Linear equations can be solved by doing this because a system of linear equations is one that is of the form $\varphi(x) = a$ for $x, a$ in some finite dimensional vector spaces and $\varphi$ a linear map between them.
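Both of these facts are easy to check numerically (a sketch assuming NumPy and arbitrary random $2\times 2$ matrices): the product of matrices matches the composition of maps, and determinants multiply under composition, as signed area-scaling factors should.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((2, 2))
B = rng.standard_normal((2, 2))
v = rng.standard_normal(2)

# Matrix product = composition of the corresponding linear maps.
assert np.allclose((A @ B) @ v, A @ (B @ v))

# det is the signed area scaling factor of the map, so for a
# composition the factors multiply.
assert np.isclose(np.linalg.det(A @ B),
                  np.linalg.det(A) * np.linalg.det(B))
```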