Mathematical Rigor in Proving Limits by $\epsilon-\delta$ Definition


I am trying to find the most mathematically rigorous way to prove limits using the $\epsilon-\delta$ definition of a limit. So far I have found two clear-cut methods of proving limits with this definition.


When attempting to prove that an arbitrary limit exists:

$$\lim_{x \ \to \ a} f(x) = L$$

The two methods are

  1. Finding a function $\delta(\epsilon)$ that satisfies the implication $0 <|x-a| < \delta \implies |f(x)-L| < \epsilon \ $ (a personal method that I find useful and intuitive, albeit a bit long)
  2. Given an arbitrary $\epsilon > 0$, we "choose" a value of $\delta$, which will vary depending on our function, such that if $0 < |x-a| < \delta$, our chosen value of $\delta$ implies $|f(x)-L| < \epsilon \ $ (this is the method most professors, including my own, and books on Real Analysis tend to prefer)

The crux of both methods is that, for every possible "error distance" $\epsilon > 0$, we try to show that a value of $\delta$ (a bound on the distance from $x$ to the limit point $a$) exists such that $0 < |x-a| < \delta$ forces $f(x)$ to lie within $\epsilon$ of $L$.
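To make this $\forall \epsilon \; \exists \delta$ structure concrete, here is a minimal numerical sketch in Python (the names `check_delta_candidate` and `delta_of_eps` are hypothetical, not from the definition). Note that sampling finitely many points can only *falsify* a proposed rule $\epsilon \mapsto \delta$; it can never prove the limit:

```python
def check_delta_candidate(f, a, L, delta_of_eps, eps_values, n_samples=1000):
    """Spot-check a proposed rule eps -> delta on finitely many points.

    For each eps, sample x with 0 < |x - a| < delta and test |f(x) - L| < eps.
    Passing is only a sanity check; failing refutes the candidate rule.
    """
    for eps in eps_values:
        delta = delta_of_eps(eps)
        if delta <= 0:
            return False  # the definition requires delta > 0
        for k in range(1, n_samples + 1):
            dx = delta * k / (n_samples + 1)  # 0 < dx < delta
            for x in (a - dx, a + dx):        # points on both sides of a
                if not abs(f(x) - L) < eps:
                    return False
    return True

# Example: the identity function at a = 0, with the rule delta(eps) = eps.
print(check_delta_candidate(lambda x: x, 0.0, 0.0,
                            lambda eps: eps, (1.0, 0.1, 1e-4)))  # True
```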


What we are doing when we use the $\epsilon - \delta$ definition to prove a limit:

By showing that this $\ \delta$ exists for all $ \epsilon > 0$, we've shown that no matter how small we make our "error distance" $\epsilon$, we can always find a corresponding $\delta$ ...

The above paragraph is just explaining the $\forall \epsilon > 0 \; \exists \delta > 0$ part of the definition

Such that the $\delta$ we find guarantees that $f(x)$ lies within the "error distance" $\epsilon$ of $L$

This paragraph above is just explaining this part of the definition :

$0 < |x-a| < \delta \implies |f(x)-L| < \epsilon$


To show why I think neither method I've outlined above is fully rigorous (I could be wrong), I will use both methods to prove a simple limit, and I will add some notes to each proof to show why it is correct, but also why I think it falls short of full mathematical rigor.


Example - Prove: $\lim_{x \to 1} \frac{2+4x}{3} = 2$


Proof Using Method #1

I start out with $0 < |x-a| < \delta$ and use it to prove that it implies $|f(x)-L| <\epsilon$

$$0 < |x-1| < \delta$$ Multiplying through by $\frac{4}{3}$ gives $0 < \frac{4}{3}|x-1| < \frac{4}{3}\delta$, and since $\frac{4}{3}|x-1| = \left|\frac{4x-4}{3}\right| = \left|\frac{4x+2}{3}-2\right|$, this becomes $$0 < \left|\ \frac{4x+2}{3}-2 \ \right| < \frac{4}{3}\delta$$ Letting $f(x) = \frac{4x+2}{3}$ and $L=2$, we can see that we have something similar to what we want to arrive at: $$0 <\left|f(x)-L\right| < \frac{4}{3}\delta$$ This suggests that we set $\frac{4}{3}\delta = \epsilon$, i.e. $\delta = \frac{3}{4}\epsilon$. With this choice, $$0 < |x-1| < \delta \implies |f(x)-L| < \frac{4}{3}\delta = \epsilon,$$ so the function $\delta(\epsilon) = \frac{3}{4}\epsilon$ does the job.

Therefore we have proven that given any $\epsilon > 0$, we can always find a $\delta>0$ that satisfies the implication $0 <|x-a|<\delta \implies |f(x)-L|<\epsilon$

$$ Q.E.D.$$
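As a quick sanity check (not part of the proof itself), here is a short, self-contained numerical sketch of the $\delta(\epsilon) = \frac{3}{4}\epsilon$ found above on sampled points; it assumes nothing beyond the example:

```python
# Spot-check delta(eps) = 3*eps/4 for lim_{x->1} (2+4x)/3 = 2 on sample points.
f = lambda x: (2 + 4 * x) / 3
a, L = 1.0, 2.0

for eps in (1.0, 0.1, 1e-3, 1e-6):
    delta = 0.75 * eps                # the delta(eps) from the proof
    for k in range(1, 1000):
        dx = delta * k / 1000         # 0 < dx < delta
        assert abs(f(a - dx) - L) < eps
        assert abs(f(a + dx) - L) < eps
print("all sampled points satisfy |f(x) - L| < eps")
```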

Some notes on my proof.

(and on writing delta as a function of epsilon)

Since $\delta$ is dependent on $\epsilon$, I've written $\delta$ as a function of $\epsilon$, i.e. $\delta(\epsilon)$. Sometimes this is seen as a "no-no" when writing $\epsilon - \delta$ proofs, because there exist multiple $\delta$'s satisfying the condition for a given $\epsilon$, not just one.

This would imply that $\delta(\epsilon)$ is not a function, as for any input $\epsilon$ we would have multiple outputs (multiple $\delta$'s)

However, since we are only trying to prove the existence of a single $\delta$ for a given $\epsilon$ (which is all the $\epsilon-\delta$ definition of a limit requires), shouldn't writing $\delta(\epsilon)$ still be considered mathematically rigorous? We are only exhibiting one particular function $\delta(\epsilon)$ whose domain is the set of all possible $\epsilon$'s and whose value at each $\epsilon$ is a single one of the many admissible $\delta$'s, so every input $\epsilon$ has exactly one output $\delta$.

For each $\epsilon$, $$\delta(\epsilon) \ \in \ \{\text{all possible } \delta\text{'s for the given } \epsilon\}$$

Furthermore, that function $\delta(\epsilon)$ is well-defined over its domain ($\forall \epsilon \in \mathbb{R^{+}}$) precisely because we single out one admissible $\delta$ for each input $\epsilon$, so any input always yields exactly one output, as with the function $\delta(\epsilon) = \frac{3}{4}\epsilon$ in the example above. So it really should be mathematically rigorous to write $\delta(\epsilon)$, provided we are only claiming the existence of a single $\delta$ for any given $\epsilon > 0$, correct?

If not, what are possible ways to work around this and introduce further mathematical rigor into this last step?


Proof Using Method #2

(credit must go to StackExchange user Daniel W. Farlow for writing this as succinctly as possible)

Proof. Given $\epsilon>0$, choose $\delta=\frac{3}{4}\epsilon$. If $|x-1|<\delta$, then $$ \left|\frac{2+4x}{3}-2\right|=\frac{4}{3}|x-1|<\frac{4}{3}\delta=\frac{4}{3}\left(\frac{3}{4}\epsilon\right)=\epsilon. $$ Thus, if $|x-1|<\delta$, then $\left|\frac{2+4x}{3}-2\right|<\epsilon$. Therefore, by the definition of a limit, $\lim_{x\to 1}\frac{2+4x}{3}=2$, as desired. $$Q.E.D.$$

Some notes on this proof

We start off with an arbitrary $\epsilon > 0$ and use it to exhibit a $\delta$. We then need to check that the value of $\delta$ we have "chosen" keeps $f(x)$ within our error distance $\epsilon$ of $L$, i.e. we have to show that our chosen $\delta$ gives $0 < |x-a| < \delta \implies |f(x)-L| < \epsilon$.

But the part that makes me think this is not fully mathematically rigorous is the act of "choosing a value" for $\delta$. By choosing a value for $\delta$, we restrict ourselves to a small subset of the set of all possible values $\delta$ can take for a given $\epsilon$, as there are many possible $\delta$'s for a given $\epsilon$. What about the other possible values of $\delta$? Do we just disregard them? I realize that the definition requires us to prove the existence of only a single $\delta$, but it doesn't "feel" rigorous (if you can say that) to omit all other possible values of $\delta$.


What my question boils down to.

Question 1: In both proof methods #1 and #2, we restrict ourselves to proving that one $\delta$ exists out of the set of all possible $\delta$'s for a given $\epsilon$. I realize that the definition only requires us to prove the existence of a single $\delta$ for a given $\epsilon$, but it doesn't seem fully rigorous to me (perhaps I'm wrong and my concept of mathematical rigor is incorrect?) to disregard and omit proving the existence of the other possible $\delta$'s. Are there other, perhaps more advanced, proof methods in Real Analysis that add extra mathematical rigor and allow us to prove the existence of multiple $\delta$'s for a given $\epsilon$? If so, I would really like to hear about them.

Question 2: Furthermore, from what I can see, proof methods #1 and #2 are essentially the same, tackling the proof from different starting points. I also believe them to be of the same level of mathematical rigor (correct me if I'm wrong), because, as I outlined in the notes under each proof method, both prove the existence of a single $\delta$ for a given $\epsilon > 0$. Am I correct in this assumption? Are proof methods #1 and #2 of the same level of mathematical rigor, or is one more rigorous than the other?

There are 5 answers below.

Best Answer

When $\delta(\varepsilon)$ is written as you have above, it is merely a notational reminder that our choice of $\delta$ has to depend on the $\varepsilon$ we're given -- nothing more. In fact, $\delta$ also depends on $f$, $a$, and $L$. Writing $\delta(\varepsilon)$ does not mean that $\delta$ is a function to which we may plug in $\varepsilon$ to get our limit-satisfying $\delta$-value. In that vein, the also-common notation $\delta_\varepsilon$ could be argued to be better. However, we could create an actual function which acts in the spirit of the aforementioned $\delta(\varepsilon)$ and addresses your objection that we're "throwing out" other perfectly good values of $\delta$. This is most vivid if we restrict our attention to the following setup.

Let $A \subseteq \mathbb R$ be open and $f \colon A \longrightarrow \mathbb R$ have limit $L$ at $a$: $$ \lim_{x \to a} f(x) = L. $$

We may define \begin{align} \begin{split} \delta_*(\varepsilon) &= \sup\{ \delta > 0 : a-\delta < x < a \implies |f(x) - L| < \varepsilon \}, \\ \delta^*(\varepsilon) &= \sup\{ \delta > 0 : a< x < a + \delta \implies |f(x) - L| < \varepsilon \}, \end{split} \tag{1} \end{align} with the idea that $\delta_*(\varepsilon)$ tells you how far left of $a$ you can let $x$ go while keeping $|f(x) - L| < \varepsilon$, and $\delta^*(\varepsilon)$ tells you how far right of $a$ you can let $x$ go while keeping $|f(x) - L| < \varepsilon$. (We know that $\delta_*, \delta^* > 0$ exist because the sets on the RHS of $(1)$ are nonempty according to the limit definition.) Hence the largest open $x$-interval for which $|f(x) - L| < \varepsilon$ is $$ X(\varepsilon) = \big( a - \delta_*(\varepsilon), a + \delta^*(\varepsilon) \big). $$

An issue here is that $X(\varepsilon)$ is not (necessarily) symmetric about $a$, so it doesn't (necessarily) correspond to $|x - a| < \delta$ for any $\delta$. To remedy this, we define $\hat \delta (\varepsilon) = \min\{\delta_*(\varepsilon), \delta^*(\varepsilon)\}$; then any $x$ in the interval $$ X'(\varepsilon) = \big( a - \hat \delta(\varepsilon), a + \hat \delta (\varepsilon) \big) $$ will satisfy $|f(x) - L| < \varepsilon$. Note that $X'(\varepsilon) = \{ x : |x - a| < \hat \delta(\varepsilon)\}$, and hence any $\delta$ in the interval $I(\varepsilon) = \big( 0, \hat \delta(\varepsilon) \big]$ satisfies the $\varepsilon$-$\delta$ definition of our limit. Moreover, $I(\varepsilon)$ is the largest set of values of $\delta$ that will work for a given $\varepsilon$. In other words:

$\delta$ satisfies the $\varepsilon$-$\delta$ definition $\iff \delta \in I(\varepsilon)$.

This answers your Question 1. A proof of this follows @grand_chat's answer. Note that $I(\varepsilon)$ depends on $a$, $f$, and $L$ implicitly.

One thing that may bother you is that $X(\varepsilon)$ may be much bigger than $X'(\varepsilon)$, so we're "throwing out perfectly good $x$'s". The $\varepsilon$-$\delta$ definition restricts $X(\varepsilon)$ to a symmetric interval ($X'(\varepsilon)$) about $a$. Does this help address your rigor question?

Of course, satisfying the definition of a limit only requires us to find one such $\delta$. The reason this single-$\delta$ approach is, as you describe, the method preferred by professors and textbooks is the existence of complicated functions $f$ that make computing $I(\varepsilon)$ very difficult: it amounts to solving $f(x) = L \pm \varepsilon$ for $x$, which is inverting $f$. Since they don't need to find all of $I(\varepsilon)$, just a single point in it, they opt for less work.
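To illustrate this inversion (a sketch I'm adding, not part of the original answer): for a monotone $f$ one can solve $f(x) = L \pm \varepsilon$ exactly. Taking the hypothetical example $f(x) = x^2$ near $a = 1$ (so $L = 1$), the one-sided reaches $\delta_*(\varepsilon)$ and $\delta^*(\varepsilon)$ from $(1)$ come out in closed form:

```python
import math

# Sketch: for f(x) = x^2 near a = 1 (so L = 1), solving f(x) = L -/+ eps
# exactly gives the one-sided reaches delta_* and delta^* defined in (1).
a = 1.0

def one_sided_deltas(eps):
    # Left of a:  |x^2 - 1| < eps forces x > sqrt(1 - eps),
    #             so delta_*(eps) = a - sqrt(1 - eps).
    # Right of a: |x^2 - 1| < eps forces x < sqrt(1 + eps),
    #             so delta^*(eps) = sqrt(1 + eps) - a.
    assert 0 < eps < 1
    return a - math.sqrt(1 - eps), math.sqrt(1 + eps) - a

for eps in (0.5, 0.1, 0.01):
    d_lo, d_hi = one_sided_deltas(eps)
    print(f"eps={eps}: delta_*={d_lo:.6f}, delta^*={d_hi:.6f}, "
          f"delta_hat={min(d_lo, d_hi):.6f}")
# Here delta_* > delta^*, so X(eps) is not symmetric about a, and
# I(eps) = (0, delta_hat] with delta_hat = delta^*.
```

For the linear example in the question the two sides coincide, $\delta_*(\varepsilon) = \delta^*(\varepsilon) = \frac{3}{4}\varepsilon$, which matches the remark below.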

Your example of a "linear" function happens to be one in which the imprecise $\delta(\varepsilon)$ which people often write coincides with $\delta_*(\varepsilon) = \delta^*(\varepsilon) = \hat \delta (\varepsilon)$ in a quite canonical way, which may deceive people into believing some property of uniqueness for $\delta(\varepsilon)$.

Answer

"..from what I can see, proof methods #1, and #2 are essentially the same..". Yes,basically they are.

But concerning the fundamental question of rigor: well, what is more rigorous than using the definition itself to prove the desired result, as the $\epsilon - \delta$ approach does?

You may argue (and my feeling is that your "healthy confusion" stems from this) that the definition itself is weak. Well, in a way Pointwise Convergence is weak: it is weaker than Uniform Convergence. Are you familiar with it yet? Perhaps that will help clarify things a bit.
This answer on MSE (and the whole post) might help with that.

Just a gut feeling on my part, but I think what you seek is indeed a stronger notion of convergence that applies to certain functions, one with better properties, and Uniform Convergence is just that.

(Perhaps this should have had a better place as a comment but it is too long for that.)

Answer

Proof 2 is certainly a valid proof, but is not very enlightening. Where did that value for $\delta$ come from?

Proof 1 is more likely to convince your professor that you have done your work. The first part of the proof, which is fine, says essentially that if $|x-a|<\delta$ then $|f(x)-L|< g(\delta)$, where $g$ is some function of $\delta$. Now, in order to be rigorous, you should really continue with something like this: Thus, choose $\epsilon > 0$, and let $\delta = \frac{3}{4}\epsilon$. Then if $|x-a|<\delta$, the above argument shows that $$|f(x)-L| < \frac{4}{3}\delta = \frac{4}{3}\cdot\frac{3}{4}\epsilon = \epsilon.$$

I believe that this is what you intended in the second part of your Proof 1, but what you've written is not that clear. (To be a Real Proof, it should somewhere contain the language "Choose an arbitrary $\epsilon > 0$", or something close to that.)

All that said, it's clear that you do understand what is going on with $\epsilon-\delta$ proofs, and either proof that you've written is fine except for minor proof-technique details.

Answer

The reason why it's enough to find a single $\delta$ is: If you've found a $\delta$ that meets the requirement, then every positive $\delta'$ that is less than $\delta$ also meets the requirement. Reason: If $|x-a|<\delta'$ and $\delta'<\delta$, then for sure $|x-a|<\delta$, so continue on to the conclusion regarding $f(x)$ vs $L$ that was justified by the original $\delta$. For this reason (IMO) it is perfectly fine and mathematically rigorous to write $\delta=\delta(\epsilon)$.
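To see this downward-closure concretely on the question's example (a quick numerical sketch I'm adding, not part of the answer; the shrink factors are arbitrary):

```python
# If delta = 3*eps/4 works for lim_{x->1} (2+4x)/3 = 2, then any smaller
# delta' > 0 must also work; spot-check a few shrunken choices of delta.
f = lambda x: (2 + 4 * x) / 3
a, L, eps = 1.0, 2.0, 0.1
delta = 0.75 * eps
for shrink in (1.0, 0.5, 0.1, 0.01):
    d = shrink * delta
    # sample near the edge of the shrunken window, the worst case
    for x in (a - 0.999 * d, a + 0.999 * d):
        assert abs(f(x) - L) < eps
print("smaller deltas still satisfy the definition")
```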

Answer

Like MathematicianByMistake I also feel that the word "rigorous" doesn't really capture the distinction you're drawing, but I can't discern your intent well enough to offer a substitute. My guess is that you really mean "quantitative", in the sense that you are concerned not just with the existence of $\delta$ but in knowing the largest possible value of $\delta$ for each $\epsilon$.

This approach may seem fine for toy examples, but as the function grows more complex you incur a lot of extra baggage to determine the optimal $\delta$ values for each $\epsilon$. The beauty of the standard definition of limits is that it abstracts away a lot of this baggage, and captures something essential about the function.

IMO, it would be a mistake to confuse quantitativeness with "rigor" or "mathematicalness": in fact, one gets a lot of power from shifting the definition of continuity even further away from quantitativeness... that direction leads to the important field of topology.