I am trying to find the most mathematically rigorous way to prove limits using the $\epsilon$-$\delta$ definition of a limit. So far I have found two clear-cut methods of proving limits with this definition.
When attempting to prove that an arbitrary limit exists:
$$\lim_{x \ \to \ a} f(x) = L$$
The two methods are
- Finding a function $\delta(\epsilon)$ that satisfies the implication $0 <|x-a| < \delta \implies |f(x)-L| < \epsilon \ $ (a personal method that I find useful and intuitive, albeit a bit long)
- Given an arbitrary $\epsilon > 0$, we "choose" a value of $\delta$, which will vary depending on our function, such that if $0 < |x-a| < \delta$, our chosen value of $\delta$ implies $|f(x)-L| < \epsilon \ $ (this is the method most professors, including my own, and most books on Real Analysis prefer to use)
The crux of both methods is that we try to find a value for $\delta$ (i.e. to show that it exists), a distance from the limit point $a$, which guarantees that $f(x)$ lies within our "error distance" $\epsilon$ of the limit $L$, for all values of $\epsilon > 0$ (i.e. for all possible "error distances").
What we are doing when we use the $\epsilon - \delta$ definition to prove a limit:
By showing that this $\ \delta$ exists for all $ \epsilon > 0$, we've shown that no matter how small we make our "error distance" $\epsilon$, we can always find a corresponding $\delta$ ...
The above paragraph is just explaining the $\forall \epsilon>0 \ \exists \delta > 0$ part of the definition
Such that the $\delta$ we find guarantees that $f(x)$ lies within our "error distance" $\epsilon$ of the limit $L$
This paragraph above is just explaining this part of the definition:
$0 < |x-a| < \delta \implies |f(x)-L| < \epsilon$
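As a quick aside, here is a small Python sketch, purely my own illustration, that spot-checks this implication numerically for a candidate rule $\epsilon \mapsto \delta(\epsilon)$. The names `check_limit` and `delta_of_eps` are made up for the sketch, and passing the check is only evidence, not a proof, since a proof must cover all $x$ at once:

```python
import random

def check_limit(f, a, L, delta_of_eps, eps_values, trials=10_000):
    """Spot-check: 0 < |x - a| < delta  =>  |f(x) - L| < eps,
    for each eps, at randomly sampled x near a. Evidence, not proof."""
    for eps in eps_values:
        delta = delta_of_eps(eps)
        for _ in range(trials):
            x = a + random.uniform(-delta, delta)      # sample near a
            if 0 < abs(x - a) < delta and abs(f(x) - L) >= eps:
                return False, (eps, x)                 # counterexample found
    return True, None

# Example: f(x) = x**2 near a = 2 with L = 4, using delta = min(1, eps/5),
# which works since |x^2 - 4| = |x-2||x+2| < 5*delta <= eps when |x-2| < delta <= 1.
ok, bad = check_limit(lambda x: x * x, 2, 4,
                      lambda e: min(1.0, e / 5), [1.0, 0.1, 1e-3])
print(ok)  # True: no counterexample found in the samples
```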
To show why I think neither method I've outlined above is fully rigorous (I could be wrong), I will use both methods to prove a simple limit, and I will add some notes to each proof to show why they are correct, but also why they fall short of full mathematical rigor.
Example - Prove: $\lim_{x \to 1} \frac{2+4x}{3} = 2$
Proof Using Method #1
I start out with $0 < |x-a| < \delta$ and manipulate it until I can read off a $\delta$ that implies $|f(x)-L| <\epsilon$
$$0 < |x-1| < \delta$$ Multiplying through by $\frac{4}{3}$ gives $0 < \frac{4}{3}|x-1| < \frac{4}{3}\delta$, and since $\frac{4}{3}|x-1| = \left|\frac{4x-4}{3}\right| = \left|\frac{4x+2}{3} - 2\right|$, we arrive at $$0 < \left|\ \frac{4x+2}{3}-2 \ \right| < \frac{4}{3}\delta$$ Letting $f(x) = \frac{4x+2}{3}$ and $L=2$, we can see that we have something similar to what we want to arrive at $$0 <\left|f(x)-L\right| < \frac{4}{3}\delta$$ This suggests that we choose $\frac{4}{3}\delta = \epsilon$, i.e. $\delta = \frac{3}{4}\epsilon$, so that $$|f(x)-L| < \frac{4}{3}\delta = \epsilon, \qquad \text{with } \delta(\epsilon) = \frac{3}{4}\epsilon$$
Therefore we have proven that given any $\epsilon > 0$, we can always find a $\delta>0$ (namely $\delta = \frac{3}{4}\epsilon$) such that $0 <|x-a|<\delta \implies |f(x)-L|<\epsilon$
$$ Q.E.D.$$
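Reusing `check_limit` from the illustrative sketch after the definition above, the rule $\delta(\epsilon) = \frac{3}{4}\epsilon$ just derived passes the numerical spot-check for this example (again, evidence only; the algebra above is the actual proof):

```python
# Reusing check_limit from the earlier sketch:
ok, bad = check_limit(lambda x: (2 + 4 * x) / 3, 1, 2,
                      lambda e: 3 * e / 4, [1.0, 0.1, 1e-4])
print(ok)  # True: no sampled x violates |f(x) - L| < eps
```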
Some notes on my proof.
(and on writing delta as a function of epsilon)
Since $\delta$ is dependent on $\epsilon$, I've written $\delta$ as a function of $\epsilon$, i.e. $\delta(\epsilon)$. Sometimes this is seen as a "no-no" when writing $\epsilon$-$\delta$ proofs, because for a given $\epsilon$ there exist multiple $\delta$'s satisfying the condition, not just one
This would imply that $\delta(\epsilon)$ is not a function, since for any input $\epsilon$ we would have multiple outputs (multiple $\delta$'s)
However, since we are only trying to prove the existence of a single $\delta$ for a given $\epsilon$ (which is all the $\epsilon$-$\delta$ definition of a limit requires), shouldn't writing $\delta(\epsilon)$ still be considered mathematically rigorous? We are, after all, only specifying a function whose domain is the set of all possible $\epsilon$'s and which, for each input $\epsilon$, selects exactly one output $\delta$ from the set of all possible $\delta$'s:
$$\delta(\epsilon) \ \in \ \{ \text{all possible } \delta\text{'s for the given } \epsilon \}$$
Furthermore, that function $\delta(\epsilon)$ is well-defined over its domain ($\forall \epsilon \in \mathbb{R^{+}}$) precisely because we restrict to a single choice out of all possible $\delta$'s: for any input $\epsilon$ we always get exactly one output $\delta$, as with the function $\delta(\epsilon) = \frac{3}{4}\epsilon$ in the example above. So it really should be mathematically rigorous to write $\delta(\epsilon)$ as long as we are only showing the existence of a single $\delta$ for any given $\epsilon > 0$, correct?
If not, what are possible ways to work around this and introduce further mathematical rigor into this last step?
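To make this last question precise, one can phrase it in terms of the set of workable $\delta$'s for each $\epsilon$ (a formalization added for clarity, not a new claim): define
$$D_\epsilon = \{\, \delta > 0 : 0 < |x-a| < \delta \implies |f(x)-L| < \epsilon \,\}.$$
The definition of the limit demands exactly that $D_\epsilon \neq \emptyset$ for every $\epsilon > 0$, and a rule such as $\delta(\epsilon) = \frac{3}{4}\epsilon$ is a choice function selecting one element $\delta(\epsilon) \in D_\epsilon$ for each $\epsilon$; it is single-valued by construction, however large $D_\epsilon$ may be.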
Proof Using Method #2
(credit must go to StackExchange user Daniel W. Farlow for writing this as succinctly as possible)
Proof. Given $\epsilon>0$, choose $\delta=\frac{3}{4}\epsilon$. If $|x-1|<\delta$, then $$ \left|\frac{2+4x}{3}-2\right|=\frac{4}{3}|x-1|<\frac{4}{3}\delta=\frac{4}{3}\left(\frac{3}{4}\epsilon\right)=\epsilon. $$ Thus, if $|x-1|<\delta$, then $\left|\frac{2+4x}{3}-2\right|<\epsilon$. Therefore, by the definition of a limit, $\lim_{x\to 1}\frac{2+4x}{3}=2$, as desired. $$Q.E.D.$$
Some notes on this proof
We start off with an arbitrary $\epsilon > 0$ and use it to show that a $\delta$ exists. We then need to check that the value of $\delta$ we have "chosen" really does force $f(x)$ to lie within our error distance $\epsilon$ of $L$, i.e. we have to show that the value of $\delta$ we have chosen satisfies $0 < |x-a| < \delta \implies |f(x)-L| < \epsilon$.
But the part that makes me think this is not fully mathematically rigorous is the act of "choosing" a value for $\delta$. By choosing a value for $\delta$, we restrict ourselves to a small subset of the set of all possible values $\delta$ can take for a given $\epsilon$, since there are many possible $\delta$'s for a given $\epsilon$. What about the other possible values of $\delta$; do we just disregard them? I realize that the definition requires us to prove the existence of only a single $\delta$, but it doesn't "feel" rigorous (if you can say that) to omit all the other possible values of $\delta$.
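One partial reassurance I can see (a standard observation): if some particular $\delta_0$ works for a given $\epsilon$, then every smaller $\delta$ works automatically, since
$$0 < |x-a| < \delta \le \delta_0 \implies |f(x)-L| < \epsilon.$$
So exhibiting a single $\delta_0$ implicitly certifies the entire interval $(0, \delta_0]$ of workable values, even though the proof never mentions them. Still, my questions below remain.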
What my question boils down to.
Question 1: In both proof methods #1 and #2, we restrict ourselves to proving that only one $\delta$ exists out of the set of all possible $\delta$'s for a given $\epsilon$. I realize that the definition requires us only to prove the existence of a single $\delta$ for a given $\epsilon$, but it doesn't seem fully rigorous to me (perhaps I'm wrong and my concept of mathematical rigor is incorrect?) to disregard and omit proving the existence of the other possible $\delta$'s. Are there other, perhaps more advanced, proof methods in Real Analysis that add extra mathematical rigor and allow us to prove the existence of multiple $\delta$'s for a given $\epsilon$? If so, I would really like to hear about them.
Question 2: Furthermore, from what I can see, proof methods #1 and #2 are essentially the same, tackling the proof from different starting points. I also believe them to be of the same level of mathematical rigor (correct me if I'm wrong) because, as I outlined in the notes under each proof method, both prove the existence of a single $\delta$ for a given $\epsilon > 0$. Am I correct in this assumption? Are proof methods #1 and #2 of the same level of mathematical rigor, or is one proof method more rigorous than the other?
When $\delta(\varepsilon)$ is written as you have above, it is merely a notational reminder that our choice of $\delta$ has to depend on the $\varepsilon$ we're given -- nothing more. In fact, $\delta$ also depends on $f$, $a$, and $L$. Writing $\delta(\varepsilon)$ does not mean that $\delta$ is a function into which we may plug $\varepsilon$ to get our limit-satisfying $\delta$-value. In that vein, the also-common notation $\delta_\varepsilon$ could be argued to be better. However, we could create an actual function which acts in the spirit of the aforementioned $\delta(\varepsilon)$ and addresses your objection that we're "throwing out" other perfectly good values of $\delta$. This is most vivid if we restrict our attention to the following setup: suppose $\lim_{x \to a} f(x) = L$.
We may define \begin{align} \begin{split} \delta_*(\varepsilon) &= \sup\{ \delta > 0 : a-\delta < x < a \implies |f(x) - L| < \varepsilon \}, \\ \delta^*(\varepsilon) &= \sup\{ \delta > 0 : a< x < a + \delta \implies |f(x) - L| < \varepsilon \}, \end{split} \tag{1} \end{align} with the idea that $\delta_*(\varepsilon)$ tells you how far left of $a$ you can let $x$ go while keeping $|f(x) - L| < \varepsilon$, and $\delta^*(\varepsilon)$ tells you how far right of $a$ you can let $x$ go while keeping $|f(x) - L| < \varepsilon$. (We know that $\delta_*, \delta^* > 0$ exist because those sets on the RHS of $(1)$ are nonempty according to the limit definition.) Hence the largest open $x$-interval on which $|f(x) - L| < \varepsilon$ (for $x \neq a$) is $$ X(\varepsilon) = \big( a - \delta_*(\varepsilon), a + \delta^*(\varepsilon) \big). $$ An issue here is that $X(\varepsilon)$ is not (necessarily) symmetric about $a$, so it doesn't (necessarily) correspond to $|x - a| < \delta$ for any $\delta$. To remedy this, we define $\hat \delta (\varepsilon) = \min\{\delta_*(\varepsilon), \delta^*(\varepsilon)\}$; then any $x$ in the interval $$ X'(\varepsilon) = \big( a - \hat \delta(\varepsilon), a + \hat \delta (\varepsilon) \big) $$ will satisfy $|f(x) - L| < \varepsilon$. Note that $X'(\varepsilon) = \{ x : |x - a| < \hat \delta(\varepsilon)\}$, and hence any $\delta$ in the interval $I(\varepsilon) = \big( 0, \hat \delta(\varepsilon) \big]$ satisfies the $\varepsilon$-$\delta$ definition of our limit. Moreover, $I(\varepsilon)$ is the largest set of values of $\delta$ that will work for a given $\varepsilon$. In other words:
For a given $\varepsilon$, the set of all $\delta > 0$ that witness the $\varepsilon$-$\delta$ definition of the limit is precisely $I(\varepsilon) = \big( 0, \hat \delta(\varepsilon) \big]$. This answers your Question 1. A proof of this follows from @grand_chat's answer. Note that $I(\varepsilon)$ depends on $a$, $f$, and $L$ implicitly.
One thing that may bother you is that $X(\varepsilon)$ may be much bigger than $X'(\varepsilon)$, so we're "throwing out perfectly good $x$'s". The $\varepsilon$-$\delta$ definition restricts $X(\varepsilon)$ to a symmetric interval ($X'(\varepsilon)$) about $a$. Does this help address your rigor question?
Of course, satisfying the definition of a limit only requires us to find one such $\delta$. The reason this is, as you describe, the method preferred by professors and textbooks is that complicated functions $f$ can make computing $I(\varepsilon)$ very difficult: it amounts to solving $f(x) = L \pm \varepsilon$ for $x$, i.e. inverting $f$. Since they don't need to find all of $I(\varepsilon)$, but just a single point in it, they opt for less work.
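Numerically, though, $\hat \delta(\varepsilon)$ can be approximated without inverting $f$ symbolically. Here is a minimal brute-force Python sketch, purely illustrative and not part of the original discussion; the name `delta_hat`, the grid size `n`, and `search_radius` are arbitrary choices, and a finite grid scan gives an approximation, not a proof:

```python
import numpy as np

def delta_hat(f, a, L, eps, search_radius=10.0, n=200_000):
    """Approximate delta-hat(eps) = min(delta_*, delta^*): the largest
    symmetric half-width d with 0 < |x - a| < d  =>  |f(x) - L| < eps.
    Brute-force grid scan on each side of a; illustration only."""
    ts = np.linspace(0.0, search_radius, n)[1:]      # positive offsets from a
    bad_left = np.abs(f(a - ts) - L) >= eps          # eps-band violated left of a
    bad_right = np.abs(f(a + ts) - L) >= eps         # ... and right of a
    d_left = ts[bad_left][0] if bad_left.any() else search_radius
    d_right = ts[bad_right][0] if bad_right.any() else search_radius
    return min(d_left, d_right)

# The example from the question: f(x) = (2 + 4x)/3, a = 1, L = 2.
f = lambda x: (2 + 4 * x) / 3
for eps in (1.0, 0.1, 0.01):
    print(eps, delta_hat(f, 1.0, 2.0, eps), 3 * eps / 4)
```

For this linear example the approximation lands on $3\varepsilon/4$ up to grid resolution, matching the $\delta(\varepsilon)$ found in both of your proofs.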
Your example of a "linear" function happens to be one in which the imprecise $\delta(\varepsilon)$ which people often write coincides with $\delta_*(\varepsilon) = \delta^*(\varepsilon) = \hat \delta (\varepsilon)$ in a quite canonical way, which may deceive people into believing some property of uniqueness for $\delta(\varepsilon)$.
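To spell this out for your example (a short computation along the lines above): solving $f(x) = L \pm \varepsilon$ with $f(x) = \frac{2+4x}{3}$, $a = 1$, $L = 2$ gives
$$\frac{2+4x}{3} = 2 \pm \varepsilon \iff 4x = 4 \pm 3\varepsilon \iff x = 1 \pm \tfrac{3}{4}\varepsilon,$$
so $\delta_*(\varepsilon) = \delta^*(\varepsilon) = \hat \delta(\varepsilon) = \frac{3}{4}\varepsilon$ and $I(\varepsilon) = \big( 0, \tfrac{3}{4}\varepsilon \big]$. The $\delta(\varepsilon) = \frac{3}{4}\varepsilon$ chosen in both of your proofs is thus the largest $\delta$ that works, not the only one.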