Why do so few authors package up the definition of a limit into a function?


Occasionally, you will literally see people arguing for nonstandard analysis purely to unnest the quantifiers in the definition of a limit. By unnesting, I mean avoiding an existential quantifier that has to sit inside the scope of the universal epsilon quantifier, and instead having only universals, which can be made implicit, together with constraints on the corresponding variables. NSA does this by just constraining $N$ to be infinitely large or $x$ to be infinitely close to $x_0$.

But there's a much simpler way to clean things up so that the quantifiers are no longer nested. Just define functions:

  1. A sequence $a_n$ in a metric space is convergent with limit $L$ if there is a function $N : \mathbf{R^+} \to \mathbf{N}$ such that $d(a_n,L) < \varepsilon$ for all $\varepsilon > 0$ and $n > N(\varepsilon)$.
  2. A sequence $a_n$ in a metric space is a Cauchy sequence if there is a function $N : \mathbf{R^+} \to \mathbf{N}$ such that $d(a_n,a_m) < \varepsilon$ for all $\varepsilon > 0$ and $n,m > N(\varepsilon)$.
  3. A function $f : X \to Y$ between metric spaces has the limit $L = \lim_{x \to x_0} f(x)$ if there is a function $\delta : \mathbf{R^+} \to \mathbf{R^+}$ so that $d(f(x),L) < \varepsilon$ for any $\varepsilon > 0$ and $d(x,x_0) < \delta(\varepsilon)$.

which lets you write the definitions with only universal quantifiers that can be floated out to top level, so that the whole statement is just one inequality between top-level variables implying another.
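As a quick sanity check of the function-based definition (my own illustration, not from the post): take the concrete sequence $a_n = 1/(n+1)$ with limit $0$; the modulus $N(\varepsilon) = \lceil 1/\varepsilon \rceil$ witnesses convergence, and the definition becomes a single unnested implication that can be spot-checked numerically:

```python
import math

# Concrete sequence a_n = 1/(n+1), which converges to L = 0.
def a(n):
    return 1.0 / (n + 1)

L = 0.0

# A modulus of convergence: n > N(eps) implies n + 1 > 1/eps, so a_n < eps.
def N(eps):
    return math.ceil(1.0 / eps)

# The unnested definition: for all eps > 0 and all n > N(eps), |a_n - L| < eps.
# Spot-check the single implication on a grid of epsilons and indices.
ok = all(
    abs(a(N(eps) + k) - L) < eps
    for eps in (1.0, 0.5, 0.1, 0.01, 0.001)
    for k in range(1, 200)
)
print(ok)  # True
```

Note that no search for $N$ happens inside the check: the quantifier structure is flat, exactly as in the definitions above.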

Furthermore, actually requiring this function to be given (it is sometimes called the modulus of convergence) is both very convenient in epsilon-delta proofs and genuinely necessary for anything related to numerics, computability, or constructive analysis, or whenever you care about "how many terms do I need" questions.
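To illustrate the "how many terms do I need" point (again my own example, not from the post): for the partial sums of the geometric series $\sum_k 2^{-k}$, the tail is exactly $2^{-n}$, so $N(\varepsilon) = \lceil \log_2(1/\varepsilon) \rceil$ is a modulus, and it answers the numerical question directly:

```python
import math

# Partial sums s_n = sum_{k=1}^n 2^{-k}, converging to 1 with |s_n - 1| = 2^{-n}.
def s(n):
    return sum(2.0 ** -k for k in range(1, n + 1))

# Modulus: 2^{-n} < eps as soon as n > log2(1/eps).
def N(eps):
    return math.ceil(math.log2(1.0 / eps))

# "How many terms for an error below 1e-6?" -- just evaluate the modulus.
terms = N(1e-6)
print(terms)                           # 20
print(abs(s(terms + 1) - 1.0) < 1e-6)  # True
```

With only the bare existential statement "the series converges" there is nothing to evaluate; the modulus is what makes the question computable.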

So this made me curious. Is there any fundamental reason why you would want to write the definition of a limit in terms of nested quantifiers? I'd say tradition, but afaik first order logic is quite a bit newer than the definition of a limit. How did the 19th century analysts state the definition?


There are 2 best solutions below


Having thought about it, here's a possible "logical" answer that may have something to do with it.

As Mark S said in his comment to the question, a proposition written in a form where the quantifiers are foralls that can be floated to top level is said to be in Skolem normal form.
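Spelling that out for the convergence definition from the question (my rendering): with the modulus $N$ as a free function symbol, both remaining quantifiers are universal and float to the front,

$$\forall \varepsilon\, \forall n\, \big(\varepsilon > 0 \wedge n > N(\varepsilon) \rightarrow d(a_n, L) < \varepsilon\big),$$

a prenex formula with a purely universal prefix, i.e. Skolem normal form.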

This is an advantage primarily because the quantifiers can be made implicit and the entire proof ends up only needing propositional logic. This is convenient in $\varepsilon - \delta$ proofs because it allows you to reuse epsilons in many cases.

This is in contrast to many proofs written by analysts which tend to be in continuation passing style. This means that analysts then have to either keep defining new epsilons in the scope of the old ones, or use proofs by contradictions or blanket existence statements to "escape" the continuation.

...and this might be the actual reason why. Analysts write proofs in CPS style because of the definition of limits, and those continuations often get escaped with proofs by contradiction in order to keep proof steps short and modular. That leaves them with more nonconstructive existence lemmas, which are contagious and lead to more CPS-style proofs.

Since most results are then stated in terms of existence without error terms, and those statements are contagious (in some cases you need choice to escape them), citing existing results forces you to keep writing proofs in the same style. Analysts thus end up building a stronger tolerance to that specific proof style than mathematicians in other fields, and simply get used to it.


I think your proposed answer of the 'problem' being a lack of focus on constructive mathematics is off the mark. At least in some scenarios, there are constructive differences between the alternating quantifier definitions and the ones using functions.

For your first two examples, consider the following propositions:

  1. All constructive functions are continuous
  2. Every continuous function from $\mathbb R$ to $\mathbb N$ is constant

I don't think restricting yourself to the positive reals changes 2. Anyhow, these two together indicate that your $N$ cannot actually depend on $ε$ constructively, which is unlikely to be what you want, I think.
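To unpack that step (my paraphrase of the argument): a constructive modulus $N : \mathbb{R}^+ \to \mathbb{N}$ would be continuous by 1, hence constant by 2, say $N \equiv N_0$. Plugging a constant modulus into the definition gives

$$\forall \varepsilon > 0,\ \forall n > N_0:\ d(a_n, L) < \varepsilon,$$

and letting $\varepsilon \to 0$ forces $a_n = L$ for every $n > N_0$. So an $\varepsilon$-dependent modulus, read naively as a constructive function on $\mathbb{R}^+$, would make every convergent sequence eventually equal to its limit.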

To examine why the above works out the way it does, I think the third example is helpful. The traditional propositions-as-types way to interpret the ε-δ definition is like:

$$Π_{ε:\mathbb R}Σ_{δ:\mathbb R} ...$$

which is easily shown to yield:

$$Σ_{δ : ℝ → ℝ} ...$$

However, developments like homotopy type theory have shown that this interpretation is not as precise as it could be. This $Σ$ type is not a proposition, in the sense that not all of its values are equivalent. So, to interpret the ε-δ statement more precisely, we should use the propositional truncation:

$$Π_{ε:ℝ}\Vert Σ_{δ:ℝ} ...\Vert$$

And it turns out that the principle:

$$ (Π_{ε:ℝ} \Vert Σ_{δ:ℝ} ...\Vert) → \Vert Σ_{δ : ℝ → ℝ} ...\Vert$$

is a genuine, non-constructive axiom of choice.

The thread tying the three examples together is $ℝ$ as a quotient type, and propositional equivalence allowing more freedom of choice than functions. In order for a function with type $ℝ → ℕ$ to respect the quotient structure of the real numbers, the choice of the natural number cannot depend on the detail of how the real number is approximated, and this ends up implying that the entire function is actually constant.

However, all proofs of a proposition are considered equal regardless of the witness used to prove them; this is what the truncation is accomplishing, and in this case, I think it allows you to choose varying naturals based on the details of the approximation, in a way that is not valid for functions from $ℝ$ to $ℕ$. A similar thing can happen with the ε-δ statement, but since $ℝ$ is not discrete like $ℕ$, the discrepancy is more subtle.

I'm not really very knowledgeable about the exact methodology of constructive analysis. You could get around the above by stating things in terms of approximants rather than the quotient. But if you want to be really precise about actually using $ℝ$, and state the theorems in the usual way, you're going to have to be careful to use actual propositional statements, and to pay attention to the difference between propositional $∀$-$∃$ and functions, even constructively. This shows up in scenarios other than HoTT, too, like realizability toposes, where interpretations of $∀$-$∃$ are less constrained than functions in general (although weak forms of the axiom of choice may hold, like countable choice).