Variable Condition in Typed Lambda Calculus


tl;dr

In what way does the variable condition for the typing of $\forall_x$, carried over by the $\forall_x$-introduction rule, limit the type system? Is there more to it than keeping it in line with minimal logic?

Context

At the moment I'm trying to make myself more familiar with natural deduction, λ-calculus and such, using Proofs and Computations (Schwichtenberg, Wainer).

Within Part I, a natural deduction calculus is introduced for minimal logic, containing the $\forall^+ x$ rule ($\forall$-introduction) as follows:

$$ \begin{array}{c c} \hspace{1.4em}\rule[-0.25em]{0.05em}{1.2em}\hspace{0.3em}M\\ A\\ \hline \forall_xA \end{array} $$

where $x$ needs to fulfill the variable side condition

$$\textsf{ty-var-cond}(x, M) \Leftrightarrow x \text{ is not free in any open assumption within }M$$

Considering that this calculus is designed to be complete with respect to first-order minimal semantics, I kind of understand why this condition is needed. A free variable in an open assumption appears somewhere on the left of any $\Gamma \vDash A$ and is therefore designated; it restricts the models in question depending on its interpretation, hence concluding a universal statement, which quantifies over every possible model, would be invalid.

For example

$$\textsf{IsRed}(x) \vDash \forall_x \textsf{IsRed}(x)$$

is not a logical consequence in general. On the left-hand side, $x$ appears as some specific object (in each model checked), while the right-hand side is interpreted by quantifying over every possible object (i.e., over every modification of the model that reinterprets $x$).
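The failure of this consequence can be checked in a tiny finite model; here is a sketch, where the two-element domain and the interpretation of $\textsf{IsRed}$ are chosen purely for illustration:

```haskell
-- Tiny finite-model counterexample: domain {1,2}, IsRed interpreted
-- as {1}. Under the valuation x |-> 1 the assumption IsRed(x) holds,
-- yet the universally quantified conclusion fails.
domain :: [Int]
domain = [1, 2]

isRed :: Int -> Bool
isRed n = n == 1

assumptionHolds :: Bool
assumptionHolds = isRed 1            -- x valuated as the object 1

conclusionHolds :: Bool
conclusionHolds = all isRed domain   -- quantifies over the whole domain
```

Here `assumptionHolds` is `True` while `conclusionHolds` is `False`, matching the informal reading above.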

Question

Now, applying Curry-Howard and obtaining a typed $\lambda$-calculus, $\textsf{ty-var-cond}$ carries over; the term

$$(\lambda_xM^A)^{\forall_x A}$$

is valid if $\textsf{ty-var-cond}(x, M)$ is fulfilled. (Mind the gap: $\lambda$ binds an object, i.e. type, variable here, not a $\lambda$-term variable.)

This seems to put a restriction on the type system to keep it in line with minimal logic.

(a) Is there anything more to this than keeping the type system in line with minimal logic?

(b) Does this side condition restrict the type system to some known feature (rank-1 polymorphism, ...)?

Own Ideas

So, I tried to find issues that arise when the condition is ignored, but I already stumbled at finding a good (semantic) perspective from which to evaluate what I found. A type judgment of the form

$$u : \textsf{IsRed}(x) \vdash (\lambda_x u)^{\forall_x \textsf{IsRed}(x)}$$

containing a free variable in an open assumption (here $u$ is the name of the open assumption and, as before, $x$ is the free type variable) seems to be the simplest one ruled out by the variable condition. But what does it even mean to have a free variable on the left-hand side here? I have always thought of the left-hand side of a type judgment as a kind of prelude of a programming language (e.g. Haskell) that assigns types to primitives; but from this perspective I cannot make sense of a free variable.
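Incidentally, the Haskell analogy can be pushed a bit further: Hindley–Milner let-generalization obeys essentially the same side condition, generalizing a local binding only over type variables that are not free in the environment. A small sketch (all names hypothetical):

```haskell
-- In `g y = (y, x)`, the inferred type of g is generalized only over
-- the type variable of y. The type of x stays monomorphic (here Int),
-- because x is an "open assumption": it is bound in the enclosing scope.
example :: Int -> ((Char, Int), (Bool, Int))
example x =
  let g y = (y, x)     -- g :: forall b. b -> (b, Int); Int not generalized
  in (g 'c', g True)   -- g used at two different instantiations of b
```

The variable of `y`'s type plays the role of an $x$ satisfying $\textsf{ty-var-cond}$, while the type of `x` plays the role of a variable free in an open assumption, which must not be generalized.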

Disclaimer

It's clear to me that there may be no answer, or several, but certainly no single "right" answer, since I'm asking for a justification that fits into my personal network of meanings... nevertheless, I hope Stack Exchange is fine with that.

Thanks in advance.


Best Answer

It doesn't really have anything particular to do with minimal logic. Indeed, to get full intuitionistic or classical logic, the rule wouldn't change at all, you'd simply add other connectives and rules. In some sense the $\forall I$ rule captures (one half of) what "universal quantification" means in general.

As you've already started to see, it really doesn't even make sense to drop the side-condition. If you slightly reorganize how you view things, you can eliminate the need for the side-condition and present a much more coherent view of what is going on. The key change is to only consider terms with respect to a variable context. These are known as terms-in-context, and this induces a notion of formulas-in-context. A variable context is a (finite) set of variables annotated with sorts (i.e. types) in the multi-sorted case. A term in some variable context $V$ is defined just as normal except only variables in $V$ are allowed as variables. In particular, if $V=\varnothing$, the terms are closed. I will tacitly allow a term in context $V$ to be used as a term in context $V'\supseteq V$, though there is some value to making this injection explicit.

Write $\Gamma\vdash_V\varphi$ to mean that $\varphi$ and each $\psi\in\Gamma$ are formulas in context $V$. The $\forall I$ rule can thus be rendered as: $$\cfrac{\Gamma\vdash_V\varphi(x)}{\Gamma\vdash_{V\setminus\{x\}}\forall x.\varphi(x)}$$ No side-conditions are needed. For the bottom to be well-formed, it must be the case that $x$ is not free in any $\psi\in\Gamma$. Actually, I would prefer to write this as: $$\cfrac{\Gamma\vdash_{V\cup\{x\}}\varphi(x)\qquad x\notin V}{\Gamma\vdash_V\forall x.\varphi(x)}$$ which has a bit of a side-condition, but it's really just saying that $x$ is a "new" or "fresh" variable, and this is mostly an artifact of using named variables.
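The notion of terms-in-context can also be sketched operationally. The following is a minimal, hypothetical first-order syntax; a term is well formed in a variable context $V$ iff every variable it mentions is listed in $V$:

```haskell
-- A sketch of terms-in-context: Tm is a hypothetical first-order term
-- syntax, and wellScoped checks membership in the variable context.
data Tm = Var String | Fn String [Tm]
  deriving Show

wellScoped :: [String] -> Tm -> Bool
wellScoped ctx (Var x)   = x `elem` ctx
wellScoped ctx (Fn _ ts) = all (wellScoped ctx) ts
```

For example, `wellScoped ["x"] (Var "x")` holds while `wellScoped [] (Var "x")` does not; a nullary function symbol such as `Fn "c" []` (a constant) is well scoped even in the empty context.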

This "term-/formula-in-context" idea also interfaces better with semantics. The meaning, i.e. interpretation, of a formula varies depending on its variable context, and since not all variables in a variable context have to occur in the formula these differences can be otherwise invisible. For example, the formula $x=x$ is interpreted as the domain set $D$ when viewed as having the variable context $\{x\}$. If it is viewed as having the variable context $\{x,y\}$, then it is interpreted as $D\times D$.
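This context dependence can be made concrete by computing the set of satisfying environments over a small finite domain (the two-element domain is an arbitrary choice for illustration):

```haskell
-- Interpreting the formula x = x over the domain D = {1,2}, relative
-- to two different variable contexts. The denotation is the set of
-- satisfying environments, so its shape depends on the context.
domainD :: [Int]
domainD = [1, 2]

inCtxX :: [[Int]]    -- context {x}: one slot per environment
inCtxX = [ [x] | x <- domainD, x == x ]                    -- behaves like D

inCtxXY :: [[Int]]   -- context {x,y}: two slots per environment
inCtxXY = [ [x, y] | x <- domainD, y <- domainD, x == x ]  -- like D x D
```

With $|D| = 2$, the first interpretation has 2 environments and the second has 4, even though the formula is the same.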

At this point it should be clear that the side-condition isn't some artificial "limitation" that can be relaxed. In the corresponding lambda calculus, the side-condition reflects the fact that $\lambda$ is a binding form. Dropping the side-condition would mean allowing the lambda-bound variable to be used outside of its scope.

As a bit of an aside at this point, you say: "$\lambda$ binds a[n] object, i.e. type, variable here, not a $\lambda$-term variable." It's not quite clear to me what you're saying here, but it sounds incorrect. As I mentioned in the comments, for first-order logic we get a dependently typed lambda calculus, most naturally $\lambda\Pi$. In this context, we often write $\forall x:\mathbb N.P(x)$ as $\Pi x:\mathbb N.P(x)$. We then have $(\lambda x:\mathbb N.M(x)):\Pi x:\mathbb N.P(x)$ where $M(n) : P(n)$ for $n:\mathbb N$. At any rate, the $x$ that $\forall/\Pi$ binds is the same $x$ that $\lambda$ binds, and in the example given it will be bound to natural numbers. $A\to B$ is shorthand for $\Pi x:A.B$ (with $x$ not free in $B$), so the simply typed lambda calculus embeds as the sublanguage where no type actually depends on a term. Referring back to your sentence, "object variables" and "$\lambda$-term variables" are the same thing, and there are no "type variables" as that is usually meant.

If you add universes (or rather embed into a dependently typed lambda calculus with universes), you can actually treat types as terms, allowing you to identify (predicative) "type variables" with term variables. In particular, we can view formulas as terms of type $\mathcal U$ where $\mathcal U$ is the universe.

Returning to FOL, I suspect you may be thinking, "this perspective is nice and all, but what does happen if we drop the side-condition in the traditional presentation?" As you already started to notice, dropping the side-condition produces inconsistency/unsoundness, but still, what is actually happening in the traditional presentation? Before I answer that, I'll present what $\forall E$ looks like in my style: $$\cfrac{\Gamma\vdash_V \forall x.\varphi(x)}{\Gamma\vdash_V\varphi(t)}\qquad\text{where }t\text{ is some term in context }V$$ If you're paying very close attention, you may have realized that this means that we can't prove $\vdash_\varnothing(\forall x.\bot)\to\bot$ if the signature has no constant symbols. Why? Because there are no terms in context $\varnothing$ if there are no constant symbols. In other words, in my presentation it is not the case that the domain must be non-empty.

What's happening in the traditional presentation is that variables and constant symbols are partially conflated. Free variables behave as constant symbols, including forcing the domain to be non-empty. This reflects itself in the semantics. Some authors define the satisfaction relation with respect to an initial valuation for all variables. For these authors $\mathsf{IsRed}(x)\vDash\forall x.\mathsf{IsRed}(x)$ would be completely meaningful and would behave exactly like you suggested. That is, the $x$ on the left hand side behaves exactly like a constant symbol.

The side-condition is to avoid cases where a variable is simultaneously reasoned about as a variable and as a constant symbol. The semantic reflection of this would be treating a variable as a projection of a Cartesian product of the domain and as an element of the domain. Suffice it to say, I think this conflation is a mistake that makes it much more difficult to build a coherent picture of what's going on.
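The point about empty contexts can be made concrete by enumerating the terms available in a context: without constant symbols, $\forall E$ simply has no term $t$ to instantiate with at $V=\varnothing$. A sketch, again with a hypothetical minimal syntax:

```haskell
-- Enumerating the (depth-zero) terms available in a context: constants
-- are always available, variables only if listed in the context V.
-- With no constants and V = {}, the enumeration is empty, so the
-- forall-elimination rule has nothing to instantiate with.
data Tm = Var String | Const String
  deriving Show

termsIn :: [String] -> [String] -> [Tm]   -- constant symbols, context V
termsIn consts ctx = map Const consts ++ map Var ctx
```

In particular `termsIn [] []` is the empty list, while adding either a constant symbol or a variable to the context immediately provides a candidate witness for instantiation.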