Wikipedia's explanation of the lambda-calclulus form of Curry's paradox makes no sense

1k Views Asked by At

Wikipedia gives multiple explanations of Curry's paradox, one of which is expressed via lambda calculus.

However, the explanation doesn't look like any lambda calculus I've ever seen, and there's an existing discussion-page section that would appear to indicate I'm not alone.

The proof is given as follows:

Consider a function $r$ defined as

$$r = ( λx. ((x x) → y) )$$

Then $(r r)$ $\beta$-reduces to

$$(r r) → y$$

If $(r r)$ is true then its reduct $(r r) → y$ is also true, and, by modus ponens, so is $y$. If $(r r)$ is false then $(r r) → y$ is true by the principle of explosion, which is a contradiction. So $y$ is true and as $y$ can be any statement, any statement may be proved true.

$(r r)$ is a non-terminating computation. Considered as logic, $(r r)$ is an expression for a value that does not exist.

The last sentence appears to be irrelevant (or perhaps it's intended as a corollary), so I'll ignore it.

The primary question on the talk page is what $→$ means in a lambda expression. I hypothesized that it might be supposed to indicate that $(x → y)$ is an expression that evaluates to $y$ if $x$ is "true" for some meaning of "truth" in lambda calculus (perhaps if $x$ is a Church-numeral greater than 0) and to something else otherwise. But in there's no "else" expression specified, so that doesn't seem right, unless $\beta$-reduction on $(x → y)$ is non-terminating for $x=\bar 0$.

The proof then claims that if $(r r)$ is false (again, it's not clear what "true" or "false" means in this context), then the principle of explosion may be invoked, but it's not clear why $(r r)$ being false would imply a contradiction, so I don't know why the principle of explosion can be invoked here.

Is this demonstration of the paradox correct but simply poorly explained, or is it incorrect?

1

There are 1 best solutions below

5
On

Long Comment

Here is an outline of Curry's original version of the paradox, using combinatory logic [hoping that someone smarter than me can derive the correct "$\lambda$-version"].

See Haskell Curry & Robert Feys & William Craig, Combinatory Logic. Volume I (1958), Ch.8A. THE RUSSELL PARADOX, page 258-59 or Katalin Bimbò, Combinatory Logic : Pure Applied and Typed (2012), Ch.8.1 Illative combinatory logic, page 221-on.

The system is called illative combinatory logic, that extends pure combinatory logic with the inclusion of new constants that, of course, expands the set of CL-terms.

The new symbol [Bimbò, page 221-22]:

$\mathsf P$ is the symbol for $\to$ that is, for implication. [...] $\mathsf Pxx$ is thought of as $x \to x$ , and $\mathsf P(\mathsf Pxy)(\mathsf P(\mathsf Pyz)(\mathsf Pxz))$ is $(x \to y) \to ((y \to z) \to (x \to z))$ .

This formula is a theorem of classical logic (with formulas in place for $x, y$ and $z$).

DEFINITION 8.1.2. Let the set of constants be expanded by $\mathsf P$. Further, let the set of CL-terms [built from the constants into] $\{ \mathsf S, \mathsf K, \mathsf P \}$. The set of theorems comprises equalities (in the extended language) provable with the combinatory axioms restricted to $\mathsf S$ and $\mathsf K$, and assertions obtainable from the assertions (A1)–(A3) by rules (R1)–(R2).

(A1) $\mathsf PM(\mathsf PNM)$ [compare with the propositional axiom : $M \to (N \to M)$]

(A2) $\mathsf P(\mathsf PM(\mathsf PNR))(\mathsf P(\mathsf PMN)(\mathsf PMR))$ [compare with : $(M \to (N \to R)) \to ((M \to N) \to (M \to R))$]

(A3) $\mathsf P(\mathsf P(\mathsf PMN)M)M$ [Peirce's axiom : $((M \to N) \to M) \to M$]

(R1) $\mathsf PMN$ and $M$ imply $N$ [i.e. detachment]

(R2) $M$ and $M = N$ imply $N$.

This system is called the minimal illative combinatory logic. The axioms together with (R1) are equivalent — with $M$ and $N $thought of as formulas rather than CL-terms — to the implicational fragment of classical logic.

The "implication" is characterized by the following theorems [Bimbò, page 223]:

$\mathsf PMM$ : self-implication [i.e. $M \to M$]

$\mathsf P(\mathsf PM(\mathsf PMN))(\mathsf PMN)$ : contraction [i.e. $(M \to (M \to N)) \to (M \to N)$].

In addition, we need [Bimbò, page 12 and page 47]

the fixed point combinator, often denoted by $\mathsf Y$. The axiom for $\mathsf Y$ is $\mathsf Yx \rhd x(\mathsf Yx)$.

Now for the proof of Curry’s original paradox [Bimbò, page 224]:

Let the meta-term $M$ stand for $\mathsf C \mathsf PN$ [...] $\mathsf C \mathsf PNx \rhd_w \mathsf PxN$, and so $\mathsf C \mathsf PN(\mathsf Y(\mathsf C \mathsf PN)) \rhd_w \mathsf P(\mathsf Y(\mathsf C \mathsf PN))N$. Of course, $\mathsf Y(\mathsf C \mathsf PN) \rhd_w \mathsf C \mathsf PN(\mathsf Y(\mathsf C \mathsf PN))$. That is, using $M$, we have that $\mathsf YM = \mathsf P(\mathsf YM)N$. Again we construct a proof (with some equations restated and inserted) to show that $N$ is a theorem.

  1. $\mathsf P(\mathsf P(\mathsf YM)(\mathsf P(\mathsf YM)N))(\mathsf P(\mathsf YM)N)$ [contraction]

  2. $\mathsf YM = \mathsf P(\mathsf YM)N$ [provable equation above]

  3. $\mathsf P(\mathsf P(\mathsf YM)(\mathsf YM))(\mathsf P(\mathsf YM)N)$ [by (R2) from 1. and 2.]

  4. $\mathsf P(\mathsf YM)(\mathsf YM)$ [self-implication]

  5. $\mathsf P(\mathsf YM)N$ [by (R1) from 3. and 4.]

  6. $\mathsf YM$ [by (R2) from 5. and 2.]

  1. $N$ [by (R1) from 5. and 6.]

Curry's original exposition [Curry, Feys & Craig, page 4] starts from :

the paradox of Russell. This may be formulated as follows: Let $F(f)$ be the property of properties $f$defined by the equation

(1) $F(f) = \lnot f(f)$,

where "$\lnot$" is the symbol for negation. Then, on substituting $F$ for $f$, we have :

(2) $F(F) = \lnot F(F)$.

If, now, we say that $F(F)$ is a proposition, where a proposition is something which is either true or false, then we have a contradiction at once. But it is an essential step in this argument that $F(F)$ should be a proposition. [...] The usual explanations of this paradox are to the effect that $F$, or at any rate $F(F)$, is "meaningless". Thus, in the Principia Mathematica the formation of $f(f)$ is excluded by the theory of types.

Following this analysis, Curry "manufactured" the paradoxical combinator $\mathsf Y$ [CFC, page 177] :

Let $Neg$ represent negation. Then the definition (1) becomes

$Ff=Neg(ff) = \mathsf B Neg ff = \mathsf W(\mathsf B Neg)f$.

Thus the definition (1) could be made in the form

(3) $F \equiv \mathsf W(\mathsf BNeg)$.

This $F$ really has the paradoxical property. For we have [that] $FF$ reduces to its own negation.

The definition (3) can be generalized to :

$\mathsf Y = \mathsf W \mathsf S(\mathsf B \mathsf W \mathsf B)$.

We have seen that axioms (A1)-(A3) and the two rules define the implicational fragment of classical logic. We have seen also that if $Neg$ stands for negation, then $\mathsf YNeg$ is equal to its own negative.

[CFC, page 258] it is clear that we cannot ascribe to $\mathsf Y Neg$ properties characteristic of propositions; and that we can avoid the paradox by formulating the category of propositions in such a way that $\mathsf YN$ is excluded from it.



Note. Here is a list of combinators, given by their axioms [Bimbò, page 6]:

$\mathsf I x \rhd x$ : identity combinator

$\mathsf M x \rhd xx$ : duplicator

$\mathsf K xy \rhd x$ : cancellator

$\mathsf W xy \rhd xyy$ : duplicator

$\mathsf B xyz \rhd x(yz)$ : associator

$\mathsf C xyz \rhd xzy$ : permutator

$\mathsf S xyz \rhd xz(yz)$

The other basic definitions are that of one-step reduction and of weak reduction (denoted by $\rhd_w$), i.e. the reflexive transitive closure of the one-step reduction relation.