Help! I don't believe in the identity elimination rule for Martin-Löf type theory/HoTT!


I was watching the video "$\infty$-Category Theory for Undergraduates" by Emily Riehl, and was on board with everything except the path induction principle for identity types (27:00 minutes in) (using Riehl's color convention):

Given any type family [(context) $\color{gray}{\Gamma}$, (terms) $\color{orange}{x},\color{orange}{y}$ (of type) $:\color{red}{A}$, (and a "proof/witness of equality") $\color{orange}{p} : \color{red}{\text{Id}_A(x,y)}$ $\vdash \color{red}{B(x,y,p)}$ (any other type we form given the data on the LHS of $\vdash$)], to produce a term of type $\color{red}{B(x,y,p)}$ it suffices to do so only in the super special case of: $\color{orange}{y}$ is $\color{orange}{x}$, and $\color{orange}{p}$ is $\color{orange}{\text{refl}_x} : \color{red}{\text{Id}_A(x,y)}$
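For concreteness, here is the rule as I understand it, sketched in Lean 4 (the names `J`, `B`, `d` are mine; Lean's built-in eliminator `Eq.rec` plays this role for its equality type):

```lean
universe u v

-- Path induction (the J-rule): to produce a term of B x y p for every
-- x, y : A and p : x = y, it suffices to handle the special case
-- y := x, p := rfl (Lean's reflexivity witness, Riehl's refl_x).
def J {A : Type u} (B : (x y : A) → x = y → Sort v)
    (d : (x : A) → B x x rfl) :
    (x y : A) → (p : x = y) → B x y p :=
  fun x y p => match p with
  | rfl => d x
```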

Riehl admits that "it is confusing if you're seeing it for the first time", and compares the situation to students learning proofs by induction for the first time, where "it's very confusing for a while but after you've done ten exercises writing induction proofs you sort of start to get the hang of it". I do not doubt/dispute that it is similar for path induction. However, there is one major difference: I believed in the axiom/principle of mathematical induction when I first learned it, but I don't believe in the path induction principle now.

In another Riehl talk, "How I became seduced by univalent foundations", at 30:00 minutes in, Riehl introduces this path induction principle (also called the identity elimination rule), and says "it's, uh, ..., magic". So I think Riehl recognizes that this may be a "sticking point", and something that people do not find very intuitive. In the above talks (in particular the "...Undergrads" one) (and also in e.g. Elimination rule for identity types in Martin-Lof Type Theory), this identity elimination rule is made more intuitive via the homotopy interpretation. As described at HomotopyTypeTheory.org (which refers to the path induction principle as the "J-rule"),

The interpretation of propositional equalities between paths (i.e. of propositional equalities between propositional equalities) is homotopy (continuous deformation). The intuition for why J holds but K does not is that a PathFrom M has one endpoint free, which gives you additional freedom in how you can deform the path. To deform (N , P) into (M , Refl), you move the endpoint N over to M, while deforming the P into reflexivity.


However, this raises the question of why this homotopy interpretation of a particular type theory should carry any weight for mathematics at large. In the MO post "Homotopy Type Theory: What is it?", Mike Shulman addresses the question of why "people are pushing univalence and constructive type theory as a new foundations for mathematics" with an excellent blogpost, "From Set Theory to Type Theory". I am totally on board with what he writes in this article. In particular I think the univalence axiom is quite well motivated:

The solution to this problem is the univalence axiom, which can be concisely described as saying that “isomorphic structures are equal”.

In another n-Category Café post by Riehl, another example is raised: $\mathbb N \simeq \{x \in \mathbb Z: x\geq 0\}$ vs. $\mathbb N =\{x \in \mathbb Z: x\geq 0\}$. Anybody with a semester or year of experience in abstract algebra can appreciate this notion of formalizing "isomorphic things are equal", and moreover the idea of working in an "isomorphism-invariant" environment:

In set theoretic foundations, of course, we could already prove that any two natural numbers objects were isomorphic. In material-set theory, all we can conclude from that is that any “isomorphism-invariant property” of one natural numbers object carries over to another. Then we have to do work (or, as is usually the case, leave work to the reader) proving that any particular property we care about is isomorphism-invariant. The situation in structural-set theory is somewhat better: if a structural-set theory is presented carefully enough, then we can only say isomorphism-invariant things. Both of these approaches, however, eventually lead to dependent type theory anyway: it is the way to characterize the isomorphism-invariant properties, and the careful way to present structural set theories. Moreover, in univalent type theory we can prove the statement “all properties are invariant under isomorphism” directly, rather than only observing it as a meta-property of the theory.

So alright, perhaps I can appreciate how homotopy theory (and how it relates to being able to formalize this philosophy behind the univalence axiom "isomorphic things are equal") can be used in the wider mathematical universe. But I am still hung up on the J-rule, which is not mentioned in Shulman's overview/type-theory advertisement. I do not see what the insight of the J-rule specifically is, in the wider mathematical universe.

Question 1: Since Martin-Löf came up with his type theory decades before the homotopical interpretation, there surely is some deeper/more universal mathematical idea he was trying to capture with his definitions. So what were his justifications for introducing this J-rule? ...though, I have also heard that there are other rules, a K-rule and an I-rule (also by Martin-Löf?), that are no longer "accepted" today, so perhaps I should not look so hard into Martin-Löf's original motivations...


This nLab post, "essential uniqueness of identification certificates", presents an alternative/equivalent definition of the J-rule, in terms of (IIa) "Substitution of identifications preserves computations" (Leibniz's "indiscernibility of identicals") and (IIb) "An identification identifies itself with a self-identification". I agree with the nLab when it says (IIb) is "subtle", but I don't find it "self-evident" at all. What exactly is this (IIb) fundamentally trying to say about different "proofs of identification with $x$"? That they are "homotopic" in some sense? Why is that natural/reasonable? What insight about everyday proofs does this homotopy interpretation reveal/corroborate?

Compare again the situation to the univalence axiom: it corroborates the everyday intuition/experience of mathematicians treating isomorphic things as equal, and the homotopy interpretation is fundamentally saying that the perspective/tools of homotopy and type theory allow us to work in an "isomorphism-invariant" environment, and to formalize the aforementioned intuition.

Perhaps a more concrete phrasing of the question is: from the homotopy pictures it looks like if we fix $\color{orange}{x},\color{orange}{y}:\color{red}{A}$, two paths/proofs $\color{orange}{p},\color{orange}{q}:\color{red}{\text{Id}_A(x,y)}$ may not be homotopic. But the J-rule is the statement that fixing only $\color{orange}{x}:\color{red}{A}$ and letting $\color{orange}{y}$ roam free, every $\color{orange}{p}:\color{red}{\text{Id}_A(x,y)}$ is homotopic to $\color{orange}{\text{refl}_x}:\color{red}{\text{Id}_A(x,x)}$. What phenomenon is this "2 fixed points" vs. "1 fixed point" distinction trying to capture about these "proofs of equality" $\color{orange}{p}:\color{red}{\text{Id}_A(x,y)}$? It seems reasonable to me that given a type $\color{red}{B(x,y,p)}$ involving $\color{orange}{x},\color{orange}{y}:\color{red}{A}$, $\color{orange}{p}:\color{red}{\text{Id}_A(x,y)}$, maybe $\color{orange}{p}$ is "nice" in some way that $\color{orange}{q}:\color{red}{\text{Id}_A(x,y)}$ isn't, so a term/proof/witness in $\color{red}{B(x,y,p)}$ may not necessarily lead to a term/proof/witness in $\color{red}{B(x,y,q)}$. But why then should a term/proof/witness in $\color{red}{B(x,x, \text{refl}_x)}$, the "simplest/nicest" scenario, necessarily lead to a term/proof/witness in $\color{red}{B(x,y,p)}$? I do not see what philosophy about "identity" these definitions are trying to capture.
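The "1 fixed point" form is what is usually called based path induction, which can be sketched in Lean 4 as follows (names `basedJ`, `B`, `d` are mine):

```lean
universe u v

-- Based path induction: the endpoint x is fixed, while the other
-- endpoint y and the path p vary together. A term over (x, refl)
-- yields a term over every (y, p).
def basedJ {A : Type u} (x : A) (B : (y : A) → x = y → Sort v)
    (d : B x rfl) : (y : A) → (p : x = y) → B y p :=
  fun y p => match p with
  | rfl => d
```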


P.S. Among video educational content about HoTT online, there's the beautiful series "Introduction to Homotopy Type Theory" by Jacob Neumann, which hasn't gotten to identity elimination yet, and the HoTTEST summer school lecture 3, which covered it (at 19:00) in one breath. Since I am more accepting of (IIa) Leibniz equality, I found "⩧ ≅ ≡: Leibniz Equality is Isomorphic to Martin-Löf Identity, Parametrically", but this video did not discuss the J-rule. I really did look everywhere I could.

I see a lot of push from HoTT/category theory people to introduce these concepts early (e.g. to undergrads), trying to reduce technicalities and abstraction. But I think they are missing one ingredient; it is not the technicalities or abstraction that keep me away, but rather my incredulity :( Sorry if my question is too long.

There are 3 answers below.

Answer 1 (score 6)

$\newcommand{\J}{\mathsf{J}}$ $\newcommand{\K}{\mathsf{K}}$ $\newcommand{\refl}{\mathsf{refl}}$ $\newcommand{\Id}{\mathsf{Id}}$

The original motivation of the $\J$ rule is essentially the same as the motivation of any induction rule: think about the 'standard' elements. For instance, for the natural numbers we are given two ways of building them: zero and successor. Then the standard explanation of induction states that every value is built by applying one of these two constructors finitely many times.

The identity type is similar. In this case, one thinks of an inductively generated family of types, rather than a single type. It's most convenient to consider the family to be:

$$\Id_A\ x : A \to \mathsf{Type}$$

Then this family is given a single generator:

$$\refl : \Id_A\ x\ x$$

which generates a value only in the portion of the family over $x$. So now, the standard explanation is that if we have $e : \Id_A\ x\ y$, it must actually be $\refl$, and $x$ must be $y$, and this is, allegedly, what the induction principle $\J$ is trying to express. So, for some other family $F$, and a value $v : F\ x\ \refl$, we can get a value of type $F\ y\ e$, because actually $y$ is $x$ and $e$ is $\refl$, so $v$ is such a value.

Of course, the whole point of HoTT is that this is not actually what $\J$ ensures, much like induction on the natural numbers doesn't rule out non-standard models. And it can be fruitful to add principles that are motivated by the 'non-standard' models (like univalence), rather than principles motivated only by the 'standard' one (like $\K$).

Edit: I'm not sure if this helps, but it is potentially another aspect:

Martin-Löf was trying to write rules for (constructive) sets (or maybe some kind of realizability underlying them). His rules are (I think) pretty natural from that perspective. The identity type appears to be the definition of a family of types that is trivially inhabited along the diagonal ($\Id_A\ x\ x$) and empty off the diagonal. And the $\J$ rule easily leads to a substitution rule by eliminating into a family that ignores the equality proof.
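That derivation can be sketched in Lean 4 (the name `subst'` is mine; matching on the equality proof is how Lean packages the J-style elimination):

```lean
-- Substitution derived from path induction: eliminate into a family P
-- that ignores the equality proof itself, i.e. B x y p := P x → P y.
def subst' {A : Type} (P : A → Prop) {x y : A} (p : x = y) (h : P x) : P y :=
  match p with
  | rfl => h
```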

The remarkable thing is that the rules still make sense when you interpret identity types/proofs in terms of homotopy paths, as long as you leave out later attempts to 'fix' the identity type to actually behave more like set-theoretic equality. This is the purpose of the $\K$ rule, for example. By thinking about the 'standard' interpretation of what the $\J$ rule is supposed to mean, you'd probably expect it to entail the $\K$ rule, because "induction means that only generated values exist," and that is what $\K$ says, too, except specialized to diagonal values. But in fact it doesn't.

This is also part of the philosophy behind teaching things this way, I believe. Part of the 'agenda' is to show that homotopical mathematics is in many ways just as natural as set theoretic mathematics. And one way you can tell this is that someone who was trying to write down a system for set theoretic mathematics ~50 years ago accidentally wrote down rules for homotopical mathematics instead, and you just have to not add rules that force it to be set theoretic.

Answer 2 (score 5)

The induction rule for identity types is, I think, quite natural in the context of extensional type theory. See the nLab article on extensional type theory, for instance.

Identity types in the context of intensional type theory have a much more complex structure, but possibly Per Martin-Löf's initial intuition was inherited from the extensional case? A few months ago I searched for the original article where identity types were introduced, to try to get an answer, but this was not successful.

The similarity between the induction rules for natural numbers and identity types is also not very compelling to me. There is of course a similarity, but it remains fuzzy in the details, whereas for other inductive types the induction rules are not just similar: they can really be derived from the same meta-theoretical concept.

Identity types in the context of intensional type theory can be complicated objects. So isn't it normal that the expected, desired behaviour of such complicated objects is not fully obvious at first? It is the result of experience and research (see, for instance, "equality and dependent type theory"), and it becomes natural only when you become more familiar with the objects of the theory.

Answer 3 (score 1)

You need to take a few steps back and think about what equality means logically. The fundamental principle is that identicals are indiscernible. That means:

In a context where $x$ and $y$ are variables of the same type and $\phi$ is a proposition where $y$ does not appear, assuming $x = y$ and $\phi$, we can deduce $\phi$ with some (or none, or all) appearances of $x$ replaced by $y$.

For example, imagine $x$ and $y$ are variables of type $\mathbb{N}$ and $\phi$ is the proposition "$x$ is even". Then the rule says: assuming $x = y$ and $x$ is even, we can deduce $y$ is even. Or, imagine $\phi$ is the proposition "$x + 1 = x + 1$". Then the rule says: assuming $x = y$ and $x + 1 = x + 1$, we can deduce $x + 1 = y + 1$. This rule is so obvious that we almost never think about it, but if you have never even thought about it once before, it is worth pausing now to reflect on it. Or perhaps you could try using this rule to formally prove some basic facts about equality, like symmetry and transitivity.
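The suggested exercise can be carried out in, e.g., Lean 4, where `Eq.subst` is exactly this indiscernibility rule (a sketch; the names `symm'` and `trans'` are mine):

```lean
-- Symmetry: "z = x" holds at z := x by rfl; substitute y for z
-- using p : x = y to get y = x.
theorem symm' {A : Type} {x y : A} (p : x = y) : y = x :=
  Eq.subst (motive := fun z => z = x) p rfl

-- Transitivity: "x = w" holds at w := y by p; substitute z for w
-- using q : y = z to get x = z.
theorem trans' {A : Type} {x y z : A} (p : x = y) (q : y = z) : x = z :=
  Eq.subst (motive := fun w => x = w) q p
```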

It is a bit tricky to formalise replacing only some appearances of a variable in a formula, so we usually do it in reverse when formalising the rule:

In a context where $x$ and $y$ are variables of the same type and $\phi$ is a proposition possibly depending on $x$ and $y$, assuming $x = y$ and $\phi$ with (all appearances of) $y$ replaced by $x$, we can deduce $\phi$.

But this is precisely the proof-irrelevant form of the J rule! There is not really a formal process for "promoting" a proof-irrelevant logical rule to a proof-relevant one, but we could imagine something like this. (This may or may not be how Martin-Löf invented the rule.)

First, we need to introduce a variable $p$ of type $x = y$, because we are working in a propositions-as-types system. Then, we need to allow $p$ (as well as $x$ and $y$) to appear in $\phi$, because we are working in a proof-relevant system. Since $\phi$ is a type now, instead of assuming $\phi$, we are given terms of type $\phi$. Thus:

In a context where $x$ and $y$ are variables of the same type, $p$ is a variable of type $x = y$, and $\phi$ is a type possibly depending on $x, y, p$, given a term $t$ of type $\phi$ with $y$ replaced by $x$ and $p$ replaced by $\textrm{refl}_x$, $J (t; x, y, p)$ is a term of type $\phi$.

We also need a rule to reduce $J (t; x, y, p)$ back to its inputs:

Furthermore, $J (t; x, x, \textrm{refl}_x)$ reduces to $t$.

Voilà: this is the identity induction principle.
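In Lean 4, for instance, a match-based definition of this rule satisfies the computation rule definitionally (a sketch; the name `pathInd` is mine):

```lean
universe u v

-- The J-rule, defined by matching on the equality proof...
def pathInd {A : Type u} (B : (x y : A) → x = y → Sort v)
    (d : (x : A) → B x x rfl) (x y : A) (p : x = y) : B x y p :=
  match p with
  | rfl => d x

-- ...reduces on refl: pathInd B d x x rfl is definitionally d x,
-- so `rfl` proves the computation rule.
example {A : Type u} (B : (x y : A) → x = y → Type v)
    (d : (x : A) → B x x rfl) (x : A) : pathInd B d x x rfl = d x :=
  rfl
```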