I was watching this video this video "$\infty$-Category Theory for Undergraduates" by Emily Riehl, and was onboard with everything except the path induction principle for identity types (27:00 minutes in) (using Riehl's color convention):
Given any type family [(context) $\color{gray}{\Gamma}$, (terms) $\color{orange}{x},\color{orange}{y}$ (of type) $:\color{red}{A}$, (and a "proof/witness of equality") $\color{orange}{p} : \color{red}{\text{Id}_A(x,y)}$ $\vdash \color{red}{B(x,y,p)}$ (any other type we form given the data on the LHS of $\vdash$)], to produce a term of type $\color{red}{B(x,y,p)}$ it suffices to do so only in the super special case of: $\color{orange}{y}$ is $\color{orange}{x}$, and $\color{orange}{p}$ is $\color{orange}{\text{refl}_x} : \color{red}{\text{Id}_A(x,y)}$
Riehl admits that "it is confusing if you're seeing it for the first time", and compares the situation to students learning proofs by induction for the first time, where "it's very confusing for a while but after you've done ten exercises writing induction proofs you sort of start to get the hang of it". I do not doubt/dispute that it is similar for path induction. However, the one major difference, is that I believed in the axiom/principle of mathematical induction when I first learned it, but I don't believe in the path induction principle now.
In another Riehl talk "How I became seduced by univalent foundations", at 30:00 minutes in, Riehl introduces this path induction principle (also called the identity elimination rule), and says "it's, uh, ..., magic". So I think Riehl recognizes that this may be a "sticking-point", and something that people do not find very intuitive. In the above talks (in particular the "...Undergrads" one) (and also in e.g. Elimination rule for identity types in Martin-Lof Type Theory), this identity elimination rule can be made more intuitive from the homotopy interpretation. As described at HomotopyTypeTheory.org (which refers to the path induction principle as the "J-rule"),
The interpretation of propositional equalities between paths (i.e. of propositional equalities between propositional equalities) is homotopy (continuous deformation). The intuition for why J holds but K does not is that a PathFrom M has one endpoint free, which gives you additional freedom in how you can deform the path. To deform (N , P) into (M , Refl), you move the endpoint N over to M, while deforming the P into reflexivity.
However, this now begs the question of why this homotopy interpretation of some type theory should have any credence to the wider entirety of mathematics. In this MO post "Homotopy Type Theory: What is it?", Mike Shulman addresses the question of why "people are pushing univalence and constructive type theory as a new foundations for mathematics" with an excellent blogpost "From Set Theory to Type Theory". I am totally on board with what he writes in this article. In particular I think the univalence axiom is quite well motivated:
The solution to this problem is the univalence axiom, which can be concisely described as saying that “isomorphic structures are equal”.
In another n-category cafe post by Riehl, another example is raised: $\mathbb N \simeq \{x \in \mathbb Z: x\geq 0\}$ vs. $\mathbb N =\{x \in \mathbb Z: x\geq 0\}$. Anybody with a semester/year of experience in abstract algebra can appreciate this notion of formalizing "isomorphic things are equal", and moreover the idea of working in a "isomorphism-invariant" environment:
In set theoretic foundations, of course, we could already prove that any two natural numbers objects were isomorphic. In material-set theory, all we can conclude from that is that any “isomorphism-invariant property” of one natural numbers object carries over to another. Then we have to do work (or, as is usually the case, leave work to the reader) proving that any particular property we care about is isomorphism-invariant. The situation in structural-set theory is somewhat better: if a structural-set theory is presented carefully enough, then we can only say isomorphism-invariant things. Both of these approaches, however, eventually lead to dependent type theory anyway: it is the way to characterize the isomorphism-invariant properties, and the careful way to present structural set theories. Moreover, in univalent type theory we can prove the statement “all properties are invariant under isomorphism” directly, rather than only observing it as a meta-property of the theory.
So alright, perhaps I can appreciate how homotopy theory (and how it relates to being able to formalize this philosophy behind the univalence axiom "isomorphic things are equal") can be used in the wider mathematical universe. But I am still hung up on the J-rule, which is not mentioned in Shulman's overview/type-theory advertisement. I do not see what the insight of the J-rule specifically is, in the wider mathematical universe.
Question 1: Since Martin-Löf came up with his type theory decades before the homotical interpretation, there surely is some deeper/more universal mathematical idea he was trying to capture with his definitions. So what were his justifications for introducing this J-rule? ...though, I also heard that there are other rules, a K-rule and an I-rule (also by Martin-Lof?) that are no longer "accepted" today, so perhaps I should not look so hard into Martin-Lof's original motivations...
This nlab post presents an alternative/equivalent definition of the J-rule, in terms of (IIa) "Substitution of identifications preserves computations" (Liebniz's "indiscernibility of identicals"), and (IIb) "An identification identifies itself with a self-identification".
I agree with nlab when it says (IIb) is "subtle", but I don't find it "self-evident" at all. What exactly is this (IIb) fundamentally trying to say about different "proofs of identification with $x$"? That they are "homotopic" in some sense? Why is that natural/reasonable? What insight about everyday proofs does this homotopy interpretation reveal/corroborate?
Compare again the situation to the univalence axiom: it corroborates the everyday intuition/experience of mathematicians treating isomorphic things as equal, and the homotopy interpretation is fundamentally saying that the perspective/tools of homotopy and type theory allow us to work in an "isomorphism-invariant" environment, and to formalize the aformentioned intuition.
Perhaps a more concrete phrasing of the question is: from the homoptopy pictures it looks like if we fix $\color{orange}{x},\color{orange}{y}:\color{red}{A}$, the 2 paths/proofs $\color{orange}{p},\color{orange}{q}:\color{red}{\text{Id}_A(x,y)}$ may not be homotopic. But the J-rule is the statement that fixing only $\color{orange}{x}:\color{red}{A}$ and letting $\color{orange}{y}$ roam free, every $\color{orange}{p}:\color{red}{\text{Id}_A(x,y)}$ is homotopic to $\color{orange}{\text{refl}_x}:\color{red}{\text{Id}_A(x,x)}$. What phenomenon is this "2-fixed points" vs. "1-fixed point" trying to capture about these "proofs of equality" $\color{orange}{p}:\color{red}{\text{Id}_A(x,y)}$? It seems reasonable to me that given a type $\color{red}{B(x,y,p)}$ involving $\color{orange}{x},\color{orange}{y}:\color{red}{A}$,$\color{orange}{p}:\color{red}{\text{Id}_A(x,y)}$, maybe $\color{orange}{p}$ is "nice" in some way that $\color{orange}{q}:\color{red}{\text{Id}_A(x,y)}$ isn't, so a term/proof/witness in $\color{red}{B(x,y,p)}$ may not necessarily lead to a term/proof/witness in $\color{red}{B(x,y,q)}$. But why then should a term/proof/witness in $\color{red}{B(x,x, \text{refl}_x)}$, the "simplest/nicest" scenario, necessarily lead to a term/proof/witness in $\color{red}{B(x,y,p)}$? I do not see what philosophy about "identity" these definitions are trying to capture.
P.S. in video educational content about HoTT online, there's this beautiful series "Introduction to Homotopy Type Theory" by Jacob Neumann which hasn't gotten to identity elimination yet, and the HoTTEST summer school lecture 3, which covered it @19:00 in one breath. Since I am more accepting of (IIa) Leibniz equality, I found "⩧ ≅ ≡: Leibniz Equality is Isomorphic to Martin-Löf Identity, Parametrically", but this video did not discuss the J-rule. I really did look everywhere I could.
I see a lot of push from HoTT/category theory people to introduce these concepts early (e.g. to undergrads), trying to reduce technicalities and abstraction. But I think they are missing one ingredient; it is not the technicalities or abstraction that keep me away, but rather my incredulity :( Sorry if my question is too long.
$\newcommand{\J}{\mathsf{J}}$ $\newcommand{\K}{\mathsf{K}}$ $\newcommand{\refl}{\mathsf{refl}}$ $\newcommand{\Id}{\mathsf{Id}}$
The original motivation of the $\J$ rule is essentially the same as motivating any induction rule by thinking about 'standard' elements. For instance, for the natural numbers we are given two ways of building them: zero and successor. Then the standard explanation of induction is that states that every value is built by applying one of these two constructors finitely many times.
The identity type is similar. In this case, one thinks of an inductively generated family of types, rather than a single type. It's most convenient to consider the family to be:
$$\Id_A x : A → \mathsf{Type}$$
Then this family is given a single generator:
$$\refl : \Id_A x\ x$$
which generates a value only in the portion of the family over $x$. So now, the standard explanation is that if we have $e : \Id_A x\ y$, it must actually be $\refl$, and $x$ must be $y$, and this is, allegedly, what the induction principle $J$ is trying to express. So, for some other family $F$, and a value $v : F\ x\ \refl$, we can get a value $F\ y\ e$, because actually $y$ is $x$ and $e$ is $\refl$, so $v$ is such a value.
Of course, the whole point of HoTT is that this is not actually what $\J$ ensures, much like induction on the natural numbers doesn't rule out non-standard models. And it can be fruitful to add principles (like univalence) that are motivated by the 'non-standard' models, rather than the 'standard' one (like $\mathsf{K}$).
Edit: I'm not sure if this helps, but it is potentially another aspect:
Martin-löf was trying to write rules for (constructive) sets (or maybe some kind of realizability underlying them). His rules are (I think) pretty natural from that perspective. The identity type appears to be the definition of a family of types that is trivially inhabited along the diagonal ($\Id_A x\ x$) and empty in the off-diagonal. And the $\J$ rule easily leads to a substitution rule by eliminating into a family that ignores the equality proof.
The remarkable thing is that the rules still make sense when you interpret identity types/proofs in terms of homotopy paths, as long as you leave out later attempts to 'fix' the identity type to actually behave more like set theoretic equality. This is the purpose of the $\mathsf{K}$ rule, for example. By thinking about the 'standard' interpretation of what the $\J$ rule is supposed to mean, you'd probably expect it to entail the $\mathsf{K}$ rule, because, "induction means that only generated values exist," and that is what $\K$ says, too, except specialized to diagonal values. But in fact it doesn't.
This is also part of the philosophy behind teaching things this way, I believe. Part of the 'agenda' is to show that homotopical mathematics is in many ways just as natural as set theoretic mathematics. And one way you can tell this is that someone who was trying to write down a system for set theoretic mathematics ~50 years ago accidentally wrote down rules for homotopical mathematics instead, and you just have to not add rules that force it to be set theoretic.