I am currently reading through Type Theory and Formal Proof. I see that there is some variation in the literature in the fine details on the discussion on how $\alpha$-equivalence is developed, such as either defining $\lambda$-term substitution first to define $\alpha$-equivalence or vice-versa. Thus, I'll recount the development I am working through, and post my questions at the end.
The author first uses a weaker form of substitution that doesn't account for capture by binding occurrences. Let $V$ denote the set of variables, and $\Lambda$ denote the set of $\lambda$-terms. Then for $x,y \in V$, we can let $\rho_x^y : \Lambda \rightarrow \Lambda$ be the function that takes a $\lambda$-term $m \in \Lambda$ and replaces all free-occurrences of $x$ for $y$. This can be defined recursively as follows:
- For $z \in V$, $\rho_x^y(z) = \begin{cases} z, & x \neq z \\ y & x = z \end{cases}$
- For $m, n \in \Lambda$, $\rho_x^y(mn) = \rho_x^y(m) \rho_x^y(n)$
- And for $z \in V$, $\rho_x^y(\lambda z . m) = \begin{cases} \lambda z. \rho_x^y(m) & x \neq z\\ \lambda x.m & x = z\\ \end{cases}$
Now for defining $\sim_\alpha$ on $\Lambda$, first take the rule that for $x,y \in V$, and $m \in \Lambda$, if $y$ is neither a free variable of $m$, nor has a binding occurrence in $m$, then conclude $\lambda x . m \sim_\alpha \lambda y .\rho_x^y(m)$.
That is, we may start with the set $A$ of all pairs $(\lambda x. m, \lambda y. \rho_x^y(m))$ for which $y$ does not occur free nor binding in $m$. If we let $I$ be the set of all equivalence relations $E$ on $\Lambda$ with $A \subseteq E$ and such that
- (Two-sided Compatibility) For $m, n, l \in \Lambda$, if $(m,n) \in E$ then $(lm, ln) \in E$ and $(ml, n l) \in E$
- (Variable Compatibility) For $m, n \in \Lambda$, $z \in V$, if $(m, n) \in E$, then $(\lambda z. m, \lambda z. n) \in E$.
Then $\sim_\alpha$ is defined to be $\bigcap I$, which can be verified to be an equivalence relation that itself satisfies two-sided and variable compatibility.
Then the author mentions the following equivalences are justified via the compatibility rules:
- $(\lambda x. x(\lambda z. xy))z \sim_{\alpha} (\lambda x. x(\lambda z. xy))z$
- $(\lambda x. x(\lambda z. xy)) z \sim_{\alpha} (\lambda u. u (\lambda z. uy)) z$
- $(\lambda x. x(\lambda z. xy)) z \sim_{\alpha} (\lambda z. z(\lambda x. zy))z$
I'm wondering on how one would justify these equivalences formally. For (1) of course this is by reflexivity. For the second one, I would say that $z \sim_{\alpha} z$ and with $m = x(\lambda z. xy)$, we have $u$ does not occur neither free nor binding in $m$ and $\rho_x^u(m) = u(\lambda z. uy)$. Hence by the base scenario for $\sim_{\alpha}$, followed by compatibility, we get (2) holds.
The third is where I am unsure, and I am focusing on trying to show that $\lambda x.x(\lambda z. xy) \sim_{\alpha} \lambda z. z(\lambda x. zy)$ from which the rest would follow. It seems I should try to first show that $\lambda z. xy \sim_{\alpha} \lambda x . zy$.
This leaves me to my first question. Is it true that $x \sim_{\alpha} y$ for any two variables $x$ and $y$? The clauses don't make this obviously true nor false to me. If it is, it would seem weird since by compatibility, we can say that $xx \sim_{\alpha} xy$ which doesn't seem should be true.
Assuming for now any two variables are indeed $\alpha$-equivalent, I would be able to say then that $xy \sim_{\alpha} zy$. Now, let $u \in V$ be a new variable. We can say that $\lambda u.xy \sim_{\alpha} \lambda u. zy$ by variable compatibility. Since, $z$ does not occur in $xy$ and $\rho_u^z(xy) = xy$, we can say that $\lambda z. xy \sim_{\alpha} \lambda u.xy \sim_{\alpha} \lambda u . zy$. Now, $x$ does not occur in $zy$ and $\rho_{u}^x(zy) = zy$ so $\lambda u. zy \sim_{\alpha} \lambda x. zy$, from which we can conclude that $\lambda z. xy \sim_{\alpha} \lambda x. zy$.
It follows from compatibility once more that $x(\lambda z. xy) \sim_{\alpha} z(\lambda x. zy)$, but to finally conclude what I want, I would need to know how to put $\lambda x$ in front of the first $\lambda$-term, and $\lambda z$ in front of the second.
I am unsure how to do this since even with the trick of introducing new intermediate variables, the free occurrence of the $x$ and the $z$ in the front of the respective terms makes it difficult to switch back.
Can anyone give hints? Any assistance is appreciated.
EDIT: I just read a litter further that distinct variables alone are indeed not $\alpha$-equivalent. It seems that any $\lambda$-term that has no bound or binding occurrences of a variable would only be $\alpha$-equivalent to itself, and I think I have a proof idea for that. I still don't know how to show the above equivalences though.
It seems to me, that with this def. of alpha-equivalence, we can only replace bound variables, but not free ones. We can prove this by induction on $\sim_\alpha$. As you show, it would lead to unxpected results, if it were allowed.
Your proof of equivalence (2) is correct. Now for the third one, $(\lambda x. x(\lambda z. xy)) z \sim_{\alpha} (\lambda z. z(\lambda x. zy))z$. Because the terms $\lambda z. xy$ and $\lambda x. zy$ "differ" in free variables, they are not alpha-equivalent. So this approach fails. But using a "temporary" third variable and transitivity works. We have \begin{align*} \lambda x. x(\lambda z. xy) &\sim_\alpha \rho_x^u(\lambda x. x(\lambda z. xy)) \\ &= \lambda u. u(\lambda z. uy) \\ &\sim_\alpha \lambda u. u(\rho_z^x(\lambda z. uy)) \\ &= \lambda u. u(\lambda x. uy) \\ &\sim_\alpha \rho_u^z(\lambda u. u(\lambda x. uy)) \\ &=\lambda z. z(\lambda x. zy), \end{align*} where I implicitly used the compatibility rules in the second $\sim_\alpha$.