I'm trying to understand the restrictions on Existential Introduction (EI) as defined by the Stanford introduction to logic. Three separate restrictions are mentioned:
- The term being replaced cannot "contain any explicitly quantified variables" (slide 8)
- The term being replaced cannot "contain any universally quantified variables" (slides 8 and 10)
- "It is important to avoid variables that appear in the sentence being quantified" (slide 9)
These restrictions seem contradictory and confusing. Can anyone clarify exactly what is and isn't allowed in EI? Thanks.
EDIT: I understand the individual examples of invalid EI given in the linked article. What I need is to codify the general restrictions formally. In other words, I need a mechanical rule that says when EI can (or cannot) be applied.
There need be no restrictions on the rule they gave depending on how we interpret the notation $\varphi[\tau]$. I couldn't find where they define this notation. It just appears without explanation in Section 6.11 slide 4.
I'll use a slightly different notation, $\varphi[\tau/x]$, which will stand for the capture-avoiding substitution1 of the term $t$ for the variable $x$ in the formula $\varphi$. Capture-avoiding substitution does substitution in a way that respects bound variables. It is a meta-logical operation on formulas and not part of the syntax of formulas themselves. It is also a partial operation that is undefined when we try to substitute a term into a formula that binds free variables in that term. That is, $(\exists x.\varphi)[f(x)/z]$ is undefined because the $\exists x$ binds a free variable in $f(x)$. We say the binder "captures" the $x$ in $f(x)$, hence "capture-avoiding". The rule can be written more formally as $(\exists x.\varphi)[t/z]$ is defined when $z=x$ or $x\notin\mathrm{FV}(t)$ where $\mathrm{FV}(t)$ stands for the set of free variables of the term $t$. In the $z=x$ case, $(\exists x.\varphi)[t/x] = \exists x.\varphi$, otherwise $(\exists x.\varphi)[t/z] = \exists x.\varphi[t/z]$ (assuming the constraint is satisfied). We have the same rules for $\forall$.
Writing the Existential Introduction using this notation gives: $$\dfrac{\varphi[t/x]}{\exists x.\varphi}$$ No side-conditions are required other than $\varphi[t/x]$ has to be defined. This roughly corresponds to their constraint that $t$ doesn't contain any variables bound in $\varphi$. It's actually a little more forgiving since $(\forall x.\forall y.\psi)[f(y)/x]=\forall x.\forall y.\psi$ even though $y$ is bound in $\forall y.\psi$.
Let's see how the examples given play out. They give two examples as warning: $$\exists x.\mathrm{hates}(\mathrm{jill},x)\qquad\text{and}\qquad\forall y.\mathrm{hates}(y,f(y))$$
Starting with the second, we'd have something like $(\forall y.\mathrm{hates}(y,x))[f(y)/x]$, but $y$ gets captured so this is undefined.
For the first formula, we have the premise $(\exists x.\mathrm{hates}(x,x))[\mathrm{jill}/x]=\exists x.\mathrm{hates}(x,x)$ and so the rule works and we validly derive $\exists x.\exists x.\mathrm{hates}(x,x)$ from $\exists x.\mathrm{hates}(x,x)$. Alternatively, we have $(\exists x.\mathrm{hates}(\mathrm{jill},x))[\mathrm{jill}/x]=\exists x.\mathrm{hates}(\mathrm{jill},x)$ which validly infers $\exists x.\exists x.\mathrm{hates}(\mathrm{jill},x)$ from $\exists x.\mathrm{hates}(\mathrm{jill},x)$. If we actually want to do something non-trivial, we'd need to do $(\exists x.\mathrm{hates}(z,x))[\mathrm{jill}/z]=\exists x.\mathrm{hates}(\mathrm{jill},x)$ which allows the valid inference of $\exists z.\exists x.\mathrm{hates}(z,x)$ from $\exists x.\mathrm{hates}(\mathrm{jill},x)$.
There are more principled ways to address this issue, such as using nominal techniques, but they require introducing a lot of machinery which is usually not worth it unless you are proving many meta-theorems about many languages with rich binding structure (as often happens in programming language theory). You can also just eliminate the problem via de Bruijn indices, but these aren't at all human-friendly and substitution is still not completely trivial.
The side-conditions on Existential Elimination are actually more sophisticated and not just a matter of performing substitution correctly. They can be made clean and clear using a different organization of the rules, but I won't go into that here.
1 This link is for capture-avoiding substitution in the lambda calculus, but it is easy to adapt to first-order logic with the quantifiers taking the place of $\lambda$.