Why does Skolemming not preserve validity?

1.6k Views Asked by At

I'm wondering what exactly is meant when people say "Skolemization preserves satisfiability but not validity". I'm having trouble wrapping my brain around it because I think of Skolemization, when considered as an inference rule, to simply "be an okay thing to do" when working with a second-order formula.

Skolemization is said to preserve satisfiability but not validity. What exactly do satisfiability and validity mean in the context? Does "co-Skolemization" (dual to Skolemization as a second-order equivalence) have similar properties as Skolemization?


Skolemization, as far as I can tell, is based on the equivalence between the first-order formula (and therefore degenerate second-order formula)

$$ \forall x_1 \cdots x_n . \exists y . P(x_1, \cdots, x_n, y) $$

and the following second order formula, that moves the $\exists$ in front of the $\forall$.

$$ \exists f. \forall x_1 \cdots x_n . P(x_1, \cdots, x_n, f(x_1, \cdots, x_n))$$

It makes perfect sense when the expression is thought of as a game. In the first expression, the opponent first makes $n$ moves and then the player makes 1. In the second expression, the player moves first but commits to a response function of sorts rather than committing to a single move. That's not a proof, but it seems like good intuitive justification for why rewriting an expression like this is okay in the first place.

It seems to me that performing this translation at all only makes sense if we're talking about a closed expression. That is, $P(\cdots)$ must have no free variables or, equivalently, $P$ must be an $n+1$-place predicate (primitive or defined).


In propositional logic, satisfiability and validity can be though of as describing what happens to free variables. All variables are free because there's no way to bind them.

$ P \lor P $ is satisfiable because the assignment $ \{ P = \top \} $ satisfies it.

$ P \lor \lnot P $ is a tautology for obvious reasons. I don't know whether it's right to say $ P \lor \lnot P $ is valid.

An inference rule that maps $P$ to $P \lor Q$ is valid (disjunction introduction).

A rule that maps $P$ to $P \land Q$ preserves satisfiability. We can take any old assignment $\mu$ and construct $\mu \cup \{ Q = \top \}$ which satisfies the new expression.

In the context of Skolemization, are the meanings of satisfiability and validity similar?


Let "co-Skolemization" (probably has a real name) be the translation from

$$ \exists x_1 \cdots x_n . \forall y . P(x_1, \cdots, x_n, y) $$

to

$$ \forall f . \exists x_1 \cdots x_n . P(x_1, \cdots, x_n, f(x_1, \cdots, x_n)) $$

That seems just as legitimate as Skolemization, considered as an inference rule in a second-order setting. There's a caveat here that in a first-order theory definitions are provided for all fully saturated combinations of function symbols and arguments. Therefore "co-Skolemization" definitely takes us out of first-order land, but I don't know how relevant that is.

4

There are 4 best solutions below

0
On BEST ANSWER

A sentence is valid if it is true in every interpretation of its logical language.

A sentence is satisfiable if it is true in some interpetation of its logical language.

Since Skolemization adds new symbols to the logical language, I think it is useful to introduce concepts of relative validity and satisfiability:

Suppose $\mathcal L\subseteq \mathcal L'$ are logical languages, and $\mathfrak A$ is an interpretation of $\mathcal L$. Then,

  • An $\mathcal L'$-sentence is valid relative to $\mathfrak A$ if it is true in every extension of $\mathfrak A$ to an interpretation of $\mathcal L'$.
  • An $\mathcal L'$-sentence is satisfiable relative to $\mathfrak A$ if it is true in some extension of $\mathfrak A$ to an interpretation of $\mathcal L'$.

(Beware that "relative" validity and satisfiability is probably not standard terminology; I made it up for the purpose of this explanation).

Now, when we Skolemize an $\mathcal L$-sentence $\varphi$, we get an extended language $\mathcal L'\supseteq \mathcal L$ and an $\mathcal L'$-sentence $\varphi'$. This trades truth for satisfiability in the sense that

For every interpretation $\mathfrak A$ of $\mathcal L$ it holds that $\varphi$ is true in $\mathfrak A$ if and only if $\varphi'$ is satisfiable relative to $\mathfrak A$.

In other words, if $\varphi$ is true in $\mathfrak A$, then we can find some interpretation of the Skolem functions that makes $\varphi'$ true, and vice versa. But we have no guarantee that every possible interpretation of the Skolem functions will preserve the truth of $\varphi$ -- indeed, usually they won't (as Bram28's counterexample shows).

This means that if we just have $\varphi$ and is not looking at a particular $\mathfrak A$, the Skolemization preserves validity:

  1. $\varphi$ is satisfiable $\iff$
  2. There is some interpretation of $\mathcal L$ where $\varphi$ is true $\iff$
  3. There is some interpretation of $\mathcal L$ which extends to some interpretation of $\mathcal L'$ where $\varphi'$ is true $\iff$
  4. There is some interpretation of $\mathcal L'$ where $\varphi'$ is true $\iff$
  5. $\varphi'$ is satisfiable.

If we try to replicate this argument with "valid" instead of "satisfiable", we get stuck after step 3:

  1. Every interpretation of $\mathcal L$ extends to some interpretation of $\mathcal L'$ where $\varphi'$ is true.

Whereas the original (3) had two "some" that we could collapse to one in (4), here we have a combination of "every" and "some". And that cannot be collapsed to a simple statement about interpretations of $\mathcal L'$ and $\varphi'$ -- in particular it is not equivalent to $\varphi'$ being valid.

0
On

When you Skolemize, you drop the existential quantifier in front of $f$. Every structure that is a model of the original formula can be extended by providing an interpretation of $f$. Such an extension may or may not be a model of the Skolemized formula. Even though the original formula may be valid, the Skolemized formula doesn't have to be. What is guaranteed is that the two formulae are equisatisfiable.

7
On

When the existentials that are being removed in the process of Skolemization are not preceded y universals, you simply use a new constant for the respective variables.

As such, consider the formula:

$$\exists x (P(a) \lor \neg P(x))$$

This is a valid formula, since in any domain there is always an object that will satisfy the formula $P(a) \lor \neg P(x)$, namely whatever object $a$ refers to.

However, if we Skolemize, we get:

$$P(a) \lor \neg P(b)$$

which is not a valid formula, since there is a domain and interpretation where this statement is false: make sure that $a$ and $b$ refer to different objects, and where the object that $a$ refers to does not have the property expressed by $P$, while the object referred to by $b$ does have that property.

0
On

Skolemization for first order formulas merely preserves satisfiability because, in general, the wff will not be true in every interpretation of the expanded language with new non-logical symbols.


Two and a half years ago when I wrote this, I did not really understand what a model was. That's also why this questions makes no reference at all to the semantics of first and second-order logic. The question does talk about the semantics of propositional logic, though, but it's not connected to the rest of the question. At the time, I thought of closed statements in higher order logic as inherently true or false in some kind of absolute sense, so naturally the way to handle free variables was to move to the higher-order setting and add quantifiers to bind all the free variables.

The reason why Skolemization doesn't preserve validity in general is because it introduces new, nonlogical symbols into the language. These are functions, constant symbols, and relations. If I skolemize $ \exists x \mathop. \phi(x) $ and get $\phi(c)$ where $c$ is a fresh constant, then it is not the case that an arbitrary value of $c$ will preserve the truth of the statement.

Additionally, Skolemization is not thought of as taking a first order formula with some non-logical symbols, translating it into an equivalent second-order formula by adding explicit quantifiers for the non-logical symbols and making them bound variables, transforming the second order formula in a truth-preserving way, and then converting it back to a first order formula.

Although, if you do view it though that lens and do translate the first-order sentences to their corresponding second-order sentences without non-logical symbols, (first order) Skolemization preserving validity would imply the following, which is not true.

$$ \textbf{WRONG:}\;\; \forall x_1 \cdots x_n \mathop. \exists y \mathop. \phi(x_1, \cdots, x_n, y) \;\; \text{implies} \;\; {\huge\forall} f \mathop. \forall x_1 \cdots x_n \mathop.\phi(x_1,\cdots,x_n, f) $$

Rather, what we do have is the following:

$$ \forall x_1 \cdots x_n \mathop. \exists y \mathop. \phi(x_1, \cdots, x_n, y) \;\; \text{implies} \;\; \exists f \mathop. \forall x_1 \cdots x_n \mathop.\phi(x_1,\cdots,x_n, f) $$

The left-hand side is presumed to be valid because that's what it means to take it as a premise and $\phi$ might have free variables. The right hand side, however, potentially has free variables in $\text{FV}(\varphi) \setminus \{f\}$. Those free variables are implicitly universally quantified because the implication as a whole is valid in second order logic, but $f$ is explicitly existentially quantified. Thus Skolemization preserves satisfiability (assuming that the domain is non-empty). We have to give up the implicit universal quantification for the variables in $\text{FV}(\varphi) \setminus \{f\}$ and downgrade it to implicit existential quantification.