About This Question
I'm interested in whether my proof attempt is correct and whether the techniques of constructing a smaller model by unioning in the elements that have to be there and invoking compactness works as a way of proving Löwenheim–Skolem. If they don't work, I'd like to know why not and, ideally, how I'm conceptualizing the problem wrong.
I previously attempted to prove this theorem here seven months ago. The approach in this proof attempt is completely different, so I think it qualifies as a different question. I asked a meta question about trying to prove things multiple times here.
This question contains two proofs of the downward Löwenheim-Skolem theorem. The first uses an explicit construction involving choice. The second part is a proof of the upward and downward direction that assumes compactness over arbitrary cardinals. I am only really using it as a proof of the upward direction, but it is in my opinion more elegant to structure it as an argument for both directions at once. I am much, much less certain that the compactness-based argument works. I can't think of a reason why it wouldn't work, but it feels like pulling a rabbit out of a hat. Also, I get the impression that compactness over arbitrary cardinals is a highly nontrivial result.
The executive summary of the proof attempt is as follows.
- In the downward direction, I pick a subset of the desired cardinality and consider all function-like symbols and expressions with free variables. I close my domain under application of function-like symbols and for all expressions with free variables that have a non-empty set of interpretations, I pick an arbitrary tuple out of the set of interpretations and union in its elements. Basically, I'm trying to construct a smaller model of the desired cardinality by starting off with an arbitrary subset and then seeing how much I need to grow my domain in order to satisfy a hypothetical adversary trying to prove that I picked a bad subset and then show that the elements I needed to add to satisfy my hypothetical adversary did not expand the domain so greatly that my cardinality changed.
- In the upward direction, I am basically claiming that for a Skolemized theory, my theory lacks any models if and only if it is contradictory when all the variables in my axiom are instantiated with pseudovariables (i.e. elements of my desired domain). The resulting expression can be thought of as a huge conjunction of expressions with pseudovariables and no quantifiers. I claim that compactness promises me a finite contradiction and that, via renaming, I can move this contradiction between domains of different cardinalities. Therefore if one of the infinite cardinalities lacks a model, they all do. I imagine that a typical finite contradiction in this setting will make heavy use of the Skolem functions to produce (metaphorically) inconvenient domain elements and make my finite contradiction a reality.
Notation
First a word on notation and vocabulary:
I distinguish function symbols and constant symbols. I nonstandardly refer to function symbols and constant symbols collectively as function-like symbols.
Let $\mathbb{N}_0$ refer to the standard model of the natural numbers including zero, or, equivalently, the finite cardinals.
Let $\mathbb{N}$ refer to the elements of $\mathbb{N}_0$ that are not zero.
I use the term cardinality and size interchangeably.
If $M$ is a model let $M_D$ refer to the domain of $M$.
Countable means countably infinite or finite.
A well-formed formula denoting a truth value is called a statement. A well-formed formula denoting an entity is called a term.
I use the nonstandard notation $[x_1 x_2 \cdots x_n \mathop. \varphi ]$ to endow the free variables in an otherwise open wff with an order. This will be useful later for constructing sets of tuples. For example, $[x y z \mathop. P(x, y) \land Q(y, z)]$ where $xyz$ are variables and $PQ$ are predicate symbols is a construction for producing an open term whose free variables are ordered, which I will nonstandardly call an order-bearing term. I will nonstandardly call the set of tuples of domain elements such that a given order-bearing term is true a satisfying set or somewhat more standardly the set of interpretations.
In addition, I will use the semistandard notation $\pi_1$ to denote projections out of an ordered term. For example, the expression below denotes an arbitrarily chosen element strictly less than $4$. Note that it denotes a single element that is chosen arbitrarily and does not denote a set of interpretations. This choice operator is completely implicit and lives in the background. However, since I am not making consistent choices across equal sets everywhere, I do not need the Axiom of Global Choice.
$$ \pi_1 [x \mathop. x < 4] $$
The notation $b = i(M, w)$ means that the sequence of symbols $w$ has interpretation $b$ in model $M$.
If $\xi$ is a permutation of symbols, let $\varphi[\bigcirc := \xi]$ (read $\varphi \;\;\text{permuted by}\;\; \xi$) nonstandardly refer to substituting each free variable $x$ in $\varphi$ by $\xi(x)$.
Statement of the Theorem
Here is the statement of the theorem from this handout from a course at Dartmouth.
If a countable theory $T$ has an infinite model, then it has models of all infinite cardinalities.
I am deliberately excluding the part of the Löwenheim-Skolem theorem that says that, if we have an infinite model $M$, the smaller models promised to us are elementary substructures of $M$ and the larger models promised to us are elementary extensions. I am focusing on cardinality alone. Wikipedia's article on the Löwenheim-Skolem theorem includes this additional content on structural properties of the models promised to us by the theorem.
Attempted Proof
Attempted Proof of Löwenheim-Skolem theorem
We can separate the Löwenheim-Skolem theorem into three discrete cases, depending on the cardinality of $\kappa$ and $\lambda$, let's call them the downward ($\kappa < \lambda$), equipolent ($\kappa = \lambda$), and upward ($\kappa > \lambda$) cases.
The equipolent case of the Löwenheim-Skolem theorem is trivially true because the model $M$ of size $\kappa$ also has size $\lambda$ when $\kappa = \lambda$.
Let us now consider the downward case.
Downward Löwenheim-Skolem theorem
Suppose we have a first-order theory $T$ and a structure $M$ of infinite cardinality $\kappa$.
Let $\sigma$ be the first-order signature associated with $T$.
Let $\lambda$ be an infinite cardinal strictly smaller than $\kappa$.
I show an explicit construction (explicit up to invocation of the axiom of choice) that produces $M'$. $M'$ is a model of $T$ if and only if $M$ is a model of $T$.
NOTE: I go through the initial step of picking $K_0$ in order to prevent $M'$ from being unexpectedly small through bad luck.
Since $|M_D| = \kappa$, it contains an infinite subset of cardinality $\lambda$. Let $K_0$ be an arbitrary infinite subset of cardinality $\lambda$.
Next, construct a new language $L$ out of the language associated with the model $M$, but additionally with all the elements of $K_0$ as constant symbols.
Next, consider all the order-bearing terms in $L$, call this set $W_0$
Let $W_1$ be the set of all elements of $W_0$ whose set of interpretations is nonempty.
Let $U_0$ be the set consisting of an arbitrarily chosen tuple from each element of $W_1$.
Let $K_1$ be the set of all elements of all tuples in $U_0$. In notation
$$ x \in K_1 \stackrel{\text{def $K_1$}}{\iff} \left(\exists t \in U_0 \mathop. x \in t\right) $$
$\left|K_1\right| \ge \kappa$ because each expression of the form $[x.x=k_0]$ where $x$ is a variable symbol and $k_0$ is an element of $K_0$ gives rise to a distinct element of $K_1$
$\left|K_1\right| \le \kappa$ because each order-bearing wff in $L$ gives rise to zero or one elements in $K_1$ and not all of those elements are distinct. Therefore there can't be more than $\left|L\right|$, which is countably infinite.
Next, we compute the closure of $K_1 \cup K_0$ under all the functions corresponding to function symbols and constant symbols in $M$ and all projections. I will define that concretely as follows.
Let $S_0$ be the union of $K_0$, $K_1$ and the interpretation in $M$ of all constant symbols in $T$.
Let $S_1$ be the result of applying the functions associated with function symbols in $M$ to $S_0$, and closing under the interpretation of expressions of the form $\pi_i [x_1x_2\cdots x_i \cdots x_n \mathop. \varphi]$ where $\varphi$ is a sentence constructed out of variables, predicates, and terms in $S_0$. If $\varphi$ is false in $M$, then we do not pick an element from one of the tuples in its set of interpretations. Considering each projection in turn is equivalent to picking an arbitrary tuple, and unioning in all of its elements.
For example, if $c$ and $d$ are in $S_0$ and $F$ is a binary function symbol with interpretation $f$, then $\left\{ f(c,c), f(c,d), f(d,c), f(d,d), i(M, \pi_1 [x . x=f(c, d)]) \right\}$ is a subset of $S_1$.
Let $S_2$ be the result of applying the functions associated with the function symbols in $M$ and inerpreting projections to $S_0$ or $S_1$.
Let $S_3$ and so on be defined inductively, $S_n$ consists of the intepretation of all the function symbols in $M$ and all the expressions headed by a projection with arguments and subexpressions taken from $\bigcup_{0 \le i < n} S_i$.
Define $S$ as the union of all $S_i$s. In notation
$$ S \stackrel{\text{def}}{:=} \bigcup_{i \in \mathbb{N_0}} S_i $$
We define a new model $M'$ by picking $M'_D$ to be $S$.
Every constant symbol in $M'$ has the same interpretation as it does in $M$.
Every predicate symbol in $M'$ has the interpretation formed by removing only the tuples with elements not in $S$. In notation, if $P$ is a predicate symbol
$$ t \in i(M', P) \iff t \in i(M, P) \land (\forall x \in t \mathop. x \in S) $$
Similarly, let every function symbol have the same interpretation in $M'$ that it does in $M$, but with the tuples containing elements not in $S$ removed. Note that $S$ is closed under function application by construction and therefore the interpretation forms a function. No tuples of elements from $S$ get sent outside $S$ by any function.
Now I will show that $M' \models T$.
Let $\theta$ be a closed wff in prenex normal form. Let $\theta$ be a deductive consequence of $T$. In notation, $T \vdash \theta$.
Suppose $\theta$ has no quantifiers, then the truth of $\theta$ can be determined by consulting the truth tables of the logical connectives and does not depend on the underlying model at all.
Suppose $\theta$ has a leftmost quantifier.
Suppose $\theta$ has a leftmost quantifier and it's $\exists$. If the expression is $\exists x \mathop. \varphi$ then one of the elements of $M_D$ satisyfing the order-bearing term $[x.\varphi]$ is in $M'_D$ and therefore the sentence is true. Suppose $\theta$ has a leftmost quantifier and it's $\forall$. Suppose the outermost variable symbol is $x$, so the term begins $\forall x \mathop. \cdots$. If $M'$ has an element d such that $\varphi[x:=d]$ is true, then $[x\mathop. \varphi]$ would be inhabited in $M'$ and the same element would cause $\forall x . \varphi$ to be false in $M$, which is a contradiction since $\forall x . \varphi$ is true in $M$. Since this fact is true for any wff in prenex normal form, it is also true for general wffs since all wffs in first order logic are equivalent to a wff in prenex normal form if the domain is constrained to be nonempty. The domain is constrained to be nonempty because the domain is constrained to be infinite.
Upward (And Downward) Löwenheim-Skolem Theorem
Let $T$ be a relational theory with no finite models.
Let $T_S$ be a Skolemized theory. $T_S$ is not relational, but its only function-like symbols are Skolem functions. $T$ is satisfiable if and only if $T_S$ is satisfiable.
I nonstandardly say that an $T_S$ structure $M_{11}$ and $M_{12}$ are $T$-equivalent OR predicate-equivalent if the interpretations of their predicate symbols are the same. For the notion of predicate-equivalence the interpretations of constants and function symbols does not matter.
Any model of $T_S$ can be turned into a predicate-equivalent model of $T$ by removing the interpretations of the function-like symbols, all of which are associated with the Skolemization step.
Let $X$ be an infinite set. Let $X'$ be another infinite set that is a superset or a subset.
I want to show that, $T$ has a model with domain $X'$ if and only if $T$ has a model with domain $X$.
Let $\Omega$ be the instantiation of all the axioms of $T_S$. I instantiate an axiom by replacing all the variables with pseudovariables, (elements of $X$).
I claim $T_S$ has no models of domain $X$ if and only if $\Omega$ contains a contradiction.
Suppose $\Omega$ contains a contradiction, then $T_S$ will have no models.
Suppose $T_S$ has no models, then $\Omega$ contains a contradiction. If ${T_S}$ has no models, that means every structure $M$ corresponding to $T_S$ contains an instantiated axiom which fails some axiom of $T_S$. If that is true, then the whole of $\Omega$ is contradictory.
If $\Omega$ contains a contradiction, then it contains a finite contradiction $F$ by compactness over arbitrary cardinals.
Let $\xi$ be a renaming of $X$ and let $\varphi[\bigcirc:=\xi]$ nonstandardly denote renaming all the pseudo-variables taken from $X$ in $\varphi$ using the permutation $\xi$.
We can take $F$ and determine $F[\bigcirc := \xi]$. $F[\bigcirc := \xi]$ is also a finite contradiction within $\Omega$.
If $X'$ is a superset of $X$, then the original $F$ is a contradiction in $X'$.
If $X'$ is a subset of $X$, then we can determine a permutation of $X$ $\psi$ so that all the pseudo-variables that appear in $F$ are sent to $X'$. We can do this because there are finitely many pseudo-variables in $F$ and infinitely many in $X'$. Thus $F[\bigcirc := \psi]$ is a finite contradiction in $X'$ and therefore $X'$ has no models.
For any pairs of infinite cardinals $\kappa$ and $\lambda$, one of the following is true
- $\kappa > \lambda$
- $\kappa = \lambda$
- $\kappa < \lambda$
In all of these cases, $\kappa$ is a (non-strict) superset or subset of $\lambda$, therefore the argument above applies.
Therefore, if one infinite cardinal has no models of $T$, then all infinite cardinals have no models of $T$.
Let $Y$ be an arbitrary first order theory, I define a mapping from $Y$ to an equisatisfiable relational first order theory below.
I map every constant-symbol to a unary predicate symbol.
I map every $n$-ary function symbol $f$ to an $n+1$-ary predicate symbol.
For every newly minted relation symbol, I add the following axioms:
existence of output of function.
$$ \forall x_1 x_2 \cdots x_n \mathop. \exists z \mathop. F(x_1,x_2,\cdots x_n, z) $$
Uniqueness of output of function.
$$ \forall x_1 x_2 \cdots x_n \mathop \forall u v \mathop. F(x_1, x_2, \cdots x_n, u) \land F(x_1, x_2, \cdots x_n, v) \to u = v $$
Since this translation procedure preserves satisfiability, the proof for relational theories generalizes to arbitrary theories.
I additionally remark that any theory can be made into an equisatisfiable-at-every-infinite-cardinality theory with the finite cardinals removed by adding a function symbol and a constant symbol. The constant symbol is not in the image of the function, and the function is an injection. This is directly lifted from the axiomatization of arithmetic, and the rules for $0$ and $\text{succ}$.
This completes the proof of the Löwenheim–Skolem theorem.
I use this second proof for the upward direction, but I think the argument for the downward direction is also valid.