Proving that no term is an initial segment of another term.

432 Views Asked by At

The Question:

If $s$ and $t$ are strings, we say that $s$ is an initial segment of $t$ if there is a nonempty string $u$ such that $t = su$, where $su$ is the string $s$ followed by the string $u$. For example, $KUMQ$ is an initial segment of $KUMQUAT$ and $+24$ is an initial segment of $+24u - v$.

The task at hand is to prove, by induction on the complexity of $s$, that if $s$ and $t$ are terms, then $s$ is not an initial segment of $t$.

[Suggestion: The base case, when $s$ is either a variable or a constant symbol, should be easy. Then suppose that $s$ is an initial segment of $t$ and $s := f t_1 t_2 t_3\dots t_n$, where you know that each $t_i$ is not an initial segment of any other term. Look for a contradiction.]


What I have so far:

For bookkeeping, let us first use a definition for terms. A term of $L$ is a nonempty finite string $t$ of symbols from $L$ such that either:

  1. $t$ is a variable
  2. $t$ is a constant symbol, or
  3. $t := f t_1 t_2 t_3\dots t_n$, where $f$ is an n-ary function symbol of $L$ and each of the $t_i$ terms is a term of $L$. This is a recursive definition.

Base Case:

We begin our inductive proof with the base case, as you would expect. What this looks like, is actually, considering the case where $s$ is a variable or a constant symbol.

  • Suppose that $s$ is a variable. Then, $t := su$, where $u$ is non-empty, can only mean that $t$ is a variable term followed by some other sequence $\in L$. However, since there is no function symbol preceding $s$, this is not possible, since $t$ too is a term. Therefore, if $s$ is a variable, and $t$ is a term, $s$ can not be an initial segment of $t$.

  • The case for when $s$ is a constant symbol is identical to the case above.

Inductive Hypothesis:

Let the statement of theorem we are trying to prove be that if $s$ and $t$ are terms, then $s$ is not an initial segment of $t$. We define the induction property $P$ such that $P$ holds for term $s$ if $\forall i$, $s$ is not an initial segment of $t_i$ (where $t_i$ is a term) i.e. including a non-empty string $u$, $t_i := su$ is not possible, and conversely, $t_i$ is not an initial segment in $s$.

Inductive Step:

The inductive step of a proof by induction on complexity of a term takes the following form: Assume that $s$ is a term by virtue of clause (3) in the definition of terms above. Also assume that the statement of the theorem is true when applied to the terms $t_1, t_2, \dots, t_n$. Suppose, towards contradiction, that $s$ takes the form $f t_1, t_2, \dots, t_n u$ and is indeed an initial segment for $t$. This implies that $t$ must, by definition of an initial segment, must begin with $f t'_1, t'_2, \dots, t'_n$.

Since we said that:

$$t = su$$

It must be that:

$$f t'_1, t'_2, \dots, t'_n = f t_1, t_2, \dots, t_n u$$

We can then get rid of the $n$-ary function $f$ to achieve the following:

$$t'_1, t'_2, \dots, t'_n = t_1, t_2, \dots, t_n u$$


How to proceed..?

I got some leads from this answer while I was writing this post, but it seems like my answer is missing something and the aforesaid didn't really affirm my understanding or confidence in the correctness of my answer so far. Moreover, I'm looking for a bit more intuitive, and dare I say, simple-worded explanation of how to carry the aforesaid induction forward.

2

There are 2 best solutions below

1
On BEST ANSWER

Very very long comment

Yes, KUMQ is an initial segment of KUMQUAT.

Regarding the arithmetical examples, you have to use PREFIX notation, i.e. you have to compare $+24u$ with $−+24uv$. Both + and − are binary: $+(24,u)$ and $−(+(24,u),v)$.

The idea is quite simple... but the formal proof seems complex than it really is.

Consider a simple example regarding the language of arithmetic with two binary function symbols: $+,×$, to be used prefix form.

Having done the base cases with variables and constants, i.e. single-symbol terms, we have to consider the "complex" ones: $fx_1,…,x_n$.

Consider three examples:

$t_1 :=+x,y$ and $t_2 :=×x,y$ and $t_3 :=+x,z$.

What happens when we "compare" $t_1$ and $t_2$? (the case when one of the terms does not have an initial function symbol gives us an immediate answer: the corresponding term is ill-formed).

We start comparing the two function symbols: $+$ vs $×$. They are different, so the answer is immediate: neither $t_1$ is a proper initial segment (an initial substring) of $t_2$ nor $t_2$ is of $t_1$. End of games.

Consider now $t_1$ and $t_3$: the function symbols are the same; thus, we can move on. The first pair of terms: $x$ and $x$, are the same; thus we move on. The last pair is: $y$ and $z$. Now they are different, thus we can conclude that no one is a proper... of the other.

What is left: the hypothetical case: $t_1:=+x,y$ and $t_4:=+x,y,u$ with $u$ nonempty.

We will prove that this case cannot happen, unless we infringe the rules of syntax. In this case, when we have checked the pair $y$ and $y$ we have verified that $t_1$ actually is a proper initial segment of $t_4$, because up to now the two strings are identical and $t_1$ is shorter than $t_4$.

But we have also the condition that $t_1$ and $t_4$ are terms, and here we need the syntax rules. term $t_4$ starts with $+$, which is binary, and in $t_4$ we have three arguments following $+$.

Thus, either $t_4$ is not a well-formed term, or it is not possible that $u$ is notempty.

But the assumption is that $t_4$ is a term and thus (if $P \lor Q$ and $\lnot P$; therefore $Q$) it is not possible that $u$ is notempty, i.e. $u$ must be empty.

2
On

Found this proof by author Kunen in ... don't know at the moment, but will try to find out later.


Induction on length of term $t$. We prove simultaneously:

  • No proper initial segment of a term is a term

  • If $t$ has first symbol $f$ of arity $n$ then unique terms $t_1,\dots,t_n$ exist with $t=ft_1\cdots t_n$.

In this context variables and constants have arity $0$. We assume that both bullets hold for terms that have smaller length than term $t=ft_1\cdots t_n$. Then it is enough to prove that they also hold for $t$.

Let $t'$ be any term which is an initial segment (possibly not proper) of term $t=ft_1\cdots t_n$. Terms are non-empty strings hence $t'$ starts with $f$. So we must have $t'=ft_1'\cdots t_n'$ where every $t_i'$ denotes a term. Then we must have $t_1=t_1'$ because otherwise one would be a proper initial segment of the other (contradicting induction hypothese first bullet). Likewise we can prove $t_i=t_i'$ applying induction on $i$: if $t_j=t_j'$ for every $j<i$ then $t_i$ and $t_i'$ start with the same symbol of $t$ so $t_i=t_i'$ because otherwise one would be a proper initial segment of the other.

Now we conclude that $t=t'$ and have established both bullets to be true for $t$.


Nice illustration of the profit of "proving more than you want at first hand" by induction proofs. Your hypothese is stronger.