Existence of thin bases

204 Views Asked by At

Suppose $A$ is a finite alphabet. Let’s call a formal language $L$ over $A$ a base of order k iff $|A^* \setminus L^k| < \infty$.

The following statement is true:

If $|A| \geq 2$ and $L$ is a base of order $k$ then $|L \cap \bigcup_{i = 0}^n A^i| \geq |A|^{\frac{n}{k} + O(1)}$ where $n \to \infty$.

Proof:

Suppose $c = |A^* \setminus L^k|$. Then for sufficiently large $n$ we have $|L^k \cap \bigcup_{i = 0}^n A^i| = \frac{|A|^{n+1}-1}{|A|-1} - c$. On the other hand, it is not hard to see, that $|L^k \cap \bigcup_{i = 0}^n A^i| \leq |\bigcup_{i = 0}^n A^i \cap L|^k$. Thus $|\bigcup_{i = 0}^n A^i \cap L| \geq (\frac{|A|^{n+1}-1}{|A|-1} - c)^{\frac{1}{k}} = |A|^{\frac{n}{k} + O(1)}$.

Q.E.D.

Now let’s call a base $L$ of order $k$ thin iff $|L \cap \bigcup_{i = 0}^n A^i| = |A|^{\frac{n}{k} + O(1)}$.

Is it always true that for any finite $A$ with $|A| \geq 2$ and for any $k \in \mathbb{N}$ there exists a thin base of order $k$?

1

There are 1 best solutions below

0
On BEST ANSWER

There are no thin bases for $|A| \geq 2$, $k \geq 2$.

Let $L \subset A^*$ and define $p(n) = \frac{|L \cap A^n|}{|A^n|}$ for $n \geq 0$.

Lemma: The probability that a random string $s \in A^n$ is in $L^k$ is at most $$\sum_{n_1 + \cdots + n_k = n} p(n_1) \cdots p(n_k).$$

Proof: For a given $s \in A^n$, $s$ is in $L^k$ if and only if we can write $s = s_1 s_2 \cdots s_k$ for some $s_1, s_2, \dots, s_k \in L$. Equivalently, $s$ is in $L^k$ if we can find some $0 \leq n_1, n_2, \dots, n_k \leq n$ with $n_1 + \cdots + n_k = n$ such that when we break the string $s$ into parts of lengths $n_1, \dots, n_k$, yielding respective parts $s_1, \dots, s_k$, then each $s_i$ is in $L$. Now, for random $s$, and a fixed choice of $n_1, \dots, n_k$, the parts of the resulting partition $s_1 s_2 \dots s_k$ are independent and such that $s_i$ is uniformly distributed in $A^{n_i}$, so the probability that this partition has $s_i \in L$ for each $i$ is clearly $p(n_1) \cdots p(n_k)$. The desired upper bound then follows from a union bound.

Proposition: If $L \subset A^*$ is such that $|L \cap A^n| = o(|A|^n n^{-k})$, then $L$ is not a base of order $k$.

Proof: The given condition means that $p(n) = o(n^{-k})$. In particular, there is some $N > 0$ such that $p(n) \leq (2k^k)^{-1} n^{-k} = (2(kn)^k)^{-1}$ for all $n \geq N$. Now consider the probability that a random string $s \in A^{kn}$ is in $L^k$, for $n \geq N$. In any term $p(n_1) \cdots p(n_k)$ from the bound in our lemma, at least one $n_i \geq n$, hence $p(n_i) \leq (2(kn)^k)^{-1}$, and all other $p(n_j) \leq 1$, hence $p(n_1) \cdots p(n_k) \leq (2(kn)^k)^{-1}$. Summing over all terms, since there are at most $(kn)^k$ terms, we have that the probability that $s$ is in $L^k$ is at most $1/2$. This means that $A^{kn} \setminus L^k$ is nonempty for arbitrarily large $n$, so $L$ is not a base of order $k$.

Thus if $|L \cap \bigcup_{i=0}^n A^i| = |A|^{n/k + O(1)}$, then clearly $|L \cap A^n| = |A|^{n/k + O(1)} = o(|A|^n n^{-k})$, so $L$ cannot be a base of order $k$. This means there are no thin bases of order $k$.

Note: With a bit more work you can strengthen this to show that when $p(n) = o(n^{-1 + (1/k)})$, the bound from the lemma is $o(1)$, so the conclusion still applies.