I am looking at some code to produce binary cyclic de Bruijn sequences. If $n$ is the length of the subsequence that should not be repeated more than once, it looks for a non-cyclic de Bruijn sequence of length $2^n + n - 1$ and then returns the first $2^n$ bits as the cyclic sequence.
This appears to always give a cyclic de Bruijn sequence but I don't understand why. Why are the last $n-1$ bits of the non-cyclic sequence of length $2^n+n-1$ always the same as the first $n-1$ bits?
Here is the code in Python.
def debruijn(x):
if x.find(x[-n:]) < len(x)-n: # check if last n chars occur earlier in the string
return
if len(x) == N+n-1:
print(x[:N])
return
debruijn(x+"0")
debruijn(x+"1")
n = 4
x = "0"*n
N = 2**n
debruijn(x)
Here is the output for $n=4$.
0000100110101111
0000100111101011
0000101001101111
0000101001111011
[...]
(Originally posted to https://stackoverflow.com/questions/57731516/why-does-this-de-bruijn-code-always-return-0s-for-the-last-few-bits but moved here on request.)
Let $x_1x_2\dots x_n$ be the first $n$ bits of the sequence. In this algorithm, they're actually all initialized to $0$, but that doesn't matter for us.
In the sequence of length $2^n + n - 1$, every $n$-bit substring occurs exactly once: it has $2^n$ substrings and none of them repeat. Specifically, the $n$-bit substrings $0x_1 x_2 \dots x_{n-1}$ and $1x_1 x_2 \dots x_{n-1}$ have to occur somewhere.
If neither of them occurs at the end of the sequence, then next $n$-bit substrings after them (sharing $n-1$ of their bits) are $x_1x_2 \dots x_{n-1} y$ and $x_1 x_2 \dots x_{n-1}z$ for some $y$ and $z$. But then, among the three substrings
two have to be equal, because at least two of the bits $\{x_n, y, z\}$ must be equal. The algorithm cannot produce this: when a repetition is found, it backtracks.
The only remaining option is that one of the substrings $0x_1x_2\dots x_{n-1}$ and $1x_1x_2\dots x_{n-1}$ is found at the very end of the sequence, and nothing comes after it. Therefore the last $n-1$ bits are $x_1 x_2 \dots x_{n-1}$, and the sequence is actually cyclic.
If we go further into the theory, there is a simpler version of the proof above.
A de Bruijn sequence of length $2^n + n - 1$ is an Euler tour in the $(n-1)$-dimensional binary de Bruijn graph. A cyclic de Bruijn sequence of length $2^n$ is a closed Euler tour in this graph.
In a directed graph where all in-degrees are equal to out-degrees, such as this one, all Euler tours are automatically forced to be closed: otherwise, we would leave the starting vertex more times than we entered it, and then we couldn't use up all of its edges.