What's the minimum guaranteed substring match between a binary string and a chosen rotation?

120 Views Asked by At

Given $n$:

  • An adversary chooses a binary string $X$ of length $n$.
  • I choose two distinct rotations of $X$, called $Y$ and $Z$, with the goal of maximizing $m$, the length of the longest prefix shared by $Y$ and $Z$. (By distinct rotations, I mean $Y$ and $Z$ shift $X$ by different amounts.)

Questions:

  1. What's the minimum $m$ I can guarantee, as a function of $n$?
  2. What's the adversary's strategy for choosing $X$ given $n$?

Weak choices of $X$ would be:

  • A string of $n$ $0$s. For any $Y$ and $Z$ I pick, $m=n$.
  • A Christoffel word (eg, $11011010$) that spreads $1$s pseudo-evenly among $0$s. For carefully-chosen $Y$ and $Z$ (eg, $10110101$ and $10110110$), all but the last two bits will match, $m=n-2$.

I'm looking for the strongest choice of $X$, with a corresponding value for $m$. Clearly $m \geq 1$, no matter how $X$ is chosen; I imagine the adversary can prevent $m \geq \log n$.

Note: there is a 1983 biology paper (https://www.pnas.org/doi/epdf/10.1073/pnas.80.18.5660) that claims without proof that the "length of the expected longest" duplicate in a random string is O($\log n$). Here, I am looking for the worst case rather than the expectation, and I am allowing for duplicates that may overlap and may wrap around.

1

There are 1 best solutions below

0
On

If $n$ is a power of $2$, the adversary can do no better than choosing the de Bruijn sequence for $X$. In that case, I can choose $Y=X$ and $Z =$ a one-position rotation of $X$ (in either direction), in which case $Y$ and $Z$ have a shared prefix length of $m= \log_2 n - 1$.