Lower bound on the probability that a string has a fixed substring

59 Views Asked by At

Let us suppose we have a fixed bit string P of length $n$. We have to prove that there is a constant $k>0$, such that for any $n$, if we uniformly randomly generate a bit string $Q$ of length $k\cdot2^n$, the probability that $Q$ contains $P$ as a continuous substring is greater than 0.5, where the probability of getting a 0 is 0.5 (1 as well consequently).

My first intuition was defining a random variable $X$ as the number of occurrences of $P$ in $Q$ and using that $E[X]=(k\cdot 2^n-n+1)/2^n<k$. Then, we can represent our goal probability as $P[X\geq 1]$. However, using Markov inequality and its reverse results in either an upper bound or a lower bound too loose.

In order to use Chebyshev's bounds, we would have to calculate the variance, which seems impossible to me as it depends on $P$. (Unless one can work with bounds of variance?)

Other things I have tried are:

  • Dividing the positions where the substring could start into independent cases, defining binomial variables, and trying to bring them back together. -> The bound seemed to be too loose.
  • Using Chernoff's bounds on the aforementioned binomial variables.

Another way one could prove the statement would be to show that there are $> 2^{k\cdot 2^n-1}$ bit strings of length $k\cdot 2^n$. Tried using the Inclusion-exclusion principle, but couldn't get anywhere.

I'm trying to find a solution that does not use Markov's chain for example, as we just have to show a lower bound instead of giving the exact probability.