Bit strings and probability

516 Views Asked by At

Given a bit string of length $n$, I should develop a probabilistic algorithm that answers one of the following questions:

  1. Does the bit string have more zeros than ones?
  2. Does the bit string have more ones than zeros?
  3. Does the number of zeros (/ones respectively) lies between $0.4n$ and $0.6n$?

The probability that the answer is correct should be at least $0.99$. Notice that the algorithm has to answer only one of the three questions for a given bit string, and it does not always have to be the same question it answers. The algorithm should run in $O(1)$.

Generally, my method would be to choose some sample set of $k$ bits of the bit string at random and then approximate the ratio with this sample set, which yields the desired probability when choosing the size accordingly. But since the runtime constraint is that tight, I don't really think that I could proceed with this method. If I could somehow get a bound for the ratio that doesn't depend on $n$, I could get a constant runtime, but I don't really know how to do this. I thought about letting the ratio be $0.5$ which would be the expected ratio for a randomly chosen bit string, but I don't think that this approach would be valid. Any ideas how one could tackle this problem?

Edit: I am not allowed to use the normal distribution, the problem is solvable without using it.

2

There are 2 best solutions below

1
On BEST ANSWER

Taking $N$ random samples among the $n$ bits produces i.i.d. Bernoulli variables $X_1,...,X_N$ with values being the bit and parameter $p$ (the real frequency of ones). The sum $S=X_1+\cdots + X_N$ is then binomial with parameters $(N,p)$ and your estimated frequency is $R=S/N$.

A natural strategy is then to say that you are in case 1,2 or 3 when $R\in[0,0.45)$, $R\in (0,55,1]$ and $R\in [0.45, 0.55]$, respectively. Your conclusion is certainly right if $|R-p|<0.05$ so in order to be right with probability 0.99 it suffices to have $P(|R-p|\geq 0.05) \leq 0.01$

To turn this into a condition on $N$, note that $E(R)=p$ and ${\rm Var}(R) = p(1-p)/N$. Then by Bienaymé-Tchebychev: $$ P(|R-p|\geq 0.05) \leq \frac{p(1-p)/N}{0.05^2}\leq 0.01$$ which is satisfied for any $p$ provided $N\geq \frac14 \times 20^2 \times 100 = 10000$.

6
On

Looking at a signle uniform-randomly chosen bit of the string is a bernoulli distribution with (unknown) parameter $p=\frac{\text{# ones}}{n}$. You can sample this distribution as often as you want, completely independent of $n$ (dont worry about choosing the same bit multiple times, thats not a problem).

The the question then is: how many times do you have to sample a bernoulli distribution in order to determine wheather $p\in[0.4,0.6]$ with $99%%$ confidence?

This is answered for example in this qeustion. (for practical purposes, you probably want to approximate the binomial distribution by a normal one. Working directly with binomial gets somewhat complicated).

In any case it should be clear that nothing depends on $n$, so the runtime of the algorithm is $O(1)$.

(this of course assumes that generating a random number between $1$ and $n$ only takes $O(1)$ time)