Permutation Algorithms

154 Views Asked by At

Take a truly random binary string of say $64$ bits , all the ones and zeros are considered IID variables. Now perform the Fisher-Yates shuffle on this string to rearrange the bits. Are the bits still considered to be IID variables after shuffling?

1

There are 1 best solutions below

7
On BEST ANSWER

$\DeclareMathOperator{\be}{Be}\DeclareMathOperator{\bi}{Bi}$

At first glance I would have said they are, because Fisher-Yates algorithm doesn't care for the values, only their positional index. Of course, this isn't a proof, just intuition.

But then I thought: if I know the original string, then the individual bits of the shuffled one can't be independent of each other. In the same way, if you draw something from a hat without replacement, you're bounded by what's left in the hat.

So I did the math. Let's consider the individual $n$ (64) bits, their values is represented by a random variable $X_i\sim\be(p)$[Bernoulli], where $p$ is the probability of $X_i=1$, in this case $p=0.5$.

The total number of 1s in the string, $S=\sum_0^n X_i$, is also a random variable $S\sim\bi(n,p)$[Binomial]. In particular we have $$ P(S=k)=\binom{n}{k}p^k(1-p)^{n-k}\qquad 0\le k\le n $$

Now let's consider the individual bits of the shuffled string and let's call their values $Y_i$. If we knew the value of $S$, we could easily calculate the probability of $Y_i=1$. That's because Fisher-Yates yields an unbiased shuffle, every sequence is equally likely, so we just divide the number of favorable cases by the number of possible cases $$ P\bigl(Y_i=1\mid S=k\bigr)=\frac{\binom{n-1}{k-1}}{\binom{n}{k}}=\frac{k}{n} $$

Knowing the conditional probability $P(Y_i=1\mid S=k)$ we can find \begin{align} P(Y_i=1)&=\sum_{k=0}^n P\bigl(Y_i=1\mid S=k\bigr) P(S=k)=\sum_{k=0}^n\frac{k}{n}\cdot P(S=k)=\sum_{k=1}^n\frac{k}{n}\cdot P(S=k)\\ &=\sum_{k=1}^n\frac{n}{k}\binom{n}{k}p^k(1-p)^{n-k}=\sum_{k=1}^n\frac{\binom{n-1}{k-1}}{\binom{n}{k}}\binom{n}{k}p^k(1-p)^{n-k}\\ &=\sum_{k=1}^n\binom{n-1}{k-1}p^k(1-p)^{n-k} \end{align}

Applying index substitution $k=j+1$ \begin{align} P(Y_i=1)&=\sum_{j=0}^{n-1}\binom{n-1}{j}p^{j+1}(1-p)^{n-j-1}=p\sum_{j=0}^{n-1}\binom{n-1}{j}p^j(1-p)^{n-1-j}\\ &=p\sum_{j=0}^{m}\binom{m}{j}p^j(1-p)^{m-j}=p\bigl(p+(1-p)\bigr)^m=p \end{align}

Therefore $Y_i\sim\be(p)$. Not only bits in the shuffled sequence are IID random variables, but they have the same distribution of the original ones.