Bits in a coin-toss experiment


This is not homework but an actual problem.

We flip a fair coin ten times, giving outcomes A$_1$ to A$_{10}$. Each toss carries 1 bit, so this is 10 bits. We flip another fair coin ten times, giving B$_1$ to B$_{10}$, so we have another 10 bits.

We calculate C$_n$ = A$_n$ $-$ B$_n$ (that is, C$_1$ = A$_1$ $-$ B$_1$, and so on), giving us 10 outcomes C$_1$ to C$_{10}$.

What is the total number of bits in this "experiment"? Is it 20 or 30? The confusion is whether C adds bits to the experiment, or whether the fact that it is derived from the previous coin tosses means it does not.
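The question can be made concrete on a single toss pair; a minimal Python sketch (encoding each toss outcome as 0/1 is my assumption) comparing the entropy of (A, B) with that of (A, B, C):

```python
import math
from collections import Counter

def entropy(outcomes):
    """Shannon entropy (bits) of a list of equally likely outcomes."""
    counts = Counter(outcomes)
    n = len(outcomes)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

# All four equally likely (A, B) pairs for one toss, with C = A - B attached.
triples = [(a, b, a - b) for a in (0, 1) for b in (0, 1)]

h_ab = entropy([(a, b) for a, b, _ in triples])
h_abc = entropy(triples)
print(h_ab, h_abc)  # 2.0 2.0 -- C is determined by A and B, so recording it adds nothing
```

Per toss the pair (A, B) carries 2 bits, so the twenty tosses carry 20 bits whether or not C is written down alongside them.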

Update: I am trying to work out the number of bits in an experiment I have done that is essentially a coin toss (p = $\frac{1}{2}$). This should illustrate the problem:

I have a subject who says she can correctly guess the outcome of a coin toss, but only in the morning. We do 10 guesses in the morning (10 bits; this is A above) and 10 in the afternoon (10 bits; this is B above). We also calculate a total score, morning minus afternoon (this is C above).

Does calculating the total (C) add bits to the experiment beyond the 20 bits of the morning and afternoon sessions (A and B)?

Best answer:

Based on its entropy, the sequence C contains $15$ bits.

Each C outcome is $-1$ with probability $\frac14$ (if the A and B outcomes are $0$ and $1$ respectively), $+1$ with probability $\frac14$ (if the A and B outcomes are $1$ and $0$ respectively), and $0$ with probability $\frac12$ (if the A and B outcomes coincide).

Hence it corresponds to $H$ bits, where $H$ is the (binary) entropy of a discrete measure with weights $(\frac14,\frac14,\frac12)$, that is, $H=-\left(\frac14\log_2\frac14+\frac14\log_2\frac14+\frac12\log_2\frac12\right)=\frac32$.
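The arithmetic can be checked in a couple of lines of Python (a sketch, using the weights stated above):

```python
import math

# Per-outcome distribution of C from the answer: P(-1) = P(+1) = 1/4, P(0) = 1/2.
weights = [0.25, 0.25, 0.5]
H = -sum(p * math.log2(p) for p in weights)
print(H)  # 1.5
```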

Thus $10$ outcomes of type C correspond to $10\cdot H=15$ bits.

Edit: Since C is obtained from A and B, the information content of C must be less than the combined content of A and B. Indeed A contains $10$ bits, B contains $10$ bits, and A and B are independent, hence together they contain $20$ bits. In the other direction, forgetting the signs in C yields a sequence statistically identical to A or B, hence the information content of C must be greater than that of A, which is $10$ bits.
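Both bounds are easy to check empirically; a Monte Carlo sketch (the 0/1 coding, seed, and sample size are my own choices):

```python
import math
import random
from collections import Counter

def entropy(samples):
    """Empirical Shannon entropy (bits per symbol) of a sample list."""
    counts = Counter(samples)
    n = len(samples)
    return -sum(k / n * math.log2(k / n) for k in counts.values())

random.seed(0)
N = 100_000
# Each C sample is one fair A toss minus one fair B toss.
c = [random.randint(0, 1) - random.randint(0, 1) for _ in range(N)]

print(round(entropy(c), 2))                    # about 1.5: between 1 (A alone) and 2 (A and B)
print(round(entropy([abs(x) for x in c]), 2))  # about 1.0: |C| looks like a fair-coin sequence
```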

Edit-edit: Here is a way to produce sequences of type C of length $L$ which makes apparent why they contain $\frac32 L$ bits of information. Start from a long sequence D of standard $0$-$1$ bits. Keep each of the first $L$ bits of D that is a $0$ as a $0$, and transform each of the first $L$ bits that is a $1$ into $+1$ or $-1$ according to a later bit of the D sequence. Producing a C sequence of length $L$ thus uses roughly $L+\frac12 L$ bits from D. Since D can be reconstructed from C, the information content of a C sequence of length $L$ equals the number of D bits used, that is, $\frac32 L$ bits.
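A sketch of this construction in Python (the lengths, seed, and the choice of reading each sign from the next unused D bit are my own), counting how many D bits a length-L C sequence consumes and checking that the consumed prefix of D is recoverable from C:

```python
import random

random.seed(1)
L = 10_000
d = [random.randint(0, 1) for _ in range(3 * L)]  # more than enough source bits

# Build C: one D bit decides zero/nonzero; when nonzero, the next unused D bit gives the sign.
c, i = [], 0
for _ in range(L):
    bit = d[i]; i += 1
    if bit == 0:
        c.append(0)
    else:
        sign_bit = d[i]; i += 1
        c.append(1 if sign_bit else -1)

print(i / L)  # roughly 1.5 D bits consumed per C symbol

# The consumed prefix of D can be reconstructed from C, so no information is lost.
d_back = []
for x in c:
    if x == 0:
        d_back.append(0)
    else:
        d_back.append(1)
        d_back.append(1 if x == 1 else 0)
assert d_back == d[:i]
```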