A puzzle involving independence of two random variables

65 Views Asked by At

First, consider sampling an $n$-bit string $x \in \{0, 1\}^{n}$ uniformly at random. Let $x_i$ be the $i^{\text{th}}$ bit of $x$.

Now, generate two output strings.

For each string:

$1$. First, choose a position $m \in \{1, 2, \ldots, n\}$ uniformly at random.

$2$. Then, for some fixed $k = \mathcal{O}(\log n)$, choose $k$ positions uniformly at random excluding the $m^{\text{th}}$ position.

Note that there are ${n-1}\choose{k}$ possible choices of $k$ positions, excluding the $m^{\text{th}}$ position, and by uniformly at random, I mean choosing any position with probability $\frac{1}{{n-1}\choose{k}}$.

$3$. Then, fill the $m^{\text{th}}$ position with a bit obtained by tossing a biased coin (that gives $0$ with probability $1/4$ and $1$ with probability $3/4$).

$4$. Then, fill the $k$ positions, that we have chosen, with a $k$ bit string $\{0, 1\}^{k}$ chosen uniformly at random.

$5$. Finally, fill the remaining $n - k - 1$ positions with the corresponding bits the string $x$ has in those positions.

Let $Y_1, Y_2$ be the two random variables corresponding to the two strings. Note that each of these output strings overlap with $x$ on at least $n- k - 1$ positions.

Here's what I couldn't prove. Are $Y_1$ and $Y_2$ independent? If not, are they at least uncorrelated (that is, the covariance is $0$)? You can assume $n$ to be sufficiently large.

I have an intuition that they must be independent --- since, even if we know one of them, as the procedure is very "random," I don't see any way in which I can get any information about the other.

2

There are 2 best solutions below

5
On BEST ANSWER

These are definitely not independent, and appear likely to be correlated. Let $Y_1 = y_1y_2 \cdots y_n$ and $Z_1 = z_1z_2 \cdots z_n$ be the two strings. If they were independent, we would have $P(y_1 = z_1) = \frac 12$. However, $P(y_1 = x_1)$ is at least $\frac{n-k-1}{n} = 1-\frac{k+1}{n}$ (the probability the first bit isn't chosen in the randomization procedure) and similarly $P(z_1=x_1) \ge 1-\frac{k+1}{n}$. Since the randomization procedure is done independently, the probability the first bit is not chosen either time is $(1-\frac{k+1}{n})^2$ and hence $$P(y_1=z_1) \ge P(y_1=x_1,z_1=x_1) \ge \left(1-\frac{k+1}{n}\right)^2.$$ Since $k = \mathcal O(\log n)$, for $n$ large this tends to $1$ and in particular is greater than $\frac 12$, so we conclude $Y_1$ and $Z_1$ are not independent.

1
On

They don't appear to be independent. If they were independent, then the distribution of the Hamming weight of $Y_1$ conditioned on $Y_2$ would simply be the distribution of the Hamming weight of $Y_1$, which is a binomial distribution with mean $\frac n2$ and variance $\frac n4$; however, under this construction, the Hamming weight of $Y_1$ is constrained to be within $O(\log n)$ of the Hamming weight of $Y_2$, which bounds the variance by $O(\log^2 n)$ even if the mean happens to be correct.