Show that the co-variance between $X_j$ and $X_k$ is $\frac {-pq}{N-1}, j \ne k$


An urn contains $pN$ white and $qN$ black balls, the total number of balls being $N$. Balls are drawn one by one without being returned to the urn until a certain number $n$ of balls has been drawn.

Let $ X_i= \begin{cases} 1&\text{if the $i$th drawn ball is white }\ \\ 0&\text{if the $i$th drawn ball is black}\ \end{cases} $

Show that the co-variance between $X_j$ and $X_k$ is $\dfrac {-pq}{N-1}, j \ne k$

Attempt: The probability mass function of $Y = X_j X_k$ can be computed as:

$\begin{array}{|c|c|c|c|c|} \hline Y=X_jX_k& (1\times1=1) & (1\times0=0) & (0\times1=0) & (0\times0=0) \\ \hline P(Y)& p^2& pq&pq &q^2\\ \hline \end{array}$

The covariance between $X_j$ and $X_k$ is $E [ ~\{X_j - E(X_j) \} \{ X_k-E(X_k)\}~] = E(X_j X_k)-E(X_j)E(X_k) = 1 \times p^2 - [p \times p ] = 0 $

Where could I be going wrong? Is there a conceptual error somewhere?

Thanks a lot for the help


There is 1 best solution below


As you have noted in the comments, it suffices to show that the joint distribution of any pair $(X_i, X_j), i\neq j$, is identical to that of $(X_1, X_2)$. Below I present a conceptually simple way to see this. I'll use $W = pN$ for the number of white balls and $B = qN = N - W$ for the number of black balls.

Let $s = (\underbrace{1, \dots, 1}_{W \textrm{ times}}, \underbrace{0, \dots, 0}_{B \textrm{ times}}).$ Consider sampling $X = (X_1, \dots, X_N)$ without replacement. The draw you get is just a permutation of $s$. Importantly, each of these permutations is equally likely; this follows because each ball remaining in the bag can be picked with equal probability at each step. So, if $x = (x_1, \dots, x_N)$ is a permutation of $s$, then $P( X = x) = \left( \frac{N!}{W!B!}\right)^{-1}.$

But note that if I permute an $x$ as above, I get yet another permutation of $s$. Thus, if $\pi$ is a permutation of $\{1, \dots, N\},$ then $$ P(X_1 = x_1, \dots, X_N = x_N) = P(X_1 = x_{\pi(1)}, \dots, X_N = x_{\pi(N)}). $$

In addition, permutations are bijections. Let $\sigma = \pi^{-1}$. So, in the second equation above, I may perform the change of variables $i \to \sigma(i)$ to get $$ P(X_1 = x_1, \dots, X_N = x_N) = P(X_{\sigma(1)} = x_1, \dots, X_{\sigma(N)} = x_N). $$

But $\pi$ and thus $\sigma$ were arbitrary permutations. So we get that the distribution is invariant under any permutation of the indices of the random variables. Such a sequence of random variables is called exchangeable.

From this, the conclusion is easy to draw. Choose a permutation that sends $i$ to $1$ and $j$ to $2$ and marginalise. In fact the same is true for any set $S \subset [1:N]$: $P(X_S = x) = P(X_{1:|S|} = x)$ for every $x$, where $X_S = (X_i)_{i \in S}$.
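To spell out how exchangeability finishes the problem (and where the original attempt goes wrong: $X_j$ and $X_k$ are not independent, so $P(X_jX_k = 1) \neq p^2$), reduce to the first two draws and compute directly:

$$ E(X_jX_k) = P(X_1 = 1, X_2 = 1) = \frac{W}{N}\cdot\frac{W-1}{N-1} = \frac{pN(pN-1)}{N(N-1)} = \frac{p(pN-1)}{N-1}, $$

so that

$$ \operatorname{Cov}(X_j, X_k) = \frac{p(pN-1)}{N-1} - p^2 = \frac{p^2N - p - p^2N + p^2}{N-1} = \frac{-p(1-p)}{N-1} = \frac{-pq}{N-1}. $$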


Aside: if the above doesn't click immediately, then an even simpler way is to first imagine a case where you have $N$ balls of $N$ different colours. The whole permutation lark should become clear. Then start making some of the colours the same.

Also, while the above is nice, it doesn't properly generalise to all Pólya urns, where you add back $a$ balls of the colour you picked out before drawing again (the above is the case $a = 0$). Please look up Pólya urns; the perfect time to do so is right after first working with sampling without replacement :)
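As a quick sanity check (not part of the original answer), a short Monte Carlo simulation of sampling without replacement can confirm the value $-pq/(N-1)$. The function name and parameters below are illustrative, not from the source:

```python
import random

def empirical_cov(N=10, p=0.4, trials=200_000, j=0, k=1, seed=0):
    """Monte Carlo estimate of Cov(X_j, X_k) under sampling without replacement.

    Assumes pN is an integer; j and k are 0-based draw positions.
    """
    rng = random.Random(seed)
    W = round(p * N)                       # number of white balls
    urn = [1] * W + [0] * (N - W)          # 1 = white, 0 = black
    sum_j = sum_k = sum_jk = 0
    for _ in range(trials):
        # rng.sample draws without replacement, matching the urn model
        draw = rng.sample(urn, max(j, k) + 1)
        xj, xk = draw[j], draw[k]
        sum_j += xj
        sum_k += xk
        sum_jk += xj * xk
    return sum_jk / trials - (sum_j / trials) * (sum_k / trials)

N, p = 10, 0.4
q = 1 - p
print(empirical_cov(N, p))      # empirical estimate
print(-p * q / (N - 1))         # theoretical value -pq/(N-1)
```

For $N = 10$, $p = 0.4$ the theoretical value is $-0.24/9 \approx -0.0267$, and the empirical estimate should land close to it.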