Expected number of elements in intersection of two randomly sampled subsets from two different sets

673 Views Asked by At

Let's say you have two sets $A$ with $5000$ elements and $B$ with $8000$ elements, $A ∩ B$ contains $500$ elements. Let's say we randomly sample $500$ elements from set $A$, name it $A'$ and randomly sample $800$ elements for set $B$, name it $B'$ . What is the expected number of elements in the set $A' ∩ B'$ ?

Thinking in the naive way results in an answer of $50$, which is wrong. I am unable to proceed in any particular direction. Any leads would be helpful ?

1

There are 1 best solutions below

0
On BEST ANSWER

Assuming you're sampling with replacement and independently, for any $i \in A \cap B$, let $X_{i}$ be the indicator of the event that $i$ is chosen by both rounds of sampling (that is, $X_{i}=1$ if $i$ is chosen by both, and $X_{i}=0$ otherwise). Then $|A' \cap B'|=X:=\sum_{i \in A \cap B}X_{i}$, and $\mathbb{E}[|A \cap B|]=\mathbb{E}[X]=\sum_{i \in A \cap B}\mathbb{E}[X_{i}].$

You can compute $\mathbb{E}[X_{i}]$ easily as $\frac{500}{5000}\cdot\frac{800}{8000}=\frac{1}{100}.$ Since the $X_{i}$'s are identical, $\mathbb{E}[X]=|A \cap B|\cdot\frac{1}{100}=5.$