What is the expected value of this random variable?

108 Views Asked by At

Suppose, you choose $m$ times, a subset of size $k$ elements (disregarding their order) from a set of $n$ elements. The elements are integer numbers.

Considering the random variable $X$ as the cardinality of the intersection of any pair of the m subsets, what is the expected value of $X$?

I have not been able to determine it analytically, but I have done some quick simulations in Python and it's value seems to converge to $\frac{k^2}{n}$ as $m$ increases.

1

There are 1 best solutions below

0
On BEST ANSWER

Choosing $m$ subsets isn't really relevant in your question, which seems to be asking what is the expected size of intersection of two subsets of size $k$.

So let's say you have picked a subset $A$ of size $k$. To pick a second subset of size $k$, let's use the method of choosing $k$ elements sequentially. Let $X_i$ be $1$ if the $i$th element chosen is also in $A$ and $0$ otherwise (for $i=1,2,...,k$). Then $X=\sum(X_i)$ is the number of elements in the intersection of the two chosen sets. So $E[X]=\sum(E[X_i])=\sum \frac{k}{n}=\frac{k^2}{n}$.

So your result via simulation is correct. It should not depend on sizes of $k$ and $n$, provided of course $k\le n$.