Expected value for the number of elements in difference of two subset

91 Views Asked by At

I have a set of $n$ integers $X=\{1,...,n\}$. I select $m$ values uniformly with replacement from $X$. Let $A$ denotes this random subset and $|A|$ be the random variable for the number of distinct elements in $A$. Now, I select $m_0=E(|A|)$ values with replacement from $X$. Let $B$ denotes this random subset and $|B|$ be the random variable for the number of distinct elements in $B$. What is the expected value for the number of distinct elements in $B-A$? Is this true that $$E(|B-A|)= E(|B|)\left(1-\frac{E(|A|)}{n}\right)?$$

1

There are 1 best solutions below

0
On

Given $m$, the value of $m_0 = \mathsf{E}(|A|)$ is also fixed. For each integer $i$ in $X$, define a random variable $Y_i$ such that $$ Y_i = \begin{cases} 1 & \text{if $i \in B$ and $i \not \in A$} \\ 0 & \text{otherwise} \end{cases} $$ Then $$ |B - A| = \sum_{i= 1}^n Y_i $$ and thus $$ \mathsf{E}(|B - A|) = \sum_{i=1}^n \mathsf{E}(Y_i) $$ For each $i$, it is not in $A$ with probability $\left(\frac{n-1}{n}\right)^m$ and is in $B$ with probability $1 - \left(\frac{n-1}{n}\right)^{m_0}$. Thus $$ \mathsf{E}(Y_i) =\color{red}{\left(\frac{n-1}{n}\right)^m}\cdot\color{blue}{\left(1 - \left(\frac{n-1}{n}\right)^{m_0}\right)} = \color{red}{\left(1 - \frac{\mathsf{E}(|A|)}{n}\right)} \cdot \color{blue}{\frac{\mathsf{E}(|B|)}{n}} $$ and $$ \mathsf{E}(|B - A|) = \mathsf{E}(|B|)\cdot \left(1 - \frac{\mathsf{E}(|A|)}{n}\right) $$