Expectation of the max number of times an element is chosen, if repeatedly randomly choosing a subset out of a set

115 Views Asked by At

Given a set of $n$ elements. For a total of $s$ times, you randomly choose exactly $m$ elements among them. (i.e. each $m$-element set has $p=\frac{1}{\binom{n}{m}}$ of being chosen) Here $n\gg m$ and maybe $s\sim\frac{n}{m}$.

If we denote $X_i$ as the number of times the ith element is chosen, I want to know about the maximum among $X_i$, i.e. $X=\max\{X_1,X_2,\cdots,X_n\}$. Specifically, I want to know either $E(X)$ when $s=O\left(\frac{n}{m}\right)$, or an $s$ such that $E(X)=O(1)$.

I don't think there is a pretty answer to this, so I am trying some approximations. I'm not sure if I can safely ignore the covariance between $X_i$ and treat them as iid binomial distributions (i.e. $X_i\sim Binom(s,p)$ where $p=\frac{m}{n}$). But even so I fail to figure out the expectation of $X$. I think there must be a better way to approximate $X_i$ but I do need help on it.

1

There are 1 best solutions below

0
On BEST ANSWER

We can get some traction here by making two unjustified approximations: first, as you say, approximating the $X_i$ as being iid binomial distributions, and second, approximating these binomial distributions as Poisson distributions. This means we want to know the expected maximum of $n$ iid Poisson distributions with parameter $\lambda = \frac{sm}{n}$. This is known and the answer is roughly (see page 3 of the link)

$$M_n \sim \frac{\log n}{W \left( \frac{\log n}{e \lambda} \right)}$$

where $W$ is the Lambert W function. You could run some numerical experiments to see if this is accurate for the range of parameters you're interested in.