I have been trying to figure out the following problem for quite some time but have not succedded. May be I am missing something very obvious.
Let $M$ be and integer and define $[M] = \{1,\cdots,M\}$. Suppose $S\subseteq M$ with $\vert S\vert = \beta M$, for some $0< \beta <1$. Let $Y$ be a random sample of size $t$, where the sampling is done without replacement, meaning for a fixed subset $P$ of $[M]$ of size $t$, $Pr[Y=P] = \frac{1}{\binom{M}{t}}$.
I want to show that for some constant $c<1$ $Pr[\vert Y\cap S\vert \geq c\vert S\vert] \geq \frac{1}{2}$, so that $Y$ contains a constant fraction of points of $S$, with constant probability. (For eaxample if we take, say $2\beta M$ points from $[M]$, the this sample will contain a constant fraction of $S$.) The main point I want to investigate whether the ratio $\frac{t}{\vert S\vert}$ is a constant. Another way to ask the question is what is the sample size required to cover at least a constant fraction of $S$ and whether this size grows linearly with $\vert S\vert$.
My first approach is to directly bound the probability. But before we relax the problem and assume indeed $t = \gamma\beta M$, for some constant $\gamma$ and that we are interested in the event $\vert Y\cap S\vert = c\vert S\vert$. As there are $\binom{\gamma\beta M}{c\beta M}$ possible subset of $Y$ of size $c\beta M$, the required probability is $\frac{\binom{\gamma\beta M}{c\beta M}}{\binom{M}{\gamma\beta M}}$ (assume for simplicity that all the parameters are integers), but I have not been able to simplify the expression, to show that $\gamma$ is indeed a constant, possibly depends on $c$.
My second approach relies on a general result of Hoeffding (see Theorem 4 of Probability Inequalities for Sums of Bounded Random Variables), which roughly says that Chernoff type results also hold for sampling without replacement, if it holds for sampling with replacement. It is enough to take a uniform and independent sample $X$ of size $t$, where sampling is done with replacement and analyze the corresponding probability.
Now the issue with this approach is $\mathbb{E}[\vert X\cap S\vert] = \beta t$ and applying Chernoff type bound would imply that most of sample will belong to $S$, (meaning $\vert X\cap S\vert \geq c\vert X\vert$), not that most of the $S$ will belong to $X$ (meaning $\vert X\cap S\vert \geq c\vert S\vert$). I am kind of stuck here. At this point I am not quite sure if $\frac{\binom{\gamma\beta M}{c\beta M}}{\binom{M}{\gamma\beta M}}$ is the correct expression for the desired probability in the first case. My intuition is that, in the second case where $\beta$ is large, each sample will likely to be in $S$, so we need only constant time larger sample to cover at least a constant fraction of $S$. So far I have not been able to prove this.
Thanks
This question is closely related to this question .
Edit: I think the first approach works. I have incorrectly stated that the probability is $\frac{\binom{\gamma\beta M}{c\beta M}}{\binom{M}{\gamma\beta M}}$. In fact $\vert Y\cap S\vert$ is a hypergeometric random variable. Standard concentration inequality shows that $t=\frac{M}{2}$ works. In this case with high probability $\vert Y\cap S\vert \geq \frac{1}{4}\beta M$ and $\frac{t}{\vert S\vert} = \frac{1}{2\beta}$, a constant.