Our school's recent final exam included the following problem (translated from the original Chinese):
We select $n$ out of $m$ labelled items (labelled from $1$ to $m$) as the sample ($m>n$). First we determine the sampling interval, which is an integer $k\leq \dfrac{m}{n}$, and then we randomly pick an item labelled $1\leq i_0\leq k$ and choose the sample to be $i_0, i_0+k, \cdots, i_0 +(n-1)k$. According to this procedure, the probability of each item being selected into the sample ( $\quad$ ).
A. depends on $i_0$. $\quad$ B. depends on its label. $\quad$ C. is not necessarily equal to one another. $\quad$ D. is equal to one another.
First of all, the problem is not rigorous as stated: the value of $k$ should be taken as $\left[\dfrac{m}{n}\right]$ (the integer part of $\dfrac{m}{n}$); otherwise the problem is almost meaningless.
The official answer is D, but I suspect that when $n\nmid m$, the items with sufficiently large labels (those beyond $nk$, with $nk\sim m$) have zero probability of being selected. My teacher, however, tells us that it's a recapitulation of systematic sampling and that we should choose D at first glance for that reason alone. My questions are:
- Is the problem (after adding the $k$ condition) really a systematic sampling problem?
- If my reasoning is correct, must we always have $n\mid m$ when using systematic sampling? What if, unfortunately, $m$ is prime? Why use this method at all if it causes so much trouble?
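To make the doubt concrete, here is a minimal Python sketch of the procedure exactly as the problem states it, assuming the start $i_0$ is drawn uniformly from $1,\dots,k$. The function name `inclusion_probabilities` and the example values $m=11$, $n=3$ are illustrative assumptions, not part of the exam problem.

```python
from fractions import Fraction


def inclusion_probabilities(m, n, k):
    """Exact probability that each label 1..m ends up in the sample.

    Assumption: i0 is uniform on 1..k and the sample is
    i0, i0 + k, ..., i0 + (n - 1) * k, as in the problem statement.
    """
    if k < 1 or n * k > m:
        raise ValueError("need 1 <= k <= m / n")
    # counts[i] = number of starting points i0 whose sample contains label i
    counts = [0] * (m + 1)
    for i0 in range(1, k + 1):
        for j in range(n):
            counts[i0 + j * k] += 1
    return {i: Fraction(counts[i], k) for i in range(1, m + 1)}


# Illustrative values (assumed): m = 11 items, n = 3 sampled, k = [m / n] = 3
probs = inclusion_probabilities(m=11, n=3, k=11 // 3)
for label, p in probs.items():
    print(label, p)
```

Under these assumptions, every label from $1$ to $nk$ is hit by exactly one start and so has probability $1/k$, while labels above $nk$ are never reached. With $m=11$, $n=3$, $k=3$, that means labels $10$ and $11$ get probability $0$.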
Thanks in advance for any answer which could dispel my doubts!
Update: Thanks to @lulu and @Jeroen Boschma for the replies. It does seem that some large labels can never be selected. So I propose some further speculation (the previous questions can be considered archived; here are the new ones):
- Is this systematic sampling process meant to reduce the amount of randomness required, by narrowing the range of numbers from which you choose?
- And is there any research, or are there any ideas, on good ways of choosing $k$ that preserve both randomness and simplicity? (My current conjecture is to take $k\approx \sqrt{n}$, so that the random-number-picking part is scaled down by a moderate amount.)
Note: I'm not an expert in probability, and the question may well lead into more complicated territory, e.g., the process of picking random numbers. I'm simply eager for more insight into this seemingly commonplace process.
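As a rough way to compare choices of $k$ under the scheme exactly as stated in the problem, the sketch below tabulates how many distinct samples are possible and how many labels can ever be selected. The helper `summarize_interval` and the values $m=100$, $n=9$ are hypothetical, chosen only for illustration.

```python
import math


def summarize_interval(m, n, k):
    """Summarize the scheme from the problem for one choice of k.

    Assumption: the sample is i0, i0 + k, ..., i0 + (n - 1) * k
    with i0 drawn from 1..k, so only labels 1..n*k can appear.
    """
    if k < 1 or n * k > m:
        raise ValueError("need 1 <= k <= m / n")
    return {
        "k": k,
        "possible_samples": k,           # one sample per start i0 in 1..k
        "reachable_labels": n * k,       # each of 1..n*k is hit by exactly one start
        "unreachable_labels": m - n * k  # labels above n*k are never selected
    }


# Illustrative values (assumed): compare k close to sqrt(n) with k = [m / n]
m, n = 100, 9
for k in (math.isqrt(n), m // n):
    print(summarize_interval(m, n, k))
```

With these assumed values, $k=3$ can reach only $27$ of the $100$ labels, whereas $k=\left[\dfrac{m}{n}\right]=11$ reaches $99$. So, in this scheme at least, a smaller $k$ shrinks the set of items that can be sampled along with the amount of randomness needed.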