Designing a sampling scheme for a weird problem

25 Views Asked by At

Suppose I have a sequence of objects in a line, one by one. Each time, I must draw 5 objects (or any integer less or equal to the total number of objects) next to each other, i.e., a window of size 5.

For some weird reasons, I need the probability of each object being sampled to be roughly similar. Ideally, I hope that the probability is exactly the same for each object, but I have an intuition that this is not doable in general. For example, if the window size is 1 less than the total number of objects, this is clearly not doable.

For an example, let's use uniform sampling to decide on the leftmost object of the size-5 window. However, doing so means that the objects on either ends are sampled less frequently.

Thanks for the help! I've thought about the problem and I'm stuck.

1

There are 1 best solutions below

2
On

This is really an extended comment:

It probably depends on what you mean by "roughly similar". If for example you had $11$ items and were selecting windows of length $3$ so the average probability of selection was $\frac{3}{11}\approx 0.2727$, you could get actual probabilities of $\frac{3}{12}=0.25$ for eight and $\frac4{12}\approx 0.3333$ for three by choosing:

  • $1,2,3$ with probability $\frac{3}{12}$
  • $3,4,5$ with probability $\frac{1}{12}$
  • $4,5,6$ with probability $\frac{2}{12}$
  • $6,7,8$ with probability $\frac{2}{12}$
  • $7,8,9$ with probability $\frac{1}{12}$
  • $9,10,11$ with probability $\frac{3}{12}$

and there would be similar patterns for other cases. With $11$ items and selecting windows of length $5$, so an average probability of selection of $\frac{5}{11}\approx 0.4545$, you could achieve you could get actual probabilities of $\frac{2}{6}\approx 0.3333$ for three and $\frac3{6}=0.5$ for eight by choosing:

  • $1,2,3,4,5$ with probability $\frac26$
  • $2,3,4,5,6$ with probability $\frac16$
  • $6,7,8,9,10$ with probability $\frac16$
  • $7,8,9,10,11$ with probability $\frac26$