Reservoir sampling - understanding probabilities


I am reading about reservoir sampling (a method for selecting a random sample out of some data), but I cannot understand a few things about the probabilities I came across.

The article at blog says that if I'm picking 1000 elements from a random stream of data, the probability of picking the 1001st element of the stream is $1000/1001$.

Quoting:

With what probability after the 1001'th step should element 1,001 (or any element for that matter) be in the set of 1,000 elements? The answer is easy: 1,000/1,001.

I'm lost, why is it $1000/1001$? Shouldn't it be $1/1001$, since every element should be equally likely?

Another article here says the same thing in different words:

The probability of choosing any previous item was 1/N, and with N/(N+1) probability of staying that way. What does this mean?

I think this might be something really basic, but I'm not very good at probability, and I'm working on it.
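To make the quoted numbers concrete, here is a small simulation sketch that could be used to check them empirically. It assumes the articles describe the standard reservoir scheme (often called Algorithm R), which the quotes do not confirm, and it uses a scaled-down case of 10 slots and 11 stream elements instead of 1000 and 1001. The names `reservoir_sample` and `inclusion_frequencies` are made up for illustration.

```python
import random


def reservoir_sample(stream, k, rng):
    """Keep k items from a stream of unknown length (assumed Algorithm R)."""
    reservoir = []
    for i, item in enumerate(stream, start=1):
        if i <= k:
            # The first k items always fill the reservoir.
            reservoir.append(item)
        else:
            # Draw a slot index uniformly from 0..i-1. The new item is
            # kept only if the index falls inside the reservoir, which
            # happens with probability k/i.
            j = rng.randrange(i)
            if j < k:
                reservoir[j] = item
    return reservoir


def inclusion_frequencies(n, k, trials, seed=0):
    """Estimate how often each of the n items ends up in the sample."""
    rng = random.Random(seed)
    counts = [0] * n
    for _ in range(trials):
        for item in reservoir_sample(range(n), k, rng):
            counts[item] += 1
    return [count / trials for count in counts]


# Scaled-down stand-in for "1000 out of 1001": 10 slots, 11 elements.
freqs = inclusion_frequencies(n=11, k=10, trials=20000)
for item, freq in enumerate(freqs):
    print(f"item {item}: in sample {freq:.3f} of the time")
```

Under that assumption, every item, including the last one, should show up in the sample roughly $10/11$ of the time. The $1/11$ figure would be the chance of an item landing in one particular slot, or of being picked when the sample has size 1, not the chance of being somewhere among the 10 kept items.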