My question is about this Numberphile video. I'm sure many of you are aware of this channel. In it, Federico Ardila is talking about collecting football stickers.
The idea is there are 682 in total and you buy them in packs of five. Ardila deals with the question: how many stickers do you expect to have to buy to get all 682, assuming you can buy them one at a time?
He says: suppose you have $i$ stickers, and let $N_i$ be the expected number of stickers you need to buy to reach $i+1$ distinct stickers. If you buy $1$ sticker, there is a chance of $(682-i)/682$ that it is new and you are done, and a chance of $i/682$ that it is a duplicate, in which case you have used up one purchase and still need to buy $N_i$ more stickers. So according to Ardila,
$$N_i = 1 \times \frac{682-i}{682} + (1 + N_i) \times \frac{i}{682}$$
Leading to
$$N_i = \frac{682}{682-i}$$
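For reference, the algebra behind this closed form can be spelled out. This is a sketch that assumes the sticker just bought is counted in both branches (new sticker or duplicate), so that the expectation equation simplifies to $N_i = 1 + \frac{i}{682} N_i$:
$$N_i = 1 + \frac{i}{682} N_i \;\Longrightarrow\; N_i\left(1 - \frac{i}{682}\right) = 1 \;\Longrightarrow\; N_i = \frac{682}{682-i}$$
This agrees with the mean $1/p$ of a geometric distribution with success probability $p = (682-i)/682$.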
So if you start with zero stickers and want all 682, you just sum $N_i$ over $i = 0, \dots, 681$, so
$$N_{total} = \sum_{i=0}^{681}\frac{682}{682-i}$$
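Substituting $k = 682 - i$ rewrites this sum as a multiple of a harmonic number, which may be easier to evaluate or estimate. Here $H_n$ (the $n$-th harmonic number) and $\gamma \approx 0.5772$ (the Euler-Mascheroni constant) are notation introduced for this remark only:
$$\sum_{i=0}^{681}\frac{682}{682-i} = 682\sum_{k=1}^{682}\frac{1}{k} = 682\,H_{682}, \qquad H_n \approx \ln n + \gamma + \frac{1}{2n}$$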
However, Ardila claims (at 9:04 in the video) that he has worked this out on his computer and that $N_{total} = 4844$. When I work it out on mine, I arrive at $N_{total} = 4560$. I'm using this snippet of Python code:

    sum = 0
    numCards = 682
    for i in xrange(numCards):
        sum = sum + numCards / (numCards - i)
    print sum

I suspect that either I'm doing something wrong (which seems likely, given that Ardila does this for a living and I don't), or he took the result from a formula that he evaluated incorrectly on a computer; you can see the 4844 pop up again at 12:40 in the video. Specifically, he uses a formula from a paper on the Double Dixie Cup Problem written by Donald J. Newman and Lawrence Shepp.
Am I doing something wrong?
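One way to check a discrepancy like this is to evaluate the same sum under both integer and floating-point division. The sketch below is written for Python 3, where `//` is floor division (which is what Python 2's `/` does when both operands are integers) and `/` is true division. The function names are made up for illustration.

```python
NUM_STICKERS = 682  # album size from the question


def expected_purchases_floor(n):
    """Sum n // (n - i) for i = 0..n-1, truncating every term to an integer."""
    return sum(n // (n - i) for i in range(n))


def expected_purchases_float(n):
    """Sum n / (n - i) for i = 0..n-1 using true (floating-point) division."""
    return sum(n / (n - i) for i in range(n))


if __name__ == "__main__":
    print(expected_purchases_floor(NUM_STICKERS))  # 4560
    print(expected_purchases_float(NUM_STICKERS))  # roughly 4844.23
```

With floor division every term loses its fractional part, which is enough to account for a gap of a few hundred between the two totals.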
It turns out I was using integers, so Python 2's `/` performed integer division and truncated each term of the sum. If I change my program to the code below, I get $4844.23118792$: