MOTIVATION
I am considering investing a significant amount of money into a raffle. In order to decide the number of entries I purchase, I would like to find probability distributions for the number of prizes I will win with respect to the number of entries I purchase.
HOW THE RAFFLE WORKS
Total entries: 1000
Winning entries (# of prizes): 20
In practice, the raffle runs as 20 separate rounds of 50 entries each.
- Entries 1-50 have a 1/50 chance to win prize 1
- Entries 51-100 have a 1/50 chance to win prize 2
...
- Entries 951-1000 have a 1/50 chance to win prize 20
The entry numbers are purchased in order, so technically if I can get entries 1-50 then I have a 100% chance to win prize 1. However, I don't expect I will be able to do this since many people will be trying to buy entries at the same time. For simplicity, perhaps we can just assume that my entries will be evenly distributed across all 20 rounds (see BONUS below for my thoughts on how this change impacts the solution and please correct me if I am wrong).
INITIAL THOUGHTS
From some quick research, I think the estimate for my odds of winning AT LEAST ONE prize is approximately:
1 - [ (1000-n) / 1000 ]^20
where n = number of entries I purchase
WHAT I WANT TO KNOW
What I actually want is how to calculate the probability distribution of the number of prizes I win. So not just whether I win 1 prize or not.
Given n where n is the number of entries I purchase, I want to know the average (mean) number of prizes I should expect to win and the surrounding distribution. This way I can decide my risk tolerance and choose how many entries (n) it is worth it for me to buy.
BONUS
I mentioned we can simplify the problem by assuming my entries will be evenly distributed across all 20 rounds, but I am curious what the optimal strategy would be if I could choose my entry numbers.
For example, if n = 100 entries, is it best to buy entries 1-100 and have a 100% chance to win 2 prizes? Or would a more even distribution be better, for example 5 entries in each of the 20 rounds?
In other words, I could have:
- 100% chance to win in 2 rounds (win 2 prizes) and 0% chance to win in the other 18 rounds
- 10% chance to win in all 20 rounds
My understanding is that in both cases my expected number of wins is 2. The difference is that in the first case it is guaranteed, whereas in the second case I could get lucky and win more or unlucky and win less. Correct?
Extrapolating from that, it seems like the more evenly distributed the entry numbers are across rounds, the more uncertainty in the number of prizes I will actually win. However, the expected number (mean) of the distribution should always be the same. Is this true?
You are correct that the expected number of prizes is the same however you spread your entries: each entry adds 1/50 to the expected count, so n entries give an expected n/50 prizes (2 prizes for n = 100). What changes is the spread. Splitting your entries across many rounds is the higher-risk, higher-reward option, while concentrating them narrows the range of outcomes.
The thing is, as you stated earlier, you are unlikely to lock in a guaranteed win in both raffles 1 and 2. My rough guess is that the highest probability you will reach in any one raffle is about 50%, although this could vary widely.
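To make the point about the mean precise, here is a short derivation. The notation is mine, not the asker's: $k_i$ is the number of entries you hold in round $i$ (so $\sum_i k_i = n$), and $X_i$ is 1 if you win round $i$ and 0 otherwise. I am assuming each round draws one winner uniformly from its 50 entries, independently of the other rounds.

$$
\begin{aligned}
X &= \sum_{i=1}^{20} X_i, \qquad P(X_i = 1) = \frac{k_i}{50}, \\
E[X] &= \sum_{i=1}^{20} \frac{k_i}{50} = \frac{n}{50}, \\
\operatorname{Var}(X) &= \sum_{i=1}^{20} \frac{k_i}{50}\left(1 - \frac{k_i}{50}\right).
\end{aligned}
$$

For n = 100 the mean is 2 in every case, while the variance is 0 for 50 entries in each of 2 rounds, 1.6 for 10 entries in each of 10 rounds, and 1.8 for 5 entries in each of 20 rounds. So under these assumptions, spreading out never changes the average, only the uncertainty around it.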
5 Tickets in 20 Raffles
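Now, for some math. With 5 of the 50 tickets in a raffle, you win that raffle with probability 5/50 = 1/10, and the 20 raffles are independent, so the number of wins follows a binomial distribution. Let's calculate the probability that you get fewer than 2 wins.
For 1 win, it's $\binom{20}{1} \cdot (\frac{1}{10})^1 \cdot (\frac{9}{10})^{19} =$ 27.017%.
And for 0, it's $(\frac{9}{10})^{20} =$ 12.158%.
Adding them up, we get a total probability of 39.175%.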
The probability of getting exactly 2 wins when investing 5 in each is $\binom{20}{2} \cdot (\frac{1}{10})^2 \cdot (\frac{9}{10})^{18} =$ 28.518%, by the same reasoning.
Now, to calculate the probability of getting more than 2, we just add the probabilities from 0 to 2 and subtract that sum from 1.
The probability is 1 - 0.67693 = 32.307%.
Summing everything up,
The probability of getting fewer than 2 wins is 39.175%.
The probability of getting exactly 2 wins is 28.518%.
The probability of getting more than 2 wins is 32.307%.
As you can see, getting fewer than 2 wins is actually more likely than getting more than 2.
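If you want to double-check the three figures above, or redo them for a different number of tickets per raffle, a minimal Python sketch of the same binomial calculation follows. The function name `binomial_pmf` and the variable names are my own, and the code assumes, as above, that each raffle is won independently with probability 5/50.

```python
from math import comb

def binomial_pmf(k, rounds, p):
    """Probability of exactly k wins in `rounds` independent raffles,
    each won with probability p."""
    return comb(rounds, k) * p**k * (1 - p) ** (rounds - k)

rounds = 20
p = 5 / 50  # assumption: 5 of the 50 tickets in every raffle

below = binomial_pmf(0, rounds, p) + binomial_pmf(1, rounds, p)  # fewer than 2 wins
exactly = binomial_pmf(2, rounds, p)                             # exactly 2 wins
above = 1 - below - exactly                                      # more than 2 wins

print(f"P(<2) = {below:.3%}, P(=2) = {exactly:.3%}, P(>2) = {above:.3%}")
```

Changing `rounds` and `p` (for example `rounds = 10`, `p = 10 / 50`) gives the figures for other ways of splitting the same 100 tickets.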
10 Tickets in 10 Raffles
Now, we calculate the probabilities for when you enter 10 raffles with 10 tickets each.
The reasoning is the same, but now there are only 10 raffles in play and you win each one with probability 10/50 = 1/5.
For 1 win, it's $\binom{10}{1} \cdot (\frac{1}{5})^1 \cdot (\frac{4}{5})^9 =$ 26.844%.
And for 0, it's $(\frac{4}{5})^{10} =$ 10.737%.
Therefore, the probability of getting under 2 wins is 37.581%.
Getting exactly 2 wins is 30.199%.
And getting more than 2 wins is 1 - 0.67780 = 32.22%.
Summing everything up,
The probability of getting less than 2 wins is 37.581%.
The probability of getting exactly 2 wins is 30.199%.
The probability of getting more than 2 wins is 32.22%.
As you can see, investing 5 tickets in 20 raffles gives you a higher chance of getting fewer than 2 wins, but also a higher chance of getting more than 2 wins. However, the gap between the two strategies in the chance of fewer than 2 wins (about 1.6 percentage points) is much larger than the gap in the chance of more than 2 wins.
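Since the original question asks for the whole distribution and its mean for any number of tickets, here is a rough Python sketch that handles an arbitrary split of tickets across raffles, including uneven ones. The names `win_distribution` and `summarize` are hypothetical helpers of mine, and the code assumes each raffle draws one winner uniformly from its 50 tickets, independently of the others.

```python
def win_distribution(entries_per_round, round_size=50):
    """Return a list dist where dist[k] = P(exactly k prizes).

    entries_per_round[i] is the number of tickets held in raffle i.
    Assumption: one uniformly drawn winner per raffle, raffles independent.
    """
    dist = [1.0]  # before any raffle is drawn: zero wins with certainty
    for held in entries_per_round:
        p = held / round_size
        nxt = [0.0] * (len(dist) + 1)
        for k, prob in enumerate(dist):
            nxt[k] += prob * (1 - p)  # lose this raffle
            nxt[k + 1] += prob * p    # win this raffle
        dist = nxt
    return dist

def summarize(dist):
    """Mean and variance of the number of prizes."""
    mean = sum(k * q for k, q in enumerate(dist))
    var = sum((k - mean) ** 2 * q for k, q in enumerate(dist))
    return mean, var

strategies = {
    "5 in each of 20 raffles": [5] * 20,
    "10 in each of 10 raffles": [10] * 10 + [0] * 10,
    "50 in each of 2 raffles": [50] * 2 + [0] * 18,
}
for name, alloc in strategies.items():
    dist = win_distribution(alloc)
    mean, var = summarize(dist)
    print(f"{name}: mean = {mean:.3f}, variance = {var:.3f}, "
          f"P(>2) = {sum(dist[3:]):.3%}")
```

To explore a different budget, pass any list of 20 ticket counts (each between 0 and 50) to `win_distribution` and read off the probabilities you care about for your own risk tolerance.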
Using this data, make your own decision! Hope you win more than 2, at least :D
-FruDe
P.S. This was my first ever math answer on StackExchange, tell me what you think!