I was running a procedure to be like one of those games were people try to guess a number between 1 and 100 where there are 100 people guessing.I then averaged how many different guesses there are.
from random import randint
def averager(times):
tests = list()
for _ in range(times):
l = [randint(0,100) for _ in range(100)]
tests.append(len(set(l)))
return sum(tests)/len(tests)
print(averager(100))
For some reason, the number of different guesses averages out to 63.6
Why is this?Is it due to a flaw in the python random library?
In a scenario where people were guessing a number between 1 and 10
The first person has a 100% chance to guess a previously unguessed number
The second person has a 90% chance to guess a previously unguessed number
The third person has a 80% chance to guess a previously unguessed number
and so on...
The average chance of guessing a new number(by my reasoning) is 55%. But the data doesn't reflect this.
Suppose that $n$ guesses are made. For $i=1$ to $100$, let $X_i=1$ if $i$ is not guessed, and let $X_i=0$ otherwise. If $$Y=X_1+X_2+\cdots +X_{100},$$ then $Y$ is the number of numbers not guessed.
By the linearity of expectation, we have $$E(Y)=E(X_1)+E(X_2)+\cdots+E(X_{100}).$$ The probability that $i$ is not chosen in a particular trial is $\frac{99}{100}$, and therefore the probability it is not chosen $n$ times in a row is $\left(\frac{99}{100}\right)^n$. Thus $$E(Y)=100 \left(\frac{99}{100}\right)^n.$$
In particular, let $n=100$. Note that $\left(1-\frac{1}{100}\right)^{100}\approx \frac{1}{e}$, so the expected number not guessed is approximately $\frac{100}{e}$. Thus the expected number guessed is approximately $63.2$, a result very much in line with your simulation.
In general, if $N$ people choose independently and uniformly from a set of $N$ numbers, then the expected number of distinct numbers not chosen is $$N\left(1-\frac{1}{N}\right)^N.$$ Unless $N$ is very tiny, this is approximately $\frac{N}{e}$, and therefore the expected number of distinct numbers chosen is approximately $N-\frac{N}{e}$. Note that the expected proportion of the numbers chosen is almost independent of $N$.