I have a question similar to this one.
If for a group of 85 riders, 40 riders fall over 24 races, then the probability of a rider falling in any single race is:
$$ 1 - (1 - p)^{24} = \frac{40}{85} $$
Where $ p \approx 0.0262 $.
However, using this probability, if I run 24 simulations (races) of 85 binomial draws (riders) with $p$ probability (replicated 10k times), a fall occurs, on average, $ \approx 53 $ times, not the expected $ \approx 40 $ times.
Linking back to this question, however, using their example parameters, in which there are 98 riders, with 4 falling over 52 races. If I run 52 simulations (races) of 98 binomial draws (riders) based on the probability of a rider falling in their example ($p \approx 0.0008$) (replicated 10k times), a fall does occur, on average, $ \approx 4 $ times, as expected.
I'm really struggling to understand what about my interpretation of $p$ is creating this issue with the simulation.
Thanks
R code for simulations:
> mean(replicate(10000, sum(replicate(24, rbinom(85,1,0.0262)))))
[1] 53.4684
> mean(replicate(10000, sum(replicate(52, rbinom(98,1,0.0008)))))
[1] 4.0213
The reason why you are not getting consistent results is because your simulation does not model what the parameter estimate $p$ actually represents. The original question is this:
As such, your simulation does not model this situation because yours counts falls, not riders. If a single rider falls multiple times over $r$ races, you count each instance of a fall by the same rider.
So the way you would properly simulate this scenario is to generate $n$ iid geometric random variables $G_1, \ldots, G_n$ with parameter $p$. $G_i$ represents the random number of races that rider $i$ will participate in until they experience their first fall. Under this model, the appropriate parametrization of $G_i$ is $$\Pr[G_i = g] = p (1-p)^{g-1}, \quad g \in \{1, 2, 3, \ldots \}.$$ Then count the number of $G_i$ that are less than or equal to $r$, meaning they fell at least once over $r$ races. This is your $X$, and you would simulate many observations of $X$. The quantity $X/n$ gives an estimated proportion of riders falling over $r$ races.
In your case, you have $n = 85$, $r = 24$, and you observed $X = 40$. Then your estimate of $p$ is $$\hat p = 1 - (1 - X/n)^{1/r} = 1 - (9/17)^{1/24} \approx 0.0261515.$$ When I simulate this in Mathematica using the code
I get the following: