Why is Matt Parker approximation of the average of the highest value of M N-sided dice more and more wrong for higher M?

66 Views Asked by At

In his video about the unexpected logic behind rolling multiple dice and picking the highest, Matt Parker, says that the formula in the end is

$$\frac{m}{m+1}\times n+0.5$$

where $m$ is the number of dice, and $n$ the number of sides on the dice.

I wrote a program that computes the actual values of the averages by simulating all the possible rolls for 20-sided dice (up to 8 dice because it takes too much more time after that). The values I get from my program are also very similar to Matt's values when he simulates the rolls, but those values differ by a greater value at every increase of $m$.

$m$ Matt's formula result Computed result Difference
1 10.500 10.500 0
2 13.833 13.825 0.008
3 15.500 15.487 0.013
4 16.500 16.483 0.017
5 17.166 17.145 0.021
6 17.642 17.617 0.025
7 18.000 17.970 0.030
8 18.277 18.244 0.033

Furthermore, if we take much higher values like $m=99$, we get an average of $20.3$, which is absurd because the average is then higher than the maximum rollable value.

So to me, the $+0.5$ part is totally hand-waived and isn't correct from a mathematical point of view.

So where is Matt wrong with the $+0.5$ part, and how to fix his formula so that it's accurate?

1

There are 1 best solutions below

1
On BEST ANSWER

You can find an expression for the expectation:

  • Probability highest is $i$ or less is $\left(\frac in\right)^m$

  • Probability highest is exactly $i$ is $\left(\frac in\right)^m-\left(\frac {i-1}n\right)^m$

  • Expected value of highest is $\sum\limits_{i=1}^n i\left(\left(\frac in\right)^m-\left(\frac {i-1}n\right)^m\right) = n - \frac{\sum\limits_{i=1}^{n-1} i^m}{n^m}$

and you will find that this last gives the values you have calculated with $n=20$.

That sum of powers is not in closed form. For each $m$ it can be expressed as a degree $m+1$ polynomial of $n$ with the terms starting (unchecked) $\sum\limits_{i=1}^{n-1} i^m=\frac1{m+1}n^{m+1} -\frac12 n^{m}+\frac{m}{12}n^{m-1}-\cdots$.

Plug that into the expected value and you get $n-\frac1{m+1}n +\frac12 -\frac{m}{12} n^{-1}+ \cdots$. If you keep $m$ fixed and increase $n$, then the terms $-\frac{m}{12} n^{-1}+ \cdots \to 0$ and so you might use $\frac{m}{m+1}n+\frac12$ as an approximation.

While that might be reasonable for large $n$ and small $m$, it is not going to be a reasonable approximation for fixed $n$ and increasing $m$ (i.e. more dice with the same number of sides), as you have noticed. Putting in one extra term to say $\frac{m}{m+1}n+\frac12-\frac{m}{12} n^{-1}$ works reasonably well for your $n=20$ and $1\le m\le 8$ example (for $m=8$ it gives about $18.4444$ compared to the more exact $18.4445$) but less so for $m=99$ (it gives $19.8875$ compared to the more exact $19.9937$ and you really need more terms). So this too is not a good approach for large $m$. Use the full sum of powers instead.