roll some dice, each with different number of sides, what are the odds of that roll to happen?

465 Views Asked by At

Suppose you have dice each with 4, 6, 8, and again 8 sides, and you roll them. What are the odds of the result being 2 4 5 7?

The number 7 must come from an 8-sided die, 5 must come from a 6 or 8-sided die, and 2 may come from any of the given dice.

"Bruteforcing" this is kind of easy, here are all the possibilities:

4 6 8 8 : sided of the correspoding die

2 4 5 7
4 2 5 7
2 5 4 7
4 5 2 7
2 4 7 5
4 2 7 5
2 5 7 4
4 5 7 2

The answer is 8, (or should be, I did this on paper and it could be wrong).


If the roll is 1 5 8 8 then there is only 1 possible outcome, both the 8 sided dice are an 8, the 6 sided die is 5, and the last is 1.


What is the number of possible outcomes in a generic case? Is there a neat formula I could use?

I need all this because I want to optimize a script I'm writing, as currently it does not scale well with many dice. The number of dice and faces of each die are parameters to a function. It may be 10 20-sided dice or even a single hypotetic 1-sided die. What I need is a solution or formula that also considers the fact that a number could come from a subset of the dice. (copy and pasted from below)

2

There are 2 best solutions below

0
On

It's possible to compute the solution without an exponential-time enumeration of all possible dice rolls. We can do this by iterating through the possible numbers from highest to lowest (in this case, 8, 7, 6, 5, 4, 3, 2, 1), and computing the distribution of how many dice rolled that number using the binomial distribution. Then we can put a state transition function on top of that based on the rules of the game, in this case seeing whether the pool rolled the required amount of that outcome. Here's a fuller explanation of the algorithm.

I implemented this approach as part of my hdroller Python library. You can try this computation in your browser using this JupyterLite notebook.

import hdroller
from collections import Counter

class ContainsSubset(hdroller.EvalPool):
    def __init__(self, target_subset):
        # target_subset: a dict-like mapping outcomes to counts.
        # The result is True iff the pool rolls each outcome at least that many times
        # for all items in the pool.
        self.target_subset = target_subset
        
    def next_state(self, state, outcome, count):
        if state is None:
            state = True
        if self.target_subset.get(outcome, 0) > count:
            state = False
        return state
    
    def direction(self, *_):
        # No preferred direction.
        return 0 

# A pool of d4, d6, d8, d8, looking for a 2, 4, 5, and 7.
print(hdroller.standard_pool(4, 6, 8, 8).eval(ContainsSubset(Counter([2, 4, 5, 7]))))

Denominator: 1536

Outcome Weight Probability
False 1528 99.479167%
True 8 0.520833%

So 8 is correct in this case.

0
On

If you have many dice but the individual numbers are not too large (for example, working with dice up to $20$ sides, not $20000$ sides) you can form a combinatoric formula.

Sort the numeric values that were rolled and work from the highest value to the lowest value.

For each value, count the number of dice, $k$, that had that value showing. Then count the number of dice, $n$, that could have had that value showing, not including any dice that we already know have rolled to higher values. The number of ways that this particular value could occur on the given set of dice with the given list of values showing is then $\binom nk.$

For example, suppose we have one four-sided die, two six-sided dice, and four eight-sided dice, and suppose the numbers rolled are (in increasing order) $2, 4, 5, 5, 6, 6, 7.$

There is only one $7,$ but four dice it can occur on, so there are $\binom 41 = 4$ ways for the $7$ to occur.

There are two $6$s. There are five remaining dice on which they can occur: the two six-sided dice and the three eight-sided dice that are not already showing a $7.$ So there are $\binom 52 = 10$ ways for $6$ to occur.

There are two $5$s. There are three remaining dice on which they can occur: any of six- or eight-sided dice that are not already showing $6$ or $7.$ So there are $\binom 32 = 3$ ways for $5$ to occur.

The single $4$ has two remaining dice on which it can occur. So it can occur in $\binom 21 = 2$ ways.

The single $2$ now has only one die on which it can occur, so it can occur in $\binom 11 = 1$ way.

So in all there are $4\times 10\times 3 \times 2\times 1$ ways to roll these numbers.

For each numeric value, the number of dice it can be rolled on (for this count) is the number of dice that have that number of sides or greater minus the number of times greater values appear in the list. In the previous example, for the value $5$ there are six dice that have sufficient number of sides but greater values occur in the list three times ($6, 6, 7$), which is why there are only three remaining dice that could roll a $5.$


In some cases you may have multiple numeric values that can only occur on the same set of dice. In the example above, there are exactly six dice on which either a $5$ or a $6$ can appear, and when we subtract one die that must show a $7$ we have five dice left. The combined number of ways to show $5$ or $6$ is then the multinomial coefficient $$\binom 5{2\ 2\ 1} = \frac{5!}{2!2!1!},$$ where the terms $2, 2, 1$ come from two $5$s, two $6$s, and one remaining die. Note that $$ \frac{5!}{2!2!1!} = \frac{5!}{2!3!} \times \frac{3!}{2!1!}, $$ so in the end this is the same as the result we get from counting just the $6$s first and then the $5$s, but the $3!$ in one binomial cancels the $3!$ in the other.

I'm not sure if the use of multinomial coefficients has sufficient benefit to be worth including in a computer program.


Another way to do the calculation is a two-phase process. In the first phase, take the values from the list one at a time, starting with the largest, and for each value find how many dice, $n,$ remain that could show that value -- that is, how may have that number of sides or greater, minus the number of values already taken from the list. Form a product of the values of $n$ found by this process. For the example above, the product is

$$ 4 \times 5 \times 4 \times 3 \times 2 \times 2 \times 1 $$

since the $7$ can occur on four dice, the first $6$ can occur on five remaining dice, the second $6$ on four remaining dice, etc.

In the second phase we form a different product. Identify all the repeated groups of values in the list of values. If there are exactly $k$ occurrences of a particular value, put $k!$ in the product of this phase. In the example there are two repeated groups -- two $5$s and two $6$s -- so we have $k=2$ twice, and the product is $2! \times 2!.$

For the final result, divide the first product by the second product:

$$ \frac{4 \times 5 \times 4 \times 3 \times 2 \times 2 \times 1}{2! \times 2!}. $$

But this way tends to produce larger intermediate results than the other ways, which is generally not a good thing.


Of course if someone has already written an efficient library for computing things like this, and you can use that library, that would be an attractive solution.