Probability that my set of dice 'wins'

700 Views Asked by At

My question is about probabilities. Given are two sets of regular, 6-sided, fair dice of at least one die (but no upper bound on the number of dice). Now, 1 of the sets is considered to be 'the winner'. This is determined with the following method:

Sort both sets on the number of eyes. Compare the dice of both sets 1-by-1 starting at the highest by pairing up the dice from both sets (compare 1st with 1st, 2nd with 2nd etc.). As soon as 1 die is higher than the other, then the respective set is the winner and the other dice are ignored. If both sets have an equal number of dice and all dice are equal, then it's a tie. If it's a tie except that one set still has dice left (is a bigger set), then that set is the winner.

Examples:

A: 6 5 5 (winner)
B: 6 5 4

A: 4 (winner)
B: 1 1 1

A: 6 5 2 2
B: 6 5 2 2
(tie)

A: 6 5 1 (winner)
B: 6 5

Question: How can I calculate the probability that a certain set wins after you throw all dice, given only the sizes of both sets?

Edit: The answer can either be a function with 2 inputs, or if this is not possible, an algorithm that calculates this. In practice, the number of dice will be very small (usually smaller than 5).

Edit 2:

Context: These are the rules for resolving a speed roll to determine initiative in a fight between gladiators in the board game called Spartacus. I'm programming a hobby project where I'm simulating such fights and I want to use the probability of either gladiator winning initiative. The probability is used in a Minimax algorithm to generate child positions of the starting position.

Disclaimer: I'm not a mathematician and I have little mathematical knowledge but I do know algebra etc. I'm looking for an answer in the form of an equation with 2 unknowns: sizes of both sets or an algoritm that accepts those inputs

5

There are 5 best solutions below

5
On

Getting an exact answer is going to be immensely difficult; in lieu of that, here's a very rough first-order approximation. I'm going to make one simplification: rather than a 6-sided die, I'm going to let the number of sides on the die 'go to infinity': player A picks $a$ real numbers uniformly between 0 and 1, while player B picks $b$ real numbers, and A wins if their largest number is greater than B's largest. The continuous nature of this version of the problem means that ties are 'impossible' (0 probability), so we don't have to worry about tiebreakers; we can just look at the largest value for each player.

Now, the cumulative distribution function of the minimum of $n$ real numbers chosen uniformly from [0,1] is well-known: it's just $F(x)=x^n$.

So how do we express the probability that a random sample from the distribution with CDF $F_A(x)$ is greater than a random sample from the distribution with CDF $F_B(x)$? Well, a 'measure' on A's distribution is $dF_A(x) = F'_A(x)dx$; and given any point $t$ in [0,1], the probability that a sample from B's distribution is less than t is $F_B(t)$ by definition! So the probability that a sample from A is greater than than a sample from B is $\int_{t=0}^1F_B(t)F'_A(t)dt$. Integration by parts tells us that the probability that a sample from B is greater than a sample from A is exactly 1 minus this: $\int_{t=0}^1F_A(t)dF_B(t) = \left[F_A(x)F_B(x)\right]\big|_{x=0}^1 - \int_{t=0}^1F_B(t)dF_A(t)$, and the quantity in braces is $F_A(1)F_B(1)-F_A(0)F_B(0) = 1-0 = 1$.

Now, if $A$ has $a$ dice and $B$ has $b$ dice, then $F_A(x)=x^a$ and $F_B(x)=x^b$, so $F'_A(x) = ax^{a-1}$. For the integral we have $\int_{t=0}^1t^bat^{a-1}dt$ $= \int_{t=0}^1at^{a+b-1}dt$ $= \left(a\dfrac{t^{a+b}}{a+b}\right)\big|_{t=0}^1$ $=\dfrac{a}{a+b}$; in other words, A's winning chance is exactly equal to the proportion of the 'dice' that they have. This should provide a reasonable first approximation to the probability you need. Note that this won't be exact even in the cases where A and B can't tie: for instance, if A has four dice and B has three then A's winning probability must be some multiple of $\frac1{6^7}$ since that's the size of the probability space, but this approximation gives A's chances at $\frac47$, which can't be of that form since $7$ doesn't divide $6^7$.

ETA: as Erick Wong notes in a comment below, this approximation gets much worse in the regime where the number of dice is large compared to the number of values they can take on, since discrete effects come into play more strongly there (both A and B are likely to have rolled multiple sixes if A rolls 150 dice and B rolls 100, for instance, and A is much more likely to have rolled more than B than this estimate would suggest). Some quick numerical simulations suggest that the estimate is okay(ish) in the realm where a and b are comparable to the number of sides; for instance, it gives $\frac47\approx 0.57\ldots$ for the chance of A winning a 4:3 roll, whereas the actual probability seems to be slightly less than $0.61$. OTOH, for something like a 9:2 roll it gives A's probability as $\frac9{11}\approx 0.82\ldots$, whereas the actual probability seems closer to $0.92$

2
On

Well, this an ugly, brute force approach, but since you are programming brute force is fine since it makes an easy algorithm!

Let's start with a simple number of dice and then abstract to a formula. Say we have 3 dice. The odds of all 3 being 6s are $(\frac 1 6)^3$. This is the same as the odds of 3 of any of the same die roll. The odds of 2 6s and 1 5 are $3\cdot (\frac 1 6)^3$ since there are 3 different dice on which the 5 could occur. Again, this is the same patternfor every other 2 of 1 1 of another The odds of 6 5 4 (or any other individual set of 3 numbers) is $6 \cdot (\frac 1 6)^3$, 6 because the 3 dice can come in any order, so you have $3\cdot 2\cdot 1$ patterns

Generalizing now, if you have $k$ dice, we need to partition $k$ into how many times a specific face is duplicated. It's easier to sort by block sizeIn other words, we need numbers $a_1,a_2,a_3,a_4,a_5,a_6$ where each $a_i$ tells you how many times the $i'th$ largest block shows up. so $a_1$. So $a_1+a_2+a_3+a_4+a_5+a_6=k$ and each $a_1\geq a_2\geq a_3\geq a_4\geq a_5\geq a_6\geq 0$. Once we figure out a particular block pattern, all the distributions of that block pattern are equally likely (Ie if we have 10 dice and the block patterns are 4 of one face, 4 of another, and 2 of a third, and we would have $a_1=a_2=4, a_3=2, a_4=a_5=a_6=0$ you get the same chance for 4 5's, 4 3's and 2 1's as any other assignment of those dice)

Our biggest block can be anywhere from $\max\{1,\lceil \frac k 6\rceil \}$ to $k$ (we must have at least 1 die of biggest size, and for each time we pass a multiple of 6, we get a second die type of biggest size).

So, for each $n$ in $ \max\{1,\lceil \frac k 6\rceil \}\leq k$ we calculate the amount of ways to get $n$ of a specific size out of $k$ rolls, those can happen in $k \choose n$ ways out of $({\frac 1 6})^k$ die rolls.

Now we enter recursion land, if $a_1=k$ we are done, otherwise for possible sizes of $a_2$ we have $\max\{1,\lceil \frac {k-a_1} 6\rceil \}\leq a_2\leq k-a_1$

(same formula as before only taking away the $a_1$ dice already chosen.

We've already specified where the first $a_1$ dice are so the choices for each $n$ in this range are out of the pool $k-a_1$, giving us a probability for $a_2=n$ of ${k-a_1 \choose n}$, we can multiply this times the previous result by fundamental counting principle.

The same thing holds for each $a_i$, if $\sum_{j=1}^{i-1}a_i=k$ then $a_i=0$, else we have possible combinations for $a_i$ of $\max \{1, \lceil \frac {k-\sum_{j=1}^{i-1}a_i} 6 \rceil \} \leq n \leq k-\sum_{j=1}^{i-1}a_i$ the possible combinations are $k-\sum_{j=1}^{i-1}a_i \choose n$

Once again, you multiply each of the possibilities times each other for total combinations of that time, then then add a factor of $\frac 1 {6^k}$ to get probability. Each pattern shows up with that probability, so for chance $a_1=4,a_2=2$ with the rest 0, you would fill out an option for 4 6's, 2 5s all the way down to 4 2's, 2 1s. Then relabel them with a new set of numbers, $b_6,b_5,b_4,b_3,b_2,b_1$ where each $b_i$ is the number of die face $i$ that shows up (what you really care about). In this ordering, the 0s can show up anywhere.

Do the exact same procedure for the other person, replacing $j$ dice they roll with the $k$ the first person rolled.

Now, we create the events that person A rolled one of their options on the chart and B rolled one of their options, the probability of that event is the product of the two. You will have a lot of entries to compute, (#A)*(#B), which can get large if $j,k$ are large. Within each event, whoever has the higher $b_6$ wins, if tied, $b_5$, etc., if all tied then larger die size, which can be checked with a simple boolean statement.

Add up all the probabilities of the wins of A, the wins of B, and the ties, and you're done. Probably under a page of computer code, depending on how well the loops can be written.

5
On

The range of results $n$ dies produce is $n$ to $6n$. This means there are $6n - (n-1) = 5n + 1$ possible results for each die, meaning the total number of possibilities is $(5a + 1)(5b + 1)$. The number of times the smaller party wins is the sum of numbers from 1 to its number of dies - 1:

(showing ties)

enter image description here

We can see that wins for 'column' party are $1 + 2 + 3 + 4 + 5$, or

$$\sum\limits^{6c-1}\limits_{n=1} n$$

where $c$ is actually the size of the smallest party.

Dividing this, by the number of possibilities should yield the probability (unless I am an ignorant or have done something wrong).

So, let $c$ be the smallest groups size, $a$ be the first group's size and $b$ be the second group's size, following the logic previously stated(which might be incorrect) we get:

$$f(a, b) = \frac{\sum\limits^{6c-1}\limits_{n=1} n}{(5a + 1)(5b + 1)}$$

I tried it with the simple test case of a = b = 1 and it works, giving the answer $\frac{15}{36} = \frac{5}{12}$.

0
On

In the case that both players have an equal number of dice (sorry, this isn't a complete answer, but I did want to post my efforts for part of the question), clearly both players are evenly matched and do not have any advantage over each other. Hence, because of the symmetry their probabilities of winning must be equal. Let's say that probability is $p$. The outcome of the match for a player must be "win", "lose", or "tie". We have already concluded that the probability of winning and losing are each both $p$.

Let's say both players have $n$ die. Tieing occurs when both players roll the same distribution of dice (it does not have to be the same order). For any distribution of $a_i$ rolls of $i$, where $$i\in[\![ 1,6]\!]$$ $$a_i\in \mathbb{N}_0~\forall i$$ $$\sum_{i=1}^6 a_i=n$$

We have that the probability of any certain distribution of happening is $$\frac{\binom{n}{a_1,a_2,\ldots, a_6}}{6^n}$$ of it occuring in a single roll of a set of $n$ dice. For the same distribution to occur in both sets of rolls, the probability of this occurring is $$\frac{\binom{n}{a_1,a_2,\ldots, a_6}^2}{6^{2n}}$$ We want to sum all these probabilities over all possible distributions, which is $$p=\sum_{\sum_{i=1}^6 a_i=n}\left(\frac{\binom{n}{a_1,a_2,\ldots, a_6}^2}{6^{2n}}\right)$$ $$p=6^{-2n}\sum_{\sum_{i=1}^6 a_i=n}\binom{n}{a_1,a_2,\ldots, a_6}^2$$

Hence, the probability a player wins when both have the same number of dice is $$\frac{1-p}{2}$$ $$\boxed{\frac{1}{2}-\frac{1}{2\cdot 6^{2n}}\sum_{\sum_{i=1}^6 a_i=n}\binom{n}{a_1,a_2,\ldots, a_6}^2}$$

It looks like that sum does not have an explicit formula (but you can try manually computing for small $n$), but according to theorem 4 from this paper, we have that for large $n$, $$\sum_{\sum_{i=1}^6 a_i=n}\binom{n}{a_1,a_2,\ldots, a_6}^2\approx 6^{2n+3}(4\pi n)^{-\frac{5}{2}}$$ Hence, the probability of winning approaches $$\frac{1-6^3(4\pi)^{-\frac{5}{2}}n^{-\frac{5}{2}}}{2}$$ $$\approx \frac{1-.386n^{-\frac{5}{2}}}{2}$$ This also shows that the probability of tieing approaches $0$.

3
On

Here is an alternate, very rough approximation for large $a,b$. Another answer establishes that the probability of ties converges to $0$. Likewise I’d expect that the probability of a tie on just $6$s is also small. In that case we can just model the difference in the number of sixes as a random variable normally distributed with mean $(a-b)/6$ and variance $5(a+b)/36$.

The probability that this is positive (i.e. that the first player wins by having more 6s) is the same as a standard normal variable being $< (a-b)/\sqrt{5(a+b)}$. This cumulative distribution can be computed using the erf or erfc function in many programming languages. But I wouldn’t trust it for single digit values of $a,b$.