Probability of dice pool throw >= than given dice pool (ordered, unsorted)

125 Views Asked by At

Assume we have a game with two dice pools. Each dice pool consists of k n-sided dices (all dices are fair, i.e. each side has the same probability to show up). The first dice pool is rolled once and the player rolls the second dice pool r times.

A success is when a throw of the second dice pool results in each dice being greater or equal than the dice of the first dice pool at the specific index (all dice pools are kept unsorted, but the order is important). So the 1st dice of the 1st dice pool needs to be greater or equal than the 1st dice of the 2nd dice pool, and the 2nd dice of the 1st dice pool needs to be greater or equal than the 2nd dice of the 2nd dice pool, and so on on for each dice index.

How can we calculate the probability of a success dependant on r (throws of the second dice pool)?

Below are two ways to calculate the probabilities. The first is unfortunately not usable in my case as k is in the range of 32 and n is in the range of 256.

EDIT: r is evaluated in the range of $2^0$ to $2^{80}$.

The second approach seem to result in overestimated probabilities, while I am uncertain why this is the case.

EDIT2: As the parameters I have in my example are challenging to compute any estimation to approximate the probabilites are helpful as well.

Explicit Calculation

  • k = 3 (# dices in each pool);
  • n = 2 (# sides of each dice);
  • r = 4 (# throws of second pool)

In total there are 8 possible outcomes. The probability $p_j$ to roll each slot higher for each j outcome is:

1st dice 2nd dice 3rd dice $p_j$
1 1 1 1
1 1 1/2 1/2
1 1/2 1 1/2
1 1/2 1/2 1/4
1/2 1 1 1/2
1/2 1 1/2 1/4
1/2 1/2 1 1/4
1/2 1/2 1/2 1/8

The probability depending on the number of r throws for each outcome can be calculated with:

$$1 - (1 - p)^{r}$$

We can sum up the probabilities and divide by the number of outcomes, which gives the overall probability for this case.:

$$ P(success) = \frac{1}{n^k} \sum_{j=1}^{n^k}{1 - ( 1 - p_j)^r}$$

$$ P(success) \approx 0.785 $$

The explicit calculation does also match with an experiment simulation. Unfortunately, this is limited as it is infeasible to calculate for high number of dices and sides of each dice.

Approach to calculate the probability without explicit calculation

The idea is to split each dice into an independent event of each other. The probability that one dice rolls higher than a given dice can be written as:

$$ P_{dice} = \frac{n+1}{2n} $$

The assumption is that each dice of the dice pool is independent. Thus, we can accumulate the probabilities over the k dices in each pool:

$$ P_{dicepool} = {\left(\frac{n+1}{2n}\right)}^k $$

To calculate the probability depending on the throws $r$, we can use the formulae in the first calculation, which results to:

$$ P(success) = 1 - {\left( 1 - {\left(\frac{n+1}{2n}\right)}^k \right)}^r $$

$$ P(success) \approx 0.888 $$

This approach seems to overestimate $ P(success) $.

2

There are 2 best solutions below

7
On BEST ANSWER

For each $i\in \{1,\dots,r\}$, let $F_i$ be the event that the second pool fails to beat the first pool on the $i^\text{th}$ attempt. The error in your second method is that the events $F_1,\dots,F_r$ are not independent of each other, so you cannot just multiply their probabilities to get the probability of their intersection. This is because we are reusing the same rolls for the first die pool every time. If the both the first pool and second pools were rerolled each time, then you would see the $88.8\%$ figure in simulations.


In light of the crytographic context provided, I will refer to the first die pool as Alice, and the second as Eve. The following is an upper bound for the probability that Eve succeeds: $$ \boxed{P(\text{Eve wins})\le r\cdot \exp\left(-k\left(1- \frac{\ln n}{2n}\right)+5\sqrt k\right) +2^{-18}} $$ Let $Q$ be the probability that Eve beats Alice with a single attempt; my $Q$ is your $p_j$. This $Q$ is a random variable depending on Alice's initial roll. If $Q$ is fixed, then the probability Eve beats Alice in $r$ attempts is $$\mathrm P(\text{Even wins})=1-(1-Q)^r\le rQ.$$

For each $i\in \{1,\dots,n\}$, let $A_i$ be the number of numbers in $\{1,\dots,n\}$ which are greater than or equal to Alice's $i^{th}$ roll. Note $A_i$ is uniformly distributed between $1$ and $n$. For Eve to win in a single round, all of her dice need to be at least that of Alice's, so $$ Q=(A_1/n)\cdot (A_2/n)\cdots (A_k/n), $$ Taking logs, $$\ln Q =\sum_{i=1}^k \ln (A_i/n)$$ Our strategy is to find the mean and variance of each $\ln A_i$, which tells us the mean and variance for $\ln Q$. Since $\ln Q$ is a sum of independent random variables, we can approximate $\ln Q$ by a normal distribution, and use that to get good bounds on $Q$. Note $$ E[\,\ln (A_i/n)\,]=\frac1n\sum_{i=1}^n\log \frac{i}n=\frac1n \log\frac{n!}{n^n}\approx -1+\frac{\ln n}{2n} $$ The $\approx$ follows from Stirling's approximation, $n!\sim (n/e)^n\sqrt{2\pi n}$. To upper bound the variance, we use the fact that in general, $\text{Var} X\le E[(X-a)^2]$, for any $a\in \mathbb R$. Taking $a=-1$, $$ \text{Var}[\ln (A_i/n)]\le E[(\ln (A_i/n)+1)^2]=\frac1n\sum_{i=1}^n (\ln \tfrac in+1)^2\approx \int_0^1 (\ln x+1)^2\,dx=1 $$ Therefore, we have that $\ln Q$ is approximately normal with mean $k(1-\ln n/2n)$, and standard deviation $\sqrt k$. In particular, we can say $$ \ln Q\le -k\left(1-\frac{\ln n}{2n}\right)+5\sqrt{k}\qquad \text{with probability at least }1-2^{-18} $$ This gives the estimate advertised at beginning.

Note that my formula above will never give an estimate better than $2^{-18}$, since this is the probability $\ln Q$ is more than five standard deviations over its mean, and we get no guarantee on $Q$ when that happens. If $2^{-18}$ is not strong enough, then you can instead pick any positive number $t$, and write $$ P(\text{Eve wins})\le r\cdot \exp\left(-k\left(1- \frac{\ln n}{2n}\right)+t\sqrt k\right)+\frac1{t\sqrt{2\pi}}\exp(-t^2/2) $$

2
On

This answer has been edited, to correct my misinterpretation of the posting.


Assume that each of the group 1 dice are labeled $A_1, A_2, \cdots, A_k$ and that each of the group 2 dice are labeled $B_1, B_2, \cdots, B_k.$ Assume that each of the $2k$ dice have the faces represented by the set $\{1,2,\cdots,n\}$.

Assume that the group 1 dice are rolled once (only), and show the values $a_1, a_2, \cdots, a_k$ respectively. Assume that these group 1 dice are never re-rolled.

Assume that on each roll of the group 2 dice, the values of $b_1, b_2, \cdots, b_k,$ respectively, are shown.

Then, a totally successfull roll is where $b_1 \geq a_1, \cdots, b_2 \geq a_2, \cdots b_k \geq a_k.$

A (somewhat convoluted) closed form expression can be obtained for the probability that at least one of the $r$ rolls of the group 2 dice is totally successful, as follows:

You can assume that $n^k$ simulations are run, and that on each simulation, a unique ordered $k$-tuplet of $(a_1, a_2, \cdots, a_k)$ is generated.

So, you evaluate the probability of success on each of the $n^k$ simulations, add them all together, and then divide by $n^k.$


For a given simulation, assume that the group 1 dice are represented by the ordered $k$-tuplet $(a_1, a_2, \cdots, a_k).$

Let $f(a_1, \cdots, a_k)$ denote the probability that a specific roll of the group 2 dice is totally successful.

Then

$$ f(a_1, \cdots, a_k) = \prod_{i=1}^k \frac{n+1 - a_i}{n}.$$

Let $g(a_1, \cdots, a_k)$ denote the probability that in $r$ re-rolls of the group 2 dice, each of the re-rolls fails to be totally successful.

Then

$$g(a_1, \cdots, a_k) = \left[1 - f(a_1, \cdots, a_k)\right]^r.$$

Let $p(k,n,r)$ denote the probability that each of the $r$ re-rolls of the group 2 dice fails to be totally successful, given that $k$ $n$-sided dice are used for the group 1 and group 2 dice.

Then

$$p(k,n,r) = \frac{N}{n^k},$$

where

$$N = \sum_{a_1 = 1}^n ~\sum_{a_2 = 1}^n \cdots ~\sum_{a_k = 1}^n ~g(a_1, \cdots, a_k).$$

This algorithm should be routinely computable by a computer program. I suggest using floating point arithmetic with (for example) $10$ decimal places of accuracy. Then, the overall rounding error should be negligible.


Addendum

As has been discussed in the comments following the answer of Mike Earnest, while the above algorithm has the advantage of simplicity, it is probably not practical when (for example) you have $(n,k) = (256,32).$

That is, my PC does okay handling $(10)^7$ cases but starts to balk at handling $(10)^8$ cases. So, my PC will even have trouble with (for example) $(n,k) = (256,8).$

However, a minor adjustment to the algorithm allows the computation to be approximated for $(n,k) = (256,8).$

Consider the difference between computing $(n,k) = (256,8)$ and $(n,k) = (8,8)$. Examine the comparison between the two dice $A_1$ and $B_1$.

Assume (for example) that $a_1$ is in the range $[129,160]$. $\dfrac{7}{8}$-th of the time, $b_1$ will not be in that range. When $b_1$ is not in that range, the distinction between $n=256$ and $n=8$ is irrelevant.

So, when using $n=8$ to estimate the results, the question is : how should the $~\displaystyle \frac{8 + 1 - a_1}{8}$ computation be adjusted?

When $a_1$ and $b_1$ are both in the $[129,160]$ range, then the probability that $b_1 \geq a_1$ will be $\dfrac{33}{64}.$ In fact, the computation of $\dfrac{33}{64}$ will hold whenever $b_1$ and $a_1$ are in the same range of $32$ values.

Therefore, a reasonable adjustment that should allow $p(8,256,r)$ to be approximated by $p(8,8,r)$ is to alter the computation of $f(a_1,a_2,\cdots,a_8)$ to be

$$ f(a_1, \cdots, a_8) = \prod_{i=1}^8 \frac{8+\color{red}{\dfrac{33}{64}} - a_i}{8}.$$