What algorithms exist, if any, for pseudo-random dice rolling with non-binomial aggregate properties?


I wasn't sure whether to address this to CS SE, StackOverflow or Math SE, but here goes...

What algorithms exist, if any, for (pseudo-)random dice rolls such that some aggregate properties across many rolls are obeyed? For example:

I want to roll a 3-sided die (faces 'A', 'B', 'C') such that, if rolled 100 times:

  1. the expected numbers of 'A', 'B', 'C' are 90, 8, 2 respectively;
  2. the number of 'A' rolls will be between 89 and 91 with probability 67%,

or some other similar specification. I am still looking for algorithms that provide a random(-looking) single roll, but over many rolls I do not want the cumulative results to follow a binomial distribution; I want them to follow one like I've specified. What should I look into for such pseudo-random rolling algorithms? (Pseudo-code, or actual code in, say, R, Python or Mathematica, would also be much appreciated.)

EDIT: I know how to satisfy property (1); it's (2) I'm interested in.


There are 3 best solutions below

4
On

For (1), choose a random integer between $1$ and $100$. Call the result $A$, $B$ or $C$ using the intervals $[1,90], [91,98], [99,100]$.
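
A minimal sketch of this interval lookup, assuming Python's standard `random` module; the names `face_for` and `roll_once` are invented here for illustration. Because the rolls are independent, it addresses property (1) only.

```python
import random

def face_for(n):
    """Map an integer in [1, 100] to a face using the intervals
    [1, 90] -> A, [91, 98] -> B, [99, 100] -> C."""
    if n <= 90:
        return "A"
    if n <= 98:
        return "B"
    return "C"

def roll_once(rng=random):
    # randint includes both endpoints, so n is uniform on 1..100.
    return face_for(rng.randint(1, 100))

rolls = [roll_once() for _ in range(100)]
print("".join(rolls))
```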

I know that's not the algorithm you ask for.

Intuition suggests that you cannot guarantee (2) with any algorithm. What coin tossing algorithm would you use to ensure between $49$ and $51$ heads with a predetermined probability in $100$ tosses?

Response to comment.

Consider a random permutation of a sequence containing $90$ A's, $8$ B's and $2$ C's. That will give you the exact frequencies for (1) and always keep the number of A's between $89$ and $91$, so (2) holds with probability $1$ rather than $67\%$. This will not serve your purpose if the customer for these dice rolls knows your algorithm and can "count cards".
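
The permutation idea can be sketched as follows, assuming Python's standard `random` module; the function name `shuffled_block` and its `counts` parameter are made up for this example. Every block of 100 rolls then contains exactly 90 A's, 8 B's and 2 C's.

```python
import random

def shuffled_block(counts=None, rng=random):
    """Return one block of rolls with exactly the given face counts,
    in uniformly random order."""
    if counts is None:
        counts = {"A": 90, "B": 8, "C": 2}
    # Build the multiset of faces, then permute it in place.
    block = [face for face, k in counts.items() for _ in range(k)]
    rng.shuffle(block)
    return block

block = shuffled_block()
print("".join(block))
```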

3
On

Please see if the below helps:


While you cannot escape the CLT, you can reduce the variance of the individual random variables you add.

Translating your requirement, you want a 67% interval: the sum of the 100 RVs should lie between 89 and 91. For a normal distribution this corresponds to a z value between $-0.97$ and $0.97$. Therefore the standard deviation of the sum of the $100$ variables is $\frac{1}{0.97} \approx 1.03$, for a variance of about $1.06$. Therefore each variable has to have variance about $0.0106$ (standard deviation about $0.103$) around a mean of $0.9$.

You can generate this directly with a normal RV generator, or you can generate multiple $\mathcal N(0.9,1)$ variables and take their mean. Since the mean of $n$ such variables has variance $\frac1n$, taking the mean of about $94$ of them gives a random variable $\sim \mathcal N(0.9,0.0106)$, where the second parameter is the variance.
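
The normal-approximation arithmetic can be checked numerically. This is only a sketch, assuming the question's 67% is read as a two-sided interval of half-width 1 around 90 and that the 100 variables are independent; `per_roll_sigma` is a name invented here, and `statistics.NormalDist` requires Python 3.8 or later.

```python
from statistics import NormalDist

def per_roll_sigma(prob=0.67, half_width=1.0, n_rolls=100):
    """Standard deviation each of n_rolls independent normal variables
    needs so that their sum stays within +/- half_width of its mean
    with probability prob (normal model)."""
    # Two-sided interval: P(|Z| < z) = prob  <=>  Phi(z) = 0.5 + prob / 2.
    z = NormalDist().inv_cdf(0.5 + prob / 2)
    sigma_sum = half_width / z          # standard deviation of the sum
    return sigma_sum / n_rolls ** 0.5   # variances add, so divide by sqrt(n)

sigma = per_roll_sigma()
print(f"per-roll sigma = {sigma:.4f}, per-roll variance = {sigma**2:.4f}")
```

The result is a per-roll standard deviation of roughly 0.1.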


Now the question arises of how to decide whether a throw is A or not. Let $X_n$ be the generated variables; I suggest you record $Y_n = 1-X_n$ and the running sum $S_n = \sum_{k \le n} Y_k$. As soon as $\lfloor S_n\rfloor$ changes value, you say "not A" and then use some other algorithm to decide between B and C (say by drawing $u\sim \mathcal U(0,1)$ and deciding B if $u <0.8$, and C otherwise).

This is also susceptible to "counting" and is not truly random. To make it slightly more random, if requirement (2) only has to hold at throw 100 and you have some leeway in between, you can say "not A" at some point while $S_n$ lies between "$integer-0.1$" and "$integer+0.1$". The trigger point can itself be random, for example "not A" when $S_n$ goes above $integer +r$, where $r\sim \mathcal U(0,1)$ is drawn afresh for each integer.


Here's some code as requested:

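import numpy as np

def normal(size=1, mean=0.0, sigma=1.0):
    # Always return an array so that callers can index it and take len().
    return mean + sigma * np.random.randn(size)

def ABCgenerator(size, mean, sigma, BCThreshold=0.5):
    Xn = normal(size, mean, sigma)
    Yn = 1 - Xn
    # Running sum S_n = Y_1 + ... + Y_n.
    Sn = np.cumsum(Yn)
    # The j-th "not A" fires once S_n exceeds j + r_j, with r_j ~ U(0, 1).
    thresholds = np.arange(size) + np.random.rand(size)

    outList = []
    j = 0
    for i in range(size):
        if Sn[i] > thresholds[j]:
            # Not A: choose B with probability BCThreshold, otherwise C.
            if np.random.rand() < BCThreshold:
                outList.append("B")
            else:
                outList.append("C")
            j += 1
        else:
            outList.append("A")
    return outList


if __name__ == "__main__":
    np.random.seed(8)
    # Arguments: number of rolls, mean of X_n, standard deviation of X_n,
    # and P(B | not A). With sigma = 0.103 the 100-roll sum of the X_n has
    # standard deviation 0.103 * sqrt(100) = 1.03.
    print(ABCgenerator(100, 0.9, 0.103, 0.8))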
0
On

If you identify the exact distribution you want for the A, B, C tally over $100$ rolls, you can use the following procedure:

  1. Make a random sample from the distribution of $100$-roll tallies; this is a triple $(a,b,c)$ with $a+b+c=100$ that tells us the number of each roll we get.
  2. For the next $100$ rolls, choose A with probability $\frac{a}{a+b+c}$, similarly for $b,c$. Then reduce the corresponding number in the triple by $1$.
  3. When the triple hits $(0,0,0)$, go back to step $1$.

This can satisfy both the conditions and additionally has the appealing feature that all rolls are identically distributed (but not independent).
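
The procedure above can be sketched in Python as follows. The class name `BlockRoller` and the `sample_tally` callback are hypothetical; the callback stands in for whatever distribution of $100$-roll tallies you settle on.

```python
import random

class BlockRoller:
    """Rolls A/B/C by drawing without replacement from a sampled tally."""

    def __init__(self, sample_tally, rng=random):
        self.sample_tally = sample_tally  # returns a triple (a, b, c)
        self.rng = rng
        self.tally = [0, 0, 0]

    def roll(self):
        # Steps 1 and 3: draw a fresh tally whenever the current one is used up.
        if sum(self.tally) == 0:
            self.tally = list(self.sample_tally())
        # Step 2: pick a face with probability proportional to its remaining
        # count, then remove one roll of that face from the tally.
        k = self.rng.choices(range(3), weights=self.tally)[0]
        self.tally[k] -= 1
        return "ABC"[k]

# Usage with a fixed (degenerate) tally, just to show the mechanics.
roller = BlockRoller(lambda: (90, 8, 2))
rolls = [roller.roll() for _ in range(100)]
print("".join(rolls))
```

Passing the sampler in as a function keeps the block bookkeeping separate from the choice of tally distribution.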

Your conditions 1 and 2 don't specify the exact distribution we want. A very simple (but not very appealing) solution might be that the tally is uniformly chosen from the set $$\{(88,3,9), (89,2,9), (90,1,9), (90,3,7), (91,2,7), (92,1,7)\}.$$ You can check that this has expected values $(90, 2, 8)$, which are the requested ones with the labels B and C swapped, and that $89 \le a \le 91$ with probability $\frac23$.
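
The check can be done with exact arithmetic. A small sketch, assuming the six tallies are the ones listed in the code, each summing to $100$ (hence $(91, 2, 7)$ as the fifth entry), and each chosen with probability $\frac16$:

```python
from fractions import Fraction

tallies = [(88, 3, 9), (89, 2, 9), (90, 1, 9),
           (90, 3, 7), (91, 2, 7), (92, 1, 7)]

# Every tally must describe a full block of 100 rolls.
assert all(sum(t) == 100 for t in tallies)

weight = Fraction(1, len(tallies))  # uniform choice among the tallies
means = [sum(t[i] for t in tallies) * weight for i in range(3)]
prob_a = sum(weight for a, _, _ in tallies if 89 <= a <= 91)

print("expected (a, b, c):", [str(m) for m in means])
print("P(89 <= a <= 91):", prob_a)
```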

I recommend going for more variety, subject to your requirements, so that you mitigate the "counting problem". With the above distribution, for example, you know that there's going to be between $1$ and $3$ B's. If you've gotten towards the end of the block of $100$ and you've already seen $3$ B's, you know that no more are coming until the next block.

This issue is not going to be eliminated completely no matter what you do: the only way not to gain information about future rolls from past rolls is to have the rolls be independent, and we know that doesn't produce the right distribution for $a$. But we can mitigate it by having more possibilities.

For example, you could select $13$ rolls by having A come up with probability $\frac{3}{13}$, B with probability $\frac{2}{13}$, and C with probability $\frac{8}{13}$. Then add $87$ rolls that are guaranteed to be A to the tally. (I chose $87$ and $13$ by playing around with some numbers; the probability that a $\textit{Binomial}(13, \frac3{13})$ is between $2$ and $4$ is approximately $67.8\%$, which is close to what you want.)
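
The quoted binomial probability can be reproduced with a few lines of standard-library Python. This is only a check of the arithmetic; `binom_pmf` is a helper name made up here.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 13, 3 / 13
# 87 guaranteed A's plus X random A's lands in [89, 91] exactly when 2 <= X <= 4.
prob = sum(binom_pmf(k, n, p) for k in range(2, 5))
print(f"P(2 <= X <= 4) = {prob:.3f}")
```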

Here's some Mathematica code:

ABCTALLY = {0, 0, 0};

pickTally[] := {87, 0, 0} + RandomVariate[MultinomialDistribution[13, {3, 2, 8}/13]]

randomLetter[] := Module[{k},
  If[Total[ABCTALLY] == 0, ABCTALLY = pickTally[]];
  k = RandomChoice[ABCTALLY -> {1, 2, 3}];
  ABCTALLY[[k]]--;
  {"A", "B", "C"}[[k]]
]