Expectation of Negative Multinomial Distribution

284 Views Asked by At

If a trial consists of throwing an n-sided fair die having numbers a1,a2,a3...,an on its faces. What will be the expected number of trials required before we get atleast k1 times a1, k2 times a2,....kn times an.

I think it can be modelled as the expected value of negative multinomial distribution because each individual follows a multinomial distribution. In the simpler case where the trial is binomial, we can model "The expected number of trials required before we get k successes" as negative binomial.

An example for understanding...suppose that there is 3 sided dice with numbers 1,2 and 3 and I want to know the expected number of trials before I get to see say 4 1s, 5 2s and 6 3s.

PS: I cannot find any good free resource available on the net on Negative multinomial distributions

3

There are 3 best solutions below

3
On BEST ANSWER

I don't normally like to make two answers to the same question, but this approach is so different from my prior answer that it seems like the right thing to do. Again, I'll just discuss your example.

We can model the problem as a finite-state absorbing Markov chain. We represent the state of the system as an ordered triple $(i,j,k)$ with $0\leq i\leq4,\ 0\leq j\leq5,\ 0\leq k\leq6.$ This means that $i$ $1$'s, $j$ $2$'s, and $k$ $3$'s have been rolled, except that if $i=4$ it means that at least $4$ $1$'s have been rolled, and similarly when $j=5$ or $k=6$. The chain has $210$ states, and the state $(4,5,6)$ is the only absorbing state.

As explained on the Wikipedia page, there is an exact formula for the expected time to absorption. I wrote a python script to calculate it.

import numpy as np
from functools import reduce
import itertools
import sys

def product(seq):
    return reduce(lambda x,y:x*y, seq, 1)

def indexFunction(seq):
    s = [s+1 for s in seq]
    coeffs = [1]
    for t in s[:-1]:
        coeffs.append(t*coeffs[-1])

    def index(seq):
        z = zip(coeffs, seq)
        return sum(a*b for a,b in z)

    return index

def transitionFunction(seq):
    def trans(state, i):
        state = list(state)
        state[i] = min(state[i]+1, seq[i])
        return state
    return trans

def rolls(seq):
    n = product(s+1 for s in seq)  # number of states
    p = 1/len(seq)                         # probability of given roll
    states = itertools.product(*(range(s+1) for s in seq))
    index = indexFunction(seq)
    trans = transitionFunction(seq)

     # build transition matrix
    Q = np.zeros((n,n))  
    for s in states:
        source = index(s)
        for i in range(len(seq)):
            target = index(trans(s, i))
            Q[source, target] += p

     # expected time to absorption        
    Q= Q[:-1,:-1]         
    I = np.eye(n-1)
    N= np.linalg.inv(I-Q) 
    one = np.ones((n-1))
    return (N@one)[0]

seq= [int(arg) for arg in sys.argv[1:]]
for idx, t in enumerate(seq):
    print('%d occurs at least %d times'%(idx+1,t))
print(rolls(seq), "expected rolls")

Assuming this script is named rolls.py, then

python rolls.py 4 5 6

produces

1 occurs at least 4 times
2 occurs at least 5 times
3 occurs at least 6 times
21.389264801531347 expected rolls

so about $21.4$ rolls are required.

This script will work for any number of faces on the die, and any number of required occurrences, so long as the overall matrix doesn't get too big.

6
On

I'll just treat your example. I'm not sure how difficult it would be to express this in closed form; I haven't tried. We must distinguish between the cases where the last throw, the one that fulfills all the conditions, is a $1,2,$ or $3$. Suppose it is a $1$. Then we know that we threw $k\geq5$ $2$'s and $j\geq6$ $3$'s and that in the $k+j+3$ rolls prior to the last we rolled exactly $3$ $1$'s. We can make similar analyses when the last roll was a $2$ or a $3$. The expected number of rolls is $$\sum_{k=5}^\infty\sum_{j=6}^\infty(k+j+4){k+j+3\choose3,k,j}3^{-(k+j+4)}+\\ \sum_{i=4}^\infty\sum_{j=6}^\infty(i+j+5){i+j+4\choose4,i,j}3^{-(i+j+5)}+\\ \sum_{i=4}^\infty\sum_{k=5}^\infty(i+k+6){i+k+5\choose5,i,k}3^{-(i+k+6)} $$ where, of course, the first sum deals with the case where a $1$ is rolled last, the second where $2$ is last, and the third where $3$ is last.

0
On

Here is a paper that addresses the moments of the negative multinomial distribution:

https://doi.org/10.3390/mca28040085