What is the expected value of rolling $n$ dice and keeping the best $k$ of them?

535 Views Asked by At

Recently a friend posed me a problem:

You have $n$ dice, all of which have integer values ranging from $1$ to $j$, take on each of those values with probability $\frac{1}{j}$, and are independent of each other. If you roll those $n$ dice, then take the $k$ highest values that appear and sum them together, what is the expected value of this sum?

Now, when $k = 1$, this is simply the expected value of the maximum of $n$ i.i.d. random variables, which is not difficult to compute. But it becomes significantly trickier for even $k = 2$ (as then you are trying to maximize over the sum of two dice, which need not be independent), and so I am curious if anyone here has any good ideas.

Thank you in advance for your help!

3

There are 3 best solutions below

0
On BEST ANSWER

Thank goodness for Wikipedia and for Greg Martin letting me know what to look for. Using this page -- https://en.m.wikipedia.org/wiki/Order_statistic#Dealing_with_discrete_variables -- one can arrive at this answer:

$$\mathbb{E}(\text{Roll } X \text{ d}Y \text{ and keep } Z) =$$ $$\sum_{k=0}^{Z-1} \sum_{j=1}^Y j \sum_{l=0}^k \binom{X}{l}\Big(\Big(\frac{Y-j}{Y}\Big)^l\Big(\frac{j}{Y}\Big)^{X-l} - \Big(\frac{Y-j+1}{Y}\Big)^l\Big(\frac{j-1}{Y}\Big)^{X-l}\Big).$$

One quickly notices that $0^0$ is encountered in this sum - for our purposes, this is assigned a value of $1$.

Thank you for your help, all!

1
On

The term you're looking for is "order statistics". The expectation of the sum of the largest $k$ values is equal (by linearity of expectation) to the sum of the expectations of the highest $k$ order statistics.

Calculating those expectations in closed form is not trivial, but for your specific dice setup one could certainly write the answer as a summation and calculate it. Another example of estimating expectations of order statistics can be found in these notes.

0
On

We have $n$ iid variables $X = (X_1, \ldots, X_n)$ with $X_i$ uniform over $\{1,2, \ldots, d\}$. Let $U_i$ be the $i$-th smallest value of $\{X_1, \ldots, X_n\}$; fix $j$ and $k$. We will find $P(U_j = k)$. This completely determines the distributions of $U_j$; then we can determine the expectation of $U_j$ and finally the expectation of the sum of the $k$ largest, $U_{n-k+1} + \cdots + U_n$.

The strategy is divide-and-conquer. We look at all the possible ways we can have the $j-$th order statistic $U_j$ be $k$ and sum the probabilities.

Given samples $x = (x_1, \ldots, x_n)$, let $L_x = \{i: x_i < k\}$ and $R_x = \{i: x_i > k\}$, so that $x_i = k$ if $i \notin L, R$. Letting $|L_x| = l, |R_x| = r$, the $\operatorname{sorted}(x) = u = (u_1, u_2, \ldots, u_n)$ looks like

$$(\underbrace{u_1, \ldots, u_l}_{<k}, \underbrace{u_{l+1}, \ldots, u_{n-r}}_{=k}, \underbrace{u_{n-r+1} , \ldots, u_n}_{>k})$$ So we have $u_j = k$ iff $l +1 \leq j \leq n - r$, or: $l\leq j-1$, $r \leq n - j$. Since each $x$ uniquely determines $L_x, R_x$ we can write

\begin{align*}P(U_j = k) &= \sum_{L,R:|L| + 1 \leq j \leq n - |R|} P(L_X = L, R_X = R)\\ &= \sum_{l=0}^{j-1} \sum_{r=0}^{n - j} \binom{n}{l, r, n-l-r} p_{l,r}\end{align*}

where $p_{l,r} = P(L_X = L, R_X = R)$ for any $L,R$ with $|L| = l$, $|R| = r$ (the probability is the same for each by the iid assumption) and the multinomial coefficient $$\binom{n}{l, r, n-l-r}$$ is the number of possible disjoint $L,R$ with $|L| = l, |R| = r$.

For fixed, disjoint $L,R$ we have \begin{align*} P(L_X = L, R_X = R) &= P(X_i < k\, \forall i \in L,\, X_i > k\, \forall i \in R, X_i =k \forall i \notin L,R)\\ &= P(X_i < k\, \forall i \in L) \cdot P( X_i > k\, \forall i \in R) \cdot P(X_i =k\,\forall i \notin L,R)\\ \end{align*}

by independence. Then if $|L| = l, |R| = r$, we have $$P(X_i < k\, \forall i \in L) = ((k-1)/d)^l,$$ $$P(X_i = k\, \forall i \notin L,R) = (1/d)^{n - l -r},$$

$$P(X_i > k\, \forall i \in R) = ((d-k)/d)^r.$$

Thus $$P(U_j = k) = \sum_{l=0}^{j-1} \sum_{r=0}^{n - j} \binom{n}{l, r, n-l-r} \frac{(k-1)^l (d-k)^r}{d^n}. $$