Find the optimal weighting distribution for a die given a set prize values

53 Views Asked by At

Say you have $B$ boxes, each of which contains a prize of value $v_b$. You have a $B$-sided die, and you win the value of box $b$ by rolling $b$. You get $R$ rolls, and so can collect multiple prizes, but you can’t win the same box more than once.

Given you know $R$, $B$, and all the box values $v_b$,how do you choose to weight your die as to optimize your expected total reward?

My attempts so far I've tried two approaches. Both share the definition of expected value:

$U = \sum_b \sum_r u_{b,r} $

Where

$u_{b,r} = v_b (1 - w_b)^{r-1}w_b$

The difference in my two (failed) methods is in the expression of the constraints. In one method, I let

$w_b = W_b/\sum_i W_i$

and find the critical points by differentiating with respect to every $W_b$ and setting to zero.

In the other method, I use Lagrange multiplier to express the constraint. But in both, I keep ending up with contradictions.

Let's try a reduced version at the problem: I'm even having a problem with the very simple case where R=1 and B=2 is. In this case, the expected value is

$U = v_1w_1 + v_2w_2$

and the constraint is

$w_1 + w_2 = 1$

The answer I'd expect is that $w_1=1$ if $v_1 > v_2$ and $w_1=0$ if $v_1 < v_2$ But, using the Lagrange multiplier method I get

$L = v_1w_1 + v_2w_2 - \lambda (1 - w_1 - w_2)$

and thus the useless set of constraints

$0=v_1 - \lambda$

$0=v_2 - \lambda$

$0=1 - w_1 - w_2$

It's been a very long time since I've had to think about multivariable calculus, so I suspect I'm doing something silly. Since this simplified function is just a plane, I'm not sure the differential method of finding maxima should work, even with the constraint. If I understand my problem with this simple case, maybe I can figure it out for the general problem.

Context I'm working on an analogous problem in the context of developing a resource distribution program.

1

There are 1 best solutions below

0
On

The attempt at a solution above is flawed, due to the fact that it doesn't take into account that each box can be only earned once. This creates a natural notion of stopping time within the problem that changes it's structure completely.

Let's keep this to $B=2$ now and generalize later. The weights of the die being $p_1, p_2$, we would like to calculate the expectation value of the winnings after $R$ attempts. After $R$ attempts it is still possible that the game has not ended yet; the strings of rolls (11...1), (22...2) satisfy the condition. These are also the only strings in the game that earn you less than $v_1+v_2$ (all the others eventually stopped before the R-th attempt yielding the full amount). The winnings EV is

$$\langle W_2\rangle=\sum_i W_i P_{i}=v_1 p(11...1)+v_2p(22...2)+(v_1+v_2)(1-p(11..1)-p(22...2))$$

or more explicitly $\langle W\rangle=v_1(1-p_2^R)+(1-p_1^R)v_2$

We can now use Lagrange multipliers on this to yield that the values that maximize the average winnings are

$$p_1=\frac{v_1^{\frac{1}{R-1}}}{v_1^{\frac{1}{R-1}}+v_2^{\frac{1}{R-1}}}~~, ~~p_2=\frac{v_2^{\frac{1}{R-1}}}{v_1^{\frac{1}{R-1}}+v_2^{\frac{1}{R-1}}}$$

For $B=3$ we consider the same setup and find the strings that do not terminate the game after $R$ attempts. These are comprised of the strings that contain only one or two of the 3 potential outcomes. Counting the probability of those strings is relatively simple and the expected winnings are

$$\langle W_3 \rangle=\sum_{i=1}^3 v_ip_i^R+(v_1+v_2)(p_1+p_2)^R+(v_2+v_3)(p_2+p_3)^R+(v_3+v_1)(p_3+p_1)^R+(v_1+v_2+v_3)(1-\sum_i p_i^R-(p_1+p_2)^R-(p_2+p_3)^R-(p_3+p_1)^R)$$

This already has become too difficult to maximize analytically. This can be generalized to a die of arbitrary number of sides $B$- just consider $S=$ all the $2^B-2$ subsets of the sets $\{1,2,..., B\}$ that are not empty and not the set itself and the winnings expected are

$$\langle W_B\rangle=\sum_{s\in S}\left[\left(\sum_{i\in s} v_i\right)\left(\sum_{j\in s}p_j\right)^R\right]+\left(1-\sum_{s\in S}\left(\sum_{i\in s}p_i\right)^R\right)\sum_{j=1}^{B}v_j$$

Still, other than the obvious situation when $v_1=v_2=...v_B$ where the maximization is achieved by $p_1=p_2=...=p_B=\frac{1}{B}$ for other winnings per box, the solution is not to my knowledge available analytically, and one has to resort to computational methods.