An Interesting Probabilistic Game

303 Views Asked by At

I heard this brainteaser from someone else and found it is interesting. Can someone help me on solving this problem? Thanks so much for your help...

There are n slots totally. You have access to infinitely many balls and put balls one-by-one into the slots randomly - each ball has equal probability to get into each slot. You can stop early anytime. In the end your pay out will be like this: for each 1-ball slot, you are rewarded 1; for each k-ball slot (k>=2), you are penalized k; for each 0-ball slot you are rewarded nothing. What is your optimal strategy? And what is the expected value of the game under the optimal strategy?

2

There are 2 best solutions below

1
On

Let $Y$ be the payout. Let $Z$ be the count of balls. We want to find the value of $N$ when the expected payout given the count of balls used is a maximum.

$$\frac{\mathrm d~}{\mathrm d z}\mathsf E(Y\mid Z=z) = 0$$

Let $X_i$ be the count of balls in slot $i$. $$X_i\mid Z{=}z ~\sim~\mathcal{Bin}(z, 1/n)$$

Let $Y_i$ be the payout for that slot. $Y_i=\begin{cases} 1 & :X_i=1 \\ -X_i & :\textsf{otherwise}\end{cases}$

Then $$\begin{align} \mathsf E(Y\mid Z{=}z) ~=~& \sum_{i=1}^n \mathsf E(Y_i\mid Z{=}z) \\[1ex] ~=~& \sum_{i=1}^n \Big(\mathsf P(X_i{=}1\mid Z{=}z)-\mathsf P(X_i{\neq} 1\mid Z{=}z)~\mathsf E(X_i\mid X_i{\neq}1 , Z{=}z)\Big) \\[1ex] ~=~& \sum_{i=1}^n \Big(2~\mathsf P(X_i{=}1\mid Z{=}z)-\mathsf E(X_i\mid Z{=}z)\Big) \end{align}$$

...can you complete?

0
On

(I'm posting this as an answer to expand on my comments, because it's a little long)

Essentially, I think there's something very ambiguous in the statement of the problem.

It says "you can stop early any time". This could mean that you could also go on forever, as in adjust your strategy as you go, or that you have an upper bound on the number of moves you can make, determined by a predetermined strategy, but that you can still stop early.

In the latter case, I think stopping "early any time" doesn't make sense; because once you compute the overall EV of a predetermined strategy, the EV of the moves that come earlier than the optimal number of moves predetermined by your strategy is already taken into account by your EV calculation, so there would never be any value in "stopping early".

In other words, if we're talking about a semi-predetermined strategy in which we have flexibility with our lower bound of number of moves, the option of "stopping early" is useless and mentioning it explicitly doesn't make sense. If on the other hand we can adjust our strategy as we go, "stopping early" is already implied as an option and mentioning it is redundant. So that part is weird to me.

Let's say the problem states that you can keep going for as long as you want, implying adjusting your strategy as you go. Then I don't think you can speak of an optimal strategy in a vaccum. Essentially, you can't come up with an optimization beforehand for a non-predetermined strategy.

For example (same as my comment): Let's say you have 6 slots. The optimal predetermined strat is to place 2 balls, which yields EV of 4/3, after which the EV goes down more and more.

However, if you can adjust your strat as you go; let's say you get unlucky and your second ball goes into the same slot as the first. At this juncture you have a 1/6 chance of losing another point and a 5/6 chance of winning a point, so obviously you should place another ball. Let's say you get super unlucky and end up with 1000 balls in the same slot; you should place yet another ball because your EV is still exactly the same and positive.

In other words, the optimal strategy if you can adjust as you go is to just compute your EV after every move.

If instead we're constrained to a predetermined strategy, I think Graham's answer addresses that.