Problem
We have three candy machines: call them G (good), B (bad) and M (mixed) . G always gives you a candy when you put 1\$. B never gives you a candy when you put 1\$. M gives you a candy with probability $1/2$ when you put 1\$. You want a candy and you approach the three machines but you don't know which one is G, B or M. You use the following strategy. You approach a random machine. If you don't get a candy in $n$ trials, you change a machine. If you don't get a candy from the second machine in $k$ trials, you change to the remaining machine. At most, we can pay for a candy $n+k+1$ \$. We want to calculate the expected cost of obtaining a candy using this strategy. Is it the case that $k=n=1$ is the optimal strategy, i.e. yielding the least expected cost of obtaining a candy?
Attempt of solution
Currently, my thinking is the following. Consider the indicator random variable $X_i = 1$ meaning that we pay 1\$ at stage $i$. Observe that:
- $P(X_1 = 1) = 1$ (we always pay 1\$ at the beginning)
- $P(X_2 = 1) = \frac{1}{3} + \frac{1}{3}\frac{1}{2}$ (you pay at stage $2$ iff you don't receive a candy at stage $1$ which happens either when you choose $B$ with probability one third or you choose $M$ with probability one third and then $M$ doesn't give you a candy with probability one half)
- $P(X_3 = 1) = \frac{1}{3} + \frac{1}{3}(\frac{1}{2})^2$ (similar justification as previously but keeping in mind that you consider an event of not obtaining a candy at stages $1$ and $2$)
- $\dots$
- $P(X_{n+1} = 1) = \frac{1}{3} + \frac{1}{3}(\frac{1}{2})^n$
- $P(X_{n+2} = 1) = \frac{1}{6} \frac{1}{2} + \frac{1}{6}(\frac{1}{2})^n$ (= the probability that we don't receive a candy at stages up to $n+1$. This could happen either when we first happen to choose B and then M, in which case the probability of not obtaining a candy in $n+1$ trials is $\frac{1}{6} \frac{1}{2}$, or when we first choose M and than B, in which case the probability of not obtaining a candy in $n+1$ trials is $\frac{1}{6}(\frac{1}{2})^n$)
- $\dots$
- $P(X_{n+k+1} = 1) = \frac{1}{6} (\frac{1}{2})^k + \frac{1}{6}(\frac{1}{2})^n$
We are interested in $E(\Sigma_{i=1}^{n+k+1} X_i)$. Using linearity of expectation, we may write:
$$E(\Sigma_{i=1}^{n+k+1} X_i) = 1 + E(\Sigma_{i=2}^{n+1}X_i) + E(\Sigma_{j=n+2}^{n+k+1}X_j) = $$
$$=1 + \Sigma_{i=1}^n(\frac{1}{3} + \frac{1}{3}(\frac{1}{2})^i) + \Sigma_{j=1}^k(\frac{1}{6}(\frac{1}{2})^j + \frac{1}{6}(\frac{1}{2})^n)=$$
$$= 1 + \frac{1}{3}n + \frac{1}{3}\Sigma_{i=1}^n\frac{1}{2^i} + k \frac{1}{6} \frac{1}{2^n} + \frac{1}{6}\Sigma_{j=1}^k \frac{1}{2^j}$$
Now, we can easily observe that when $n$ is fixed and we make $k$ bigger, $E$ also grows. This excludes strategies $n < k$ from being optimal. We can go somewhat further by observing that:
$$ \Sigma_{i=1}^n\frac{1}{2^i} = \frac{1 - \frac{1}{2}^{n+1}}{1 - \frac{1}{2}}- 1 = 1 - (\frac{1}{2})^n$$
This gives you:
$$E(X) = 1 + \frac{1}{3}n + \frac{1}{3}(1 - (\frac{1}{2})^n) + k \frac{1}{6} \frac{1}{2^n} + \frac{1}{6}(1 - (\frac{1}{2})^k)=$$
$$ 1 + \frac{1}{3}n + \frac{1}{3} - \frac{1}{3}(\frac{1}{2})^n + k \frac{1}{6} \frac{1}{2^n} + \frac{1}{6} - \frac{1}{6}(\frac{1}{2})^k$$
$$6 E(X) = 9 + 2n - 2 (\frac{1}{2})^n + k \frac{1}{2^n} - (\frac{1}{2})^k$$
$$6 E(X) = 9 + 2n + (k-2)\frac{1}{2^n} - (\frac{1}{2})^k$$
This makes it clear that when you keep $k$ fixed and make $n$ bigger, $E$ grows as well. This excludes strategies $k < n$ from being optimal. So the optimal strategy should satisfy $n=k$. So let's assume that in the equation for $E$:
$$6 E(X) = 9 + 2n + (n-2)\frac{1}{2^n} - (\frac{1}{2})^n=$$
$$ 9 + 2n + (n-3)\frac{1}{2^n}$$
I will not go further now, but, at least know, it's clear for me that the right hand side of the above equation assumes minimal value for $n=1$ which yields the strategy $n=k=1$ as optimal.
Clearly you always want $k = 1$. If you have tried two machines and gotten candy from neither, the last one must be the Good one, so there's no reason not to try that next. This leaves our choice of $n$. Intuitively it's clear that this one should be 1 as well: after failing to get a candy from the machine once, it's either Bad or Mixed with probability 1/2, while both other machines are Good with probability 1/2, so those machines definitely appear more appealing. But let's make this formal.
The strategy $n = k = 1$ nets you an expected value of $5/3$ tries until you obtain a candy. Now suppose that $n \geq 2$. Write $X$ for the number of tries, and $B, M, G$ for the event that the first machine you try is the Bad, Mixed, Good one. Then: $$ \begin{align*} \mathbb{E}(X) &= \underbrace{\mathbb{E}(X \mid B)}_{\geq n + 1}\mathbb P(B) + \underbrace{\mathbb{E}(X \mid M)}_{> 1}\mathbb P(M) + \underbrace{\mathbb{E}(X \mid G)}_{=1}\mathbb P(G)\\ &>(n+1)\frac13+ 1\times \frac13 + 1 \times \frac13\\ & \geq \frac53. \end{align*} $$ Here the "strictly greater than" on line 2 stems from the fact that the expected number of tries when the first machine is Mixed is actually strictly greater than 1. Thus $n = k = 1$ is optimal.