A man is in a room, with $n$ passages leading out. For passage $i$, $i = 1,...,n$, there is probability $p_i$ of escaping, $q_i$ of being killed and $r_i$ of returning to the room, where $p_i + q_i + r_i = 1$. In which order should the man try the passages so as to maximise his chances of eventual escape?
I am trying to formulate this as a dynamic programming problem, but am not sure how to associate costs with it. I am also not sure if I need to use discounting for the "death" scenario.
Any help would be greatly appreciated.
The outcomes of this process are strings of $P_i$, $Q_i$, $R_i$, where $P_i$ means escape through passage $i$, $Q_i$ means die through passage $i$, and $R_i$ means return through passage $i$. The string terminates as soon as we hit $P_i$ or $Q_i$.
So a typical outcome, if $n=4$, might look like $R_1 R_3 R_4 R_2 Q_2$. If we associate $P_i$ with $p_i$ and so on then, by independence, the probability of each outcome is simply the string interpreted as a product: $$\Pr(R_1 R_3 R_4 R_2 Q_2) =r_1 r_3 r_4 r_2 q_2. $$
(It is also possible to have an infinite string of $r$'s, but if $r_i<1$ for each $i$ then the probability of these are all 0.)
Now let us ask all the ways we can escape in at most 2 moves. That's all the strings $P_i$ and all the strings $R_iP_j$. These pair up into the two escape outcomes of strategies $(i,j)$ where $(i,j)$ means first try the $i$th passage then try the $j$th passage. Now the probability of escape with strategy $(i, j)$ is $$\Pr(P_i)+\Pr(R_iP_j) = p_i + r_i p_j. $$
By the same reasoning, the probability of escape with strategy $(i_1, i_2, i_3, i_4, \ldots)$ is $$ p_{i_1} + r_{i_1} p_{i_2} + r_{i_1} r_{i_2} p_{i_3} + r_{i_1} r_{i_2} r_{i_3} p_{i_4} +\cdots, $$ an infinite series. So in general the problem seems a bit intractable to me.
If each passage can only be tried once then you need only consider $(1, 2, 3, \ldots, n)$ and all its permutations. A computer program could compute all those sums and select the maximum. However, as $n$ increases the number of such strategies ($n!$) increases very quickly.