Dynamic programming problem

2k Views Asked by At

A man is in a room, with $n$ passages leading out. For passage $i$, $i = 1,...,n$, there is probability $p_i$ of escaping, $q_i$ of being killed and $r_i$ of returning to the room, where $p_i + q_i + r_i = 1$. In which order should the man try the passages so as to maximise his chances of eventual escape?

I am trying to formulate this as a dynamic programming problem, but am not sure how to associate costs with it. I am also not sure if I need to use discounting for the "death" scenario.

Any help would be greatly appreciated.

3

There are 3 best solutions below

0
On

The outcomes of this process are strings of $P_i$, $Q_i$, $R_i$, where $P_i$ means escape through passage $i$, $Q_i$ means die through passage $i$, and $R_i$ means return through passage $i$. The string terminates as soon as we hit $P_i$ or $Q_i$.

So a typical outcome, if $n=4$, might look like $R_1 R_3 R_4 R_2 Q_2$. If we associate $P_i$ with $p_i$ and so on then, by independence, the probability of each outcome is simply the string interpreted as a product: $$\Pr(R_1 R_3 R_4 R_2 Q_2) =r_1 r_3 r_4 r_2 q_2. $$

(It is also possible to have an infinite string of $r$'s, but if $r_i<1$ for each $i$ then the probability of these are all 0.)

Now let us ask all the ways we can escape in at most 2 moves. That's all the strings $P_i$ and all the strings $R_iP_j$. These pair up into the two escape outcomes of strategies $(i,j)$ where $(i,j)$ means first try the $i$th passage then try the $j$th passage. Now the probability of escape with strategy $(i, j)$ is $$\Pr(P_i)+\Pr(R_iP_j) = p_i + r_i p_j. $$

By the same reasoning, the probability of escape with strategy $(i_1, i_2, i_3, i_4, \ldots)$ is $$ p_{i_1} + r_{i_1} p_{i_2} + r_{i_1} r_{i_2} p_{i_3} + r_{i_1} r_{i_2} r_{i_3} p_{i_4} +\cdots, $$ an infinite series. So in general the problem seems a bit intractable to me.

If each passage can only be tried once then you need only consider $(1, 2, 3, \ldots, n)$ and all its permutations. A computer program could compute all those sums and select the maximum. However, as $n$ increases the number of such strategies ($n!$) increases very quickly.

0
On

Make best use of your resources. Or, in other words minimize cost. You only fail to escape if you die, but you have greater chance of dying if you fail to escape: because you must risk your life again.

Try that passage which has greatest positive difference between increase in probability of escape if going by another passage and increase in probability of death if going by another passage. Then try the passage with the next such greatest difference, and so on, assuming you can try every passage once. Or repeat the first passage if many tries are allowed.

This would maximize chance of escape. So this is an iterative solution in general.

0
On

If you only have two passages, your chance of escape is $p_a+r_ap_b$ or $p_b+r_bp_a$.
Use $r_a=1-p_a-q_a$, this is $p_a+p_b-q_ap_b-p_ap_b$ or $p_a+p_b-p_ap_b-q_bp_a$.
So choose passage $A$ if $q_ap_b<p_aq_b$, or $p_a/q_a>p_b/q_b$.
This means that you should choose in decreasing order of $p_i/q_i$.