Optimizing food gained. Two different areas.

81 Views Asked by At

Suppose a food court has two areas. In Area $1$ you will receive food at a rate of $0,2$ food per seconds. In Area $2$ you will receive food at a rate of $1$ food per seconds. While being in Area $2$, you have a probability of getting kicked out of the food court with probability $p > 0$ per seconds. If kicked out you cannot return to any of the areas. Say you have $10$ minutes, and want to optimize the amount of food gained. How do you distribute your time between the two areas? Should you ever switch from one area to the other? When and how often?

The answer depends on $p$.

3

There are 3 best solutions below

5
On

Suppose you decide to stay $n$ seconds in Area 2. Your best strategy is to spend those seconds after staying $600-n$ seconds in Area 1, because after you are being kicked out, you can't go back.

So you will get $0.2\times(600-n$ food from Area 1. Then, in Area 2, you'll get $0$ food with probability $p$ (you are immediately kicked out), $1$ food with probability $qp$ ($q=1-p$, you are kicked after 1 second), ... $k$ food with probability $q^kp$, $n-1$ food with probability $q^{n-1}p$, and $n$ food with probability $q^n$.

So your expected gain with this strategy is : \begin{align}E_n &=120-0.2n+qp+2q^2p+\dots+(n-1)q^{n-1}p+nq^n \\ &= 120-0.2n + pq(1+2q+\dots+(n-1)q^{n-2})+nq^n\end{align} Now $$1+2q+\dots+(n-1)q^{n-2}=(1+q+q^2+\dots+q^{n-1})'=\left(\frac{1-q^n}{1-q}\right)'=\frac{(n-1)q^n-nq^{n-1}+1}{(1-q)^2}$$ So, if my math is right : $$E_n=120-0.2n+\frac{(2n-1)q^{n+1}+1}{1+q}$$ Let's study this function of $n$ : $$E_n'=-0.2+\frac{q^{n+1}}{1+q}(2+\ln q)$$ This derivative cancels when $$q^{n+1}=\frac{0.2(1+q)}{2+\ln q}$$ i.e. $$n=\frac1q\ln\left(\frac{0.2(1+q)}{2+\ln q}\right)$$ Can't go much further, I'm afraid...

1
On

Given s remaining seconds consider the EV of

Stay in area 1: $s*0.2$

Spend 1 second in area 2 then stay in area 1: $(1-p)*(1+(s-1)*0.2)$

We should preference the strategy which maximizes the EV and must reconsider this choice for each passing second.

For example: evaluate at s=600, choose best option, evaluate at s=599, choose best option, evaluate at s=598, choose the best option...

Intuitively we recognize strategy 1 favors larger s while strategy 2 favors smaller s. Because s is always decreasing it follows that should we ever reach a point where strategy 2 is favored it will remain favored for the remainder of the duration.

Both strategies will yield the same EV when

$s*0.2 = (1-p)*(1+(s-1)*0.2)$

$s= 4/p-4$

So, we should stay in area 1 until $4/p-4$ remaining seconds and then switch to area 2.

Notice that if p is sufficiently large we should never switch to area 2. Conversely if p is sufficiently small we might begin (and remain) at area 2 for the entire duration.

0
On

I agree with some of the observations already done, but I disapprove the discrete approach so I'll follow a continous one.

If there is a moment when it is convenient to be in Area $2$, than it won't ever be convenient to go (back) to Area $1$: time spent in Area $1$ will always be more beneficial if spent before any time spent in Area $2$, where you risk being kicked out. Let's call $\Delta t_2$ the time interval that might be convenient to try spending in Area $2$. In such a time interval, Area $1$ would grant a food gain equal to $0.2 \, food/s \cdot \Delta t_2$. I don't like carrying numerical values, so I'll call $g_1 = 0.2 \,food/s$ and $g_2 = 1 \,food/s$, moreover $T = 600 \,s$. Now we have to evaluate the expected food gain in Area $2$ and maximize the difference.

I'm going to call $\tau$ the time spent in Area $2$, that is $\tau = t - \Delta t_1$. In Area $2$ there is a constant kick-out rate (probability per unit time), thus we have an exponential kick-out probability density function $f(\tau) = pe^{-p\tau}$. The expected survival time $E[\tau]$ ia evaluated as follows:

$$ \begin{align} E[\tau]&=\int_0^{\Delta t_2}\tau \cdot p e^{-p\tau} d\tau + \Delta t_2 \cdot \int_{\Delta t_2}^\infty p e^{-p\tau}d\tau=\\ &=\left[\tau(-e^{-p\tau}) \right]_0^{\Delta t_2}-\int_0^{\Delta t_2}(-e^{-p\tau}) d\tau \;-\Delta t_2\cdot \left[e^{-p\tau} \right]_{\Delta t_2}^\infty=\\ &=-\Delta t_2 \, e^{-p\Delta t_2} \;- \left[\frac{1}{p}\cdot e^{-p\tau} \right]_0^{\Delta t_2} \; + \Delta t_2 \, e^{-p\Delta t_2}=\\ &=-\frac{1}{p}\cdot e^{-p\Delta t_2}+\frac{1}{p}=\\ &=\frac{1}{p}\left(1-e^{-p\Delta t_2}\right) \end{align} $$

Let's check for correct limits: $$ \begin{align} p\rightarrow 0 &\Rightarrow E[\tau]\rightarrow \Delta t_2\\ p\rightarrow \infty &\Rightarrow E[\tau]\rightarrow 0\\ \end{align} $$

It's time to assess the food gain increment (note that $G_1$ is not the food gained during the first time interval, but what would be gained staying in Area $1$ during $\Delta t_2$): $$ \begin{align} \Delta G = G_2-G_1 &= g_2 \cdot \frac{1}{p}\left(1-e^{-p\Delta t_2}\right) - g_1 \cdot\Delta t_2\\ \frac{\Delta G\cdot p}{g_2} &= 1-e^{-p\Delta t_2} - \frac{g_1}{g_2} \cdot p\Delta t_2\\ y &= 1 - e^{-x} - \frac{g_1}{g_2} \cdot x\\ \end{align} $$ where the last line follows a substitution of variables to underline the function that has to be maximized. Note that $y(0)=0$ and that $y(x)$ is positive up to a certain value of $x$, so that we are sure to encounter a maximum. So,

$$ y'= e^{-x}-\frac{g_1}{g_2} \\ $$ $$ \begin{align} y'\geqslant 0 \iff& e^{-x}\geqslant\frac{g_1}{g_2}\\ &-x\geqslant \ln \frac{g_1}{g_2}\\ &\;\; x\leqslant \ln \frac{g_2}{g_1}\\ \end{align} $$ Finally, $t^* = T-\Delta t_2 = T-\frac{1}{p} \ln \frac{g_2}{g_1}$ is the ideal time to move to Area $2$, provided that $g_2 \gt g_1$, otherwise it woudn't obviously be convenient to ever switch to Area $2$. At the opposite extreme, if $ \Delta t_2 \gt T$, the best strategy is to spend all the time in Area $2$.

With the actual values of the question, we have $t^* = 600 \, s - 1/p * \ln\frac{1}{0.2}$ and it would be beneficial to start in Area $2$ since the beginning whenever $p\leqslant 2.68 \cdot 10^{-3} \, s^{-1}$.