Expected duration of completing a task experiencing exponential distributed failures

288 Views Asked by At

Let's say we have a task whose duration without any restarts is T. And also assume the task faces exponentially distributed restarts/failures with rate of $\lambda$. After each restart it has to wait for a duration of $R$ and then start again from the beginning.

Now I want to compute the expected duration of the task considering failures. Here I found an approach which estimates the expected duration of completing the task as follows:

$E(T) = (1/\lambda)(1 - (1+T\lambda)e^{-T\lambda})(e^{T\lambda}-1)+T$

But this approach does not involve the delay $R$ after each restart. I thought maybe the most naive solution would be to simply multiply the expected number of restarts during $E(T)$ by $R$ which delivers the expected total delay duration until completion as follows:

$total$ $duration = E(T) + E(T)\lambda R$

But I believe this is not a good estimation because the probability of occurring failures during $R$ is ignored in this estimation.

Has anyone any ideas how to compute the expected completion duration?

2

There are 2 best solutions below

8
On BEST ANSWER

Using notation from this answer (see Iverson bracket), we can write the wait $W$ before the task is completed as

$$ W = \sum_{i=1}^{+\infty}\,D_i\cdot[D_1\leqslant T, D_2\leqslant T+R, \ldots,D_i\leqslant T+R], $$

where the $D_i$ are i.i.d. with exponential distribution of parameter $\lambda$ and represent the gap in time between failure $i-1$ and failure $i$ $($with failure $0$ being the very beginning, of course$)$. By independence, we have that for $i\geq 2$:

$$\mathbb E(D_i\,|\,D_1\leqslant T, D_2\leqslant T+R, \ldots,D_i\leqslant T+R) = ab^{i-2}c$$

where

\begin{align} &a = \mathbb P(D\leqslant T) = 1 - e^{-\lambda T}\\ &b = \mathbb P(D\leqslant T+R) = 1 - e^{-\lambda (T+R)}\\ &c = \mathbb E(D\,|\,D\leqslant T+R) = \int_0^{T+R}\,x\lambda e^{-\lambda x}\,dx = \frac{1-e^{-\lambda (T+R)}\big(1+\lambda (T+R) \big)}{\lambda} \end{align}

For $i=1$, we have

$$d:= E(D_1\,|\,D_1\leqslant T) = \int_0^{T}\,x\lambda e^{-\lambda x}\,dx = \frac{1-e^{-\lambda T}\big(1+\lambda T \big)}{\lambda}. $$

Hence

\begin{align} \mathbb E(W) &= d + ac\,\sum_{j\geq 0}\,b^j\\ &= d + \frac{ac}{1-b} \end{align}

If my calculations are correct, this yields

\begin{align} \mathbb E(W) &= \frac{e^{\lambda(T+R)}+\lambda Re^{-\lambda T}-\lambda(T+R)-e^{\lambda R}}\lambda \\&= \frac{e^{\lambda R}\left(e^{\lambda T}-1\right)}{\lambda} - T - R\left(1-e^{-\lambda T}\right), \end{align}

and notice that if $R=0$ we recover the original answer.

Now, besides the wait $W$, we also need an additional time $X$ for actually completing the task. If no restarts occurred, then this additional time is simply $T$, otherwise, we need $T+R$. In other words:

$$X = T + R\cdot[D_1\leqslant T]$$

with expectation

$$\mathbb E(X) = T + R\cdot \mathbb P(D_1\leqslant T) = T+Ra = T+R\left(1-e^{-\lambda T}\right).$$

Therefore, the total time needed for completing the task simplifies to

$$\mathbb E(W+X) = \frac{e^{\lambda R}\left(e^{\lambda T}-1\right)}{\lambda}$$

Does this match your simulations? \begin{equation} %------------------ % %>**The answer below is incorrect; see the comments for clarifications.** % %Let $N$ denote the number of restarts and $X$ the time between restarts. %Let $\lambda$ be the (constant) rate at which restarts occur. %Then $N$ has Poisson distribution and $X$ has exponential distribution, both %with parameter $\lambda$, and the expected time taken is % %$$\sum_{n\geq 0}\, %\underbrace{\mathbb P(N=n)}_{\text{exactly $n$ restarts occur}} %\cdot %\underbrace{\Big(T + n\left(R + \mathbb E(X)\right)\Big).}_{\text{average time %taken to complete in exactly $n$ restarts}}$$ % %Now, $\mathbb E(X) = \frac1\lambda$ and $\mathbb P(N=n) = e^{-%\lambda}\,\frac{\lambda^n}{n!}$. %Hence the expected time taken is % %$$\sum_{n\geq 0}\,e^{-\lambda}\,\frac{\lambda^n}{n!} %\cdot %\left(T + n\left(R+\frac1\lambda\right)\right).$$ % %We have % %$$Te^{-\lambda}\,\sum_{n\geq 0}\,\,\frac{\lambda^n}{n!} %=Te^{-\lambda}\,e^\lambda = T,$$ % %indicating of course that the task takes *at least* $T$ time to complete. %The time corresponding to the restarts is % %\begin{align} %e^{-\lambda}\,\sum_{n\geq 0}\,\frac{\lambda^n}{n!} %\cdot %n\left(R+\frac1\lambda\right) %&= %e^{-\lambda}\,\sum_{n\geq 1}\,\frac{\lambda^n}{n!} %\cdot %n\left(R+\frac1\lambda\right) %\\&= %e^{-\lambda}\,\sum_{n\geq 1}\,\frac{\lambda^n}{n!} %\cdot %n\,\left(\frac{R\lambda+1}{\lambda}\right) %\\&= %e^{-\lambda}(R\lambda+1)\, %\sum_{n\geq 1}\,\frac{\lambda^{n-1}}{(n-1)!} %\\&= %e^{-\lambda}(R\lambda+1)\, %\underbrace{\sum_{n\geq 0}\,\frac{\lambda^{n}}{n!}}_{e^{\lambda}} % = R\lambda + 1 %\end{align} % %which should gives us an expected time of % %$$T+R\lambda +1.$$ \end{equation}

0
On

I found another stack post where the expected waiting time to have a minimum inter-arrival time of $T$ between failures is defined as follows:

$W(T) = \frac{eλT−1−λT}{λ}$

This means $W(T)+T$ should deliver the expected duration of completing the task (without considering delay $R$). This opposes the calculations done here (also mentioned in the question). However my simulations show that $W(T)+T$ delivers the correct answer.

Now to change it in order to consider delay time $R$ I have done calculations as follows:

$E(x) = W(x) + x$ $\texttt{ }\texttt{ }\texttt{ }\texttt{as described above}$

$P(t > x) = e^{-\lambda x}$ $\texttt{ }\texttt{ }\texttt{ }\texttt{ the probability of having an inter-arrival}$

$\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ time greater than x (The first failure occurs}$

$\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ after completion)}$

$E(T, R) = P(t > T)T + (1 - P(t > T))E(T+R)$

Which delivers the probability of completing the task without having to face any restarts (and thus having to wait extra delay $R$) multiplied by $T$ (task duration without restart delay $R$) plus the probability of facing at-least one restart and having to wait for a minimum inter-arrival time of $T+R$.

The simulations show that this is a relative good estimation.