The system I am trying to model is as follows:
- A program takes time $T$ to run
- The computer crashes on average every $\text{MTBF}$ time units
- The program is repeatedly run until it completes without a crash
- What is the expected time to completion?
I tried to calculate this by working out how many times the program is expected to fail before it completes:
If failures are modelled by a Poisson process of rate $\frac{1}{\text{MTBF}}$, then the probability of no failure occurring during a run of the program is $p = e^{-\frac{T}{\text{MTBF}}}$.
Therefore the number of failures before the first success follows a Geometric distribution, and hence the expected number of failures is $e^{\frac{T}{\text{MTBF}}}-1$ (by the standard result that the expectation is $\frac{1-p}{p}$).
Therefore the expected time to completion is given by $(e^{\frac{T}{\text{MTBF}}}-1)\cdot t+T$, where $t$ is the expected length of each failed run. But I'm not sure how to calculate this value $t$. Is it just the expected time between events in the Poisson process, conditioned on this time being $<T$?
Is there a more direct way of working out the expected time until the first gap of length at least $T$ between two consecutive events of a Poisson process?
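For reference, if $t$ is taken to be that conditional mean, it can be worked out directly. This is a sketch under the assumption that the time $X$ to the next crash is exponential with rate $\lambda = \frac{1}{\text{MTBF}}$ and that every restart begins a fresh, independent run (the symbols $X$ and $\lambda$ are introduced here for convenience):

$$
\begin{aligned}
t = E[X \mid X < T] &= \frac{\int_0^T x\,\lambda e^{-\lambda x}\,dx}{1 - e^{-\lambda T}} \\
&= \frac{\frac{1}{\lambda}\left(1 - e^{-\lambda T}\right) - T e^{-\lambda T}}{1 - e^{-\lambda T}} \\
&= \frac{1}{\lambda} - \frac{T}{e^{\lambda T} - 1}
\end{aligned}
$$

Substituting this into the expression for the expected time to completion gives

$$
\left(e^{\lambda T} - 1\right) t + T = \frac{e^{\lambda T} - 1}{\lambda} - T + T = \text{MTBF}\left(e^{\frac{T}{\text{MTBF}}} - 1\right)
$$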
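You almost have the value of t, but it is not simply the MTBF you state in bullet #2 of your premise.
The expected length of each failed run is how long (on average) a run lasts given that it crashes before finishing. A failed run can never last longer than T, so t must be smaller than both T and MTBF.
So yes: t is the expected time between events conditioned on that time being less than T, i.e. $t = E[X \mid X < T]$, where $X$ is the time to the next crash.
For the number of crashes in a window of length T, the Poisson distribution has parameter λ = T/MTBF. This is a mean count, not a time rate; the rate of the process itself is 1/MTBF.
The expected number of crashes during an uninterrupted time T is therefore T/MTBF.
I was initially tempted to use 1/λ (i.e. MTBF/T) as an "inter-crash" quantity, but this doesn't apply in your case.
Since you start the program over after a crash, you reset the timer on the Poisson process. Every run then sees a fresh, independent time to the next crash, which is what justifies the Geometric distribution for the number of failed runs.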
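The value of t can also be checked numerically. The following is a minimal Monte Carlo sketch, assuming exponential crash times with mean MTBF and a restart from scratch after every crash. The function names (`simulate_completion`, `estimate`, `closed_form_total`, `conditional_mean_failed`) and the sample values `T = 2.0`, `mtbf = 3.0` are illustrative choices, not part of the question.

```python
import math
import random


def simulate_completion(T, mtbf, rng):
    """Run one job to completion; return (total_time, failed_run_lengths)."""
    total = 0.0
    failed = []
    while True:
        # Assumption: time to the next crash is exponential with mean mtbf.
        x = rng.expovariate(1.0 / mtbf)
        if x >= T:
            # No crash within T: this run succeeds and takes exactly T.
            return total + T, failed
        # Crash before completion: the time is lost and the run restarts.
        total += x
        failed.append(x)


def estimate(T, mtbf, n_trials=200_000, seed=0):
    """Estimate mean completion time and mean failed-run length."""
    rng = random.Random(seed)
    total_sum = 0.0
    failed_sum = 0.0
    failed_count = 0
    for _ in range(n_trials):
        total, failed = simulate_completion(T, mtbf, rng)
        total_sum += total
        failed_sum += sum(failed)
        failed_count += len(failed)
    mean_total = total_sum / n_trials
    mean_failed = failed_sum / failed_count if failed_count else float("nan")
    return mean_total, mean_failed


def closed_form_total(T, mtbf):
    """MTBF * (exp(T / MTBF) - 1), valid under the assumptions above."""
    return mtbf * math.expm1(T / mtbf)


def conditional_mean_failed(T, mtbf):
    """E[X | X < T] for an exponential X with mean mtbf."""
    return mtbf - T / math.expm1(T / mtbf)


if __name__ == "__main__":
    T, mtbf = 2.0, 3.0  # illustrative values only
    mean_total, mean_failed = estimate(T, mtbf)
    print("simulated mean completion time:", mean_total)
    print("closed-form completion time:   ", closed_form_total(T, mtbf))
    print("simulated mean failed run:     ", mean_failed)
    print("conditional mean E[X | X < T]: ", conditional_mean_failed(T, mtbf))
    print("MTBF for comparison:           ", mtbf)
```

A failed run is by definition a crash time shorter than T, so its sample mean cannot exceed T. Comparing it with MTBF and with the conditional mean is a direct test of which value of t fits the model.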