I have two hard drives. The given Mean Time to Failure is $100$ hours.
So if I have one hard drive, the average amount of time until it fails will be $100$ hours.
However, according to my textbook, if I have two hard drives the average amount of time until one fails is $100/2 = 50$ hours.
I don't understand this conceptually. Why do we divide like this? Both hard drives are identical, so the second one will also fail on average after $100$ hours, sometimes a bit more, sometimes a bit less. So it would seem that the average time until failure should still be $100$ hours.
However, I also understand that as you add more hard drives, you are increasing the odds of something going wrong. Obviously you can't IMPROVE the average time until first failure by ADDING hard drives. So it makes sense from that perspective why more hard drives mean less average time until failure.
Yet I'm still confused as to why the formula is $(\text{average time for one to fail}) \div (\text{number of drives})$.
Why is this formula correct?
(Keywords: Poisson process; exponential distribution)
The basic idea is that you have events (failures) arriving on each drive at a mean rate of 1 per 100 hours.
If a machine has two drives, then the events are arriving on the machine at a mean rate of 2 per 100 hours.
The mean time until one event arrives on the machine is thus 50 hours.
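The rate argument above can be checked numerically. The following is a minimal simulation sketch, assuming each drive's lifetime is exponentially distributed with mean 100 hours and that the drives fail independently. The function name `mean_time_to_first_failure` and its parameters are made up for illustration.

```python
import random


def mean_time_to_first_failure(n_drives, mttf_hours=100.0, trials=200_000, seed=0):
    """Estimate the average time until the first of n_drives fails.

    Assumes independent, exponentially distributed lifetimes with the
    given mean (an assumption of this sketch).
    """
    rng = random.Random(seed)
    rate = 1.0 / mttf_hours  # failures per hour for a single drive
    total = 0.0
    for _ in range(trials):
        # The machine's first failure is the earliest of the drive failures.
        total += min(rng.expovariate(rate) for _ in range(n_drives))
    return total / trials


if __name__ == "__main__":
    for n in (1, 2, 4):
        print(n, "drive(s):", round(mean_time_to_first_failure(n), 1), "hours")
```

The estimates should come out close to $100/n$ hours, so roughly 100, 50 and 25 for one, two and four drives.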
Time until failure (a decay process) is usually modelled with an exponential distribution.
Thus the probability density function for the failure time $T_n$ (in hours) of drive $n$ is: $$f_{n}(t_n) = 0.01 e^{-0.01 t_n}, \quad t_n \geq 0$$
So, assuming the drives fail independently, the expected time until the first failure among drives 1 and 2 is: $$\begin{align} \mathsf E(\min(T_1,T_2)) & = \int_0^\infty\int_0^\infty \min(t_1,t_2) f_1(t_1)f_2(t_2)\operatorname d t_1 \operatorname d t_2 \\ & = \int_0^\infty \left(\int_0^{y} x f_1(x)\operatorname d x + \int_{y}^\infty y f_1(x)\operatorname d x \right)f_2(y)\operatorname d y \\ & = \frac {1}{10^4} \int_0^\infty \left(\int_0^{y} x e^{-0.01 x}\operatorname d x + y \int_{y}^\infty e^{-0.01 x}\operatorname d x \right)e^{-0.01 y}\operatorname d y \\ & = 50 \end{align}$$
Alternatively, since the failure times are strictly positive random variables, the expectation is the integral of the survival function, and independence lets the joint survival probability factor: $$\begin{align}\mathsf E(\min(T_1,T_2)) & = \int_0^\infty \mathsf P(\min(T_1, T_2)> t)\operatorname d t \\ & = \int_0^\infty \mathsf P(T_1> t)\cdot \mathsf P(T_2> t)\operatorname d t \\ & = \int_0^\infty e^{-t/100}\cdot e^{-t/100}\operatorname d t \\ & = \int_0^\infty e^{-t/50}\operatorname d t \\ & = 50 \end{align}$$
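The same survival-function argument extends to $n$ drives, which is where the "divide by the number of drives" rule comes from. This is a sketch under the same assumptions as above (independent drives, each exponential with mean $100$ hours); the symbol $n$ for the number of drives is introduced here for illustration: $$\begin{align}\mathsf E(\min(T_1,\ldots,T_n)) & = \int_0^\infty \mathsf P(T_1> t)\cdots \mathsf P(T_n> t)\operatorname d t \\ & = \int_0^\infty \left(e^{-t/100}\right)^n\operatorname d t \\ & = \int_0^\infty e^{-nt/100}\operatorname d t \\ & = \frac{100}{n} \end{align}$$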
So we can see that it has to do with the shape of the exponential density function, which is denser for shorter times than for longer ones (and is not symmetric around the mean). The distribution of the minimum of two such independent random variables is even more concentrated at shorter times: by the calculation above, $\mathsf P(\min(T_1,T_2)>t) = e^{-t/50}$, so the minimum is itself exponentially distributed with mean $50$.
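To illustrate that the exact factor of $1/2$ depends on the exponential shape, here is a small comparison sketch. The uniform lifetime on $[0, 200]$ hours is a hypothetical contrast model chosen only because it also has mean 100. It is not from the textbook. For that model the minimum of two independent lifetimes has mean $200/3 \approx 66.7$ hours rather than 50. All function names below are made up for illustration.

```python
import random


def mean_min_of_two(sample, trials=200_000, seed=1):
    """Estimate E[min(T1, T2)] for two independent lifetimes drawn by `sample`."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += min(sample(rng), sample(rng))
    return total / trials


def exponential_lifetime(rng):
    # Exponential lifetime with mean 100 hours (the model used in the answer).
    return rng.expovariate(1.0 / 100.0)


def uniform_lifetime(rng):
    # Hypothetical contrast model: uniform on [0, 200] hours, also mean 100.
    return rng.uniform(0.0, 200.0)


if __name__ == "__main__":
    print("exponential:", round(mean_min_of_two(exponential_lifetime), 1))
    print("uniform:    ", round(mean_min_of_two(uniform_lifetime), 1))
```

Both models agree that adding a drive shortens the expected time to first failure, but only the exponential one gives exactly half.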