Basic math: confusion working out the chance of failure


While I am quite technical, I am not a mathematician. So please be forgiving if this question seems overly simple. My probability abilities left me long ago.

I have a problem which I was arrogant enough to think was simple until I did a few calculations that got me confused.

Let's say we have 2 systems, System A (SA) and System B (SB), on which our overall system relies.

A typical real-world example would be deciding whether you want your user to download a library from a Google server (where it is more likely to already be cached on the user's computer) or to download the same library from your own server hosted in the cloud by Amazon (on which all your other files reside anyway). This library is obviously something your own system depends on. Now you want to work out the risk of failure given that your libraries are split across 2 places instead of one. However, the real-world case is not relevant; I just use it for context so you know where I came across the problem.

Assumption 1: The overall risk of failure if:

SA = 5% and SB = 3% = overall = 15%?

I am assuming this is correct, if not please let me know as this simple math is the basis for my question. Essentially, it is a multiplier, not a divisor or addition?

With assumption 1 assumed to be correct for now, my question is:

As we get closer to either number being 1, the risk of failure grows smaller but is still bigger than either one as long as both are above 1:

SA = 1% x SB = 3% = Overall = 3%

This seems simple, but it's based on assumption 1 being correct.

But once we go below 1 on any or both systems, the risk of failure actually goes down by multiplication, not up:

SA = 0.5% x SB = 0.5% = overall = 0.25%

This doesn't seem right to me; I would have thought it would be 2.5%. I believed the combined risk of failure should be greater than that of either system alone, but this math shows otherwise.

Is there some simple rule for working this kind of thing out consistently for numbers both greater and less than 1 which always give the larger result?

If we change to a divisor, the inverse happens:

SA = 5% / SB = 3% = Overall = 1.67% (smaller than both)
SA = 0.5% / SB = 0.5% = Overall = 1% (bigger than both)

This also introduces the potential for division by zero, as there is a possibility (however remote) that a system will never fail, and that system could in theory be SB. This, along with the number inversion, is why this approach doesn't feel right.

Or should I be using addition?

SA = 5% + SB = 3% = Overall = 8%
SA = 0.5% + SB = 0.5% = Overall = 1%

While addition consistently gives me larger numbers, I am unsure if this is correct for my circumstances as the last answer doesn't come close to the 2.5% expected, hence my uncertainty on this approach too.

Subtraction would always give a smaller result so I didn't even consider that.

Another reason I am asking is that my google-fu turns up all kinds of crazy formulas for assessing risk of failure in engineering systems, but even my college probability course didn't use formulas like that, so Google seems to be failing me.

I'd appreciate any help.


There are 3 best solutions below

1
On

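If the overall failure occurs only if SA AND SB (both) fail, then you should multiply the probabilities (assuming the two failures are independent). If the overall failure occurs if SA OR SB (at least one) fails, then you should add the probabilities and subtract the probability that both fail; when the individual probabilities are small, the plain sum is a close approximation.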

1
On

For independent events $A, B$,

$$ P(A\text{ and }B) = P(A) P(B). $$

For any two events,

$$ P(A\text{ or }B) = P(A) + P(B) - P(A\text{ and }B). $$

This can be derived either by considering a Euler-Venn diagram or as mentioned in a comment by @calculus.

Then in your example

$$ P(A\text{ or }B) = P(A) + P(B) - P(A) P(B) = 0.05 + 0.03 - 0.05 \cdot 0.03 = 0.0785 = 7.85\%. $$
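The same formula can be applied to the $0.5\%$ case from the question. This is only a worked sketch, and it assumes the two failures are independent with $P(A) = P(B) = 0.005$:

$$ P(A\text{ or }B) = 0.005 + 0.005 - 0.005 \cdot 0.005 = 0.01 - 0.000025 = 0.009975 \approx 1\%. $$

So plain addition ($1\%$) is very close to the exact value here, because the correction term $P(A)P(B)$ is tiny when both probabilities are small. Neither $0.25\%$ nor $2.5\%$ is the chance that at least one system fails.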

1
On

First of all, never do any probability calculations with percentages. A percentage is just a nice way of saying certain decimal fractions; that is, we say $15\%$ but the actual number is $0.15$.

If $S_A$ has a $5\%$ probability of failing, that's a probability of $0.05$. If $S_B$ has a $3\%$ probability of failing, that's a probability of $0.03$. If the overall system fails only when both $S_A$ and $S_B$ fail (and the two failures are independent), then the overall probability of failure is correctly computed by multiplying the individual probabilities, but it looks like this:

$$ 0.05 \times 0.03 = 0.0015.$$

That is, the probability that both $S_A$ and $S_B$ fail is $0.15\%$.

In general, if you have redundant subsystems, it will decrease the probability of failure of the overall system. (This assumes the subsystems really are redundant, that is, the overall system can function correctly even when one of the individual subsystems fails.)

If both subsystems are needed by the overall system, that is, if the overall system will fail if just one of the subsystems fails, then the thing to do is to assess the probability that the system will not fail. In this case you have a $1 - 0.05 = 0.95$ probability that $S_A$ will not fail, and a $1 - 0.03 = 0.97$ probability that $S_B$ will not fail. The probability that neither system fails (that is, $S_A$ does not fail and $S_B$ does not fail) is

$$ 0.95 \times 0.97 = 0.9215, $$

so the probability that the system fails is $1 - 0.9215 = 0.0785$, which is $7.85\%$. That is, when you have a system of multiple components that cannot tolerate the failure of any single component, the probability of system failure is greater than that of any subsystem.
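If it helps to experiment with other numbers, the two cases above can be sketched in a few lines of Python. This is only an illustration: the function names `series_failure` and `redundant_failure` are made up for this example, and both assume the component failures are independent, which may not hold for real servers.

```python
from math import prod


def series_failure(probs):
    """Probability that the system fails when EVERY component is needed.

    Assumes independent failures: the system survives only if all
    components survive, so we multiply the survival probabilities
    and take the complement.
    """
    return 1 - prod(1 - p for p in probs)


def redundant_failure(probs):
    """Probability that the system fails only when ALL components fail.

    Assumes independent failures, so the probabilities simply multiply.
    """
    return prod(probs)


if __name__ == "__main__":
    # Probabilities are decimals, not percentages: 5% -> 0.05, 3% -> 0.03.
    print("both needed:", round(series_failure([0.05, 0.03]), 6))
    print("redundant:  ", round(redundant_failure([0.05, 0.03]), 6))
    print("0.5% each, both needed:", round(series_failure([0.005, 0.005]), 6))
```

With the numbers from the question, the first call should give roughly $0.0785$ and the second roughly $0.0015$, matching the hand calculations. Because both functions take a list, the same sketch extends to three or more components.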