$P(\text{duck})$ given $P(\text{look})$ and $P(\text{swim})$ and $P(\text{quack})$


"If it looks like a duck, swims like a duck, and quacks like a duck, then it probably is a duck"

We have the 3 probabilities:

$P(\text{look}) = 80\%$
$P(\text{swim}) = 70\%$
$P(\text{quack}) = 90\%$

I would assume then that $P(\text{duck}) = P(\text{look})·P(\text{swim})·P(\text{quack})$ $= 0.8 · 0.7 · 0.9$ $= 0.504$

This is obviously dead wrong: common sense expects the resulting probability to be close to $1$. What would be the right operation on the individual probabilities to represent the original saying correctly?

Update:

To give some context: we have a problem in an industrial environment where we have to validate whether our calculated values are correct. We can conduct a few independent measurements, each of which addresses a different aspect of the same setup, and each measurement returns a yes or no with a probability. We then have to summarize them and compare the result to the theoretical calculations. What we do is duck typing.

Update 2:

While reading the answers, it hit me: the more independent measurements we make on properties of the duck, the more they should prove or disprove the "duckness" of the object. If we had 6 independent measurements (look, swim, quack, fly, eat, walk), each with a probability of around 0.1, our calculation should conclude that this is certainly not a duck. If we got confirming measurements (higher than 0.5), I would expect a higher "duckness" probability.


1
On BEST ANSWER

The updates to the question imply that what you really want to do is classify an object based on independent measurements of the object. The solution to that problem is known as the Naive Bayes classifier. The math is similar to heropup's answer but without the constraint $P(L ~|~ D) = 1$. Instead we start by writing Bayes' theorem in a convenient way:

$$ \frac{P(D ~|~ L)}{P(\neg D ~|~ L)} = \frac{P(D)}{P(\neg D)} \frac{P(L ~|~ D)}{P(L ~|~ \neg D)} $$

This extends easily to multiple independent measurements:

$$ \frac{P(D ~|~ L,S,Q)}{P(\neg D ~|~ L,S,Q)} = \frac{P(D)}{P(\neg D)} \frac{P(L ~|~ D)}{P(L ~|~ \neg D)} \frac{P(S ~|~ D)}{P(S ~|~ \neg D)} \frac{P(Q ~|~ D)}{P(Q ~|~ \neg D)} $$

Now use the first formula to simplify the second:

$$ \frac{P(D ~|~ L,S,Q)}{P(\neg D ~|~ L,S,Q)} = \left(\frac{P(D)}{P(\neg D)}\right)^{-2} \frac{P(D ~|~ L)}{P(\neg D ~|~ L)} \frac{P(D ~|~ S)}{P(\neg D ~|~ S)} \frac{P(D ~|~ Q)}{P(\neg D ~|~ Q)} $$

Plugging in $P(D ~|~ L) = 0.8$ and so on gives:

$$ \frac{P(D ~|~ L,S,Q)}{P(\neg D ~|~ L,S,Q)} = \left(\frac{P(D)}{P(\neg D)}\right)^{-2} \frac{0.8}{1-0.8} \frac{0.7}{1-0.7} \frac{0.9}{1-0.9} $$

Even with the independence assumption, we cannot answer the question without $P(D)$. For simplicity, I will take $P(D) = 0.5$, giving

$$ \frac{P(D ~|~ L,S,Q)}{P(\neg D ~|~ L,S,Q)} = 84 \\ P(D ~|~ L,S,Q) = \frac{84}{84 + 1} = 0.9882 $$

This approach has the intuitive quality that measurements with probability >0.5 will increase your confidence in $D$, measurements with probability <0.5 will decrease your confidence in $D$, and measurements with probability exactly 0.5 will have no effect (since this is the same as the prior probability $P(D)$ that I assumed).
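The odds formula above generalizes to $n$ measurements by replacing the exponent $-2$ with $1-n$. The following is a minimal Python sketch of that calculation, not a definitive implementation. The function name `combine_posteriors` and the default prior of 0.5 are assumptions made for illustration, and the code relies on the same conditional-independence assumption as the derivation.

```python
from math import prod


def combine_posteriors(posteriors, prior=0.5):
    """Combine single-measurement posteriors P(D | M_i) into P(D | M_1, ..., M_n).

    Assumes the measurements are conditionally independent given D and given
    not-D (the Naive Bayes assumption). The prior P(D) is an assumption too.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    if any(not 0.0 < p < 1.0 for p in posteriors):
        raise ValueError("each posterior must lie strictly between 0 and 1")

    prior_odds = prior / (1.0 - prior)
    # Posterior odds = prior_odds**(1 - n) * product of the individual odds.
    odds = prior_odds ** (1 - len(posteriors)) * prod(
        p / (1.0 - p) for p in posteriors
    )
    # Convert odds back to a probability.
    return odds / (1.0 + odds)


# The three measurements from the question: odds of 84, i.e. about 0.9882.
print(combine_posteriors([0.8, 0.7, 0.9]))

# Six measurements at 0.1 each (the scenario in Update 2): about 1.9e-06.
print(combine_posteriors([0.1] * 6))
```

With a prior of 0.5 the prior odds equal 1, so the result is simply the product of the individual odds. A different prior changes the answer, which is why $P(D)$ has to be supplied.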

3
On

Assuming that look, swim, and quack are independent, the required probability is

$$P(L\cup S\cup Q) = P(L)+P(S)+P(Q)-P(L\cap S)-P(S\cap Q)-P(L\cap Q)+P(L\cap S\cap Q) = 0.994,$$

where independence gives $P(S\cap Q) = P(S)\times P(Q)$, and likewise for the other intersections.
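The arithmetic can be checked with a short Python sketch. The function names `union_probability` and `union_via_complement` are hypothetical, and both functions assume the events are independent, as this answer does. The second function uses the standard identity that the union of independent events is the complement of none of them occurring.

```python
from itertools import combinations
from math import prod


def union_probability(probs):
    """Inclusion-exclusion for independent events with the given probabilities."""
    total = 0.0
    for k in range(1, len(probs) + 1):
        sign = 1 if k % 2 == 1 else -1
        # Under independence, each k-way intersection is a product of probabilities.
        total += sign * sum(prod(c) for c in combinations(probs, k))
    return total


def union_via_complement(probs):
    """Same quantity, computed as 1 minus the probability that no event occurs."""
    return 1.0 - prod(1.0 - p for p in probs)


probs = [0.8, 0.7, 0.9]
print(union_probability(probs))     # about 0.994
print(union_via_complement(probs))  # about 0.994
```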

6
On

Suppose our probabilities are independent (which is arguable), and suppose we have an object that looks, swims, and quacks like a duck.

What is the probability that this object is not a duck?

$(1-0.8)(1-0.7)(1-0.9) = 0.006$

This means that the probability that it is a duck is $1-0.006 = 0.994$.

1
On

The confusion, both in your question and in several of the responses, arises from improper notation for, and definitions of, the relevant events.

We first have to define appropriate conditional probabilities. Let $L$, $S$, and $Q$ represent the events that a randomly selected object looks, swims, and quacks like a duck, respectively. Let $D$ represent the event that a randomly selected object is in fact a duck.

Then the given probabilities are properly expressed as

$$\Pr[D \mid L] = 0.8 \\ \Pr[D \mid S] = 0.7 \\ \Pr[D \mid Q] = 0.9.$$

We will also impose another set of assumptions, namely that $$\Pr[L \mid D] = \Pr[S \mid D] = \Pr[Q \mid D] = 1,$$ which means a duck, with certainty, looks, swims, and quacks like a duck. Hence $$0.8 = \Pr[D \mid L] = \frac{\Pr[L \mid D]\Pr[D]}{\Pr[L]} = \frac{\Pr[D]}{\Pr[L]},$$ or $\Pr[D] = 0.8 \Pr[L]$. Similarly, $$\Pr[D] = 0.7 \Pr[S], \\ \Pr[D] = 0.9 \Pr[Q].$$ Consequently, assuming $L$, $S$, and $Q$ are independent events, $$\Pr[L \cap S \cap Q] = \frac{\Pr[D]^3}{(0.8)(0.7)(0.9)}.$$ This implies $$0 \le \Pr[D] \le \frac{\sqrt[3]{63}}{5}.$$

Now, you want to compute $$\Pr[D \mid (L \cap S \cap Q)] = \frac{\Pr[(L \cap S \cap Q) \mid D]\Pr[D]}{\Pr[L \cap S \cap Q]}.$$ We note that $\Pr[(L \cap S \cap Q) \mid D] = 1$, since given $D$, each of $L$, $S$, and $Q$ is a certain event. So we obtain $$\Pr[D \mid (L \cap S \cap Q)] = \frac{(0.8)(0.7)(0.9)}{\Pr[D]^2}.$$ We also note this implies stricter bounds on $\Pr[D]$, namely $$\frac{3}{5} \sqrt{\frac{7}{5}} \le \Pr[D] \le \frac{\sqrt[3]{63}}{5}.$$ This is the best you can do with the given information. It implies that the unconditional probability that a randomly selected object is a duck is at least $0.70993$ and no more than $0.795811$, and the probability that an object that looks, swims, and quacks like a duck is actually a duck ranges from $0.795811$ to $1$, depending on the underlying prevalence of ducks.

However, the fact that $\Pr[D] = 0.7 \Pr[S]$ means that $\Pr[D] \le 0.7$, so we obtain a logical contradiction: it is not possible for $L, S, Q$ to be independent events. Therefore, one must have some kind of positive correlation between these, in which case the permissible range of $\Pr[D]$ expands. However, without a quantitative characterization of this correlation, the question without the independence assumption becomes even more under-specified than it already is.
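The bounds and the contradiction can be reproduced numerically. The following is a sketch under the assumptions of this answer ($\Pr[L \mid D] = \Pr[S \mid D] = \Pr[Q \mid D] = 1$ and independent $L$, $S$, $Q$); the function name `duck_prior_bounds` is hypothetical, and the exponents $1/(n-1)$ and $1/n$ are the natural generalization of the square root and cube root above to $n$ measurements.

```python
from math import prod


def duck_prior_bounds(posteriors):
    """Return (lower, upper, cap) for Pr[D] under the answer's assumptions.

    lower: from Pr[D | all measurements] <= 1
    upper: from Pr[all measurements] <= 1 (uses independence)
    cap:   from Pr[D] = p_i * Pr[M_i] <= p_i for every measurement
    """
    n = len(posteriors)
    if n < 2:
        raise ValueError("need at least two measurements")
    c = prod(posteriors)          # 0.504 for the question's numbers
    lower = c ** (1.0 / (n - 1))  # sqrt(0.504) when n = 3
    upper = c ** (1.0 / n)        # cube root of 0.504 when n = 3
    cap = min(posteriors)
    return lower, upper, cap


lower, upper, cap = duck_prior_bounds([0.8, 0.7, 0.9])
print(lower, upper, cap)
# The lower bound exceeds the cap, so independence cannot hold here.
print("contradiction:", lower > cap)
```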