5 friends have come up to me and asserted that "Fred is coming to visit tomorrow". The more people I hear it from, the more I believe it to be true. How do I model this probabilistically?
I think I want P(True | Friend 1 says so AND Friend 2 says so AND ...)
Assumptions:
- No collusion or influence between friends
- I can assign P(True | Friend X says so) from past evidence (e.g., how much I trust a friend's assertions). So I have precision data per friend, but no notion of recall.
- The prior probability of the assertion being true is extremely low. P(True)~0.0
Is Bayesian inference the way to go? Something simpler? Is this underspecified somehow?
Update:
Thank you all for your help. I'll sleep on this. But I think I haven't fully conveyed what I'm struggling with. My point about the prior being low may have been misguided. Let me try asking it this way…
Jane comes to me and says "watermelons always have an even number of seeds". I have no basis for evaluating this assertion other than knowing that Jane is right about stuff 80% of the time. So it seems that at this point my belief in the watermelon assertion would be 0.8.
Now if Sarah (who is right 70% of the time) comes to me and makes the same watermelon assertion, I would think my belief should now be something higher than 0.8, without knowing anything else. It seems like I got some confirmation that should increase my belief. If I'm right, what should my belief in the watermelon assertion be now?
My gut says it should be something like: 1.0 - [(1.0 - 0.8) * (1.0 - 0.7)] = 0.94
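As a minimal sketch, that gut-feel rule can be written in Python. It is only the heuristic above, not a derived result; it assumes each friend is wrong independently, and the function name `gut_combined_belief` is purely illustrative:

```python
from math import prod

def gut_combined_belief(reliabilities):
    """Heuristic only: 1 minus the product of each friend's error rate.

    Assumes (without justification) that the friends are wrong
    independently of one another.
    """
    return 1.0 - prod(1.0 - r for r in reliabilities)

# Jane is right 80% of the time, Sarah 70% of the time.
belief = gut_combined_belief([0.8, 0.7])
print(round(belief, 2))  # 0.94
```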
Let's do this for just two friends. Call the event "Fred is going to visit tomorrow" $T$, the event friend 1 says so $A$, the event friend 2 says so $B$, and use the notation $\cap$ for "and."
Then by Bayes' theorem $P(T|A\cap B)=P(A\cap B|T)P(T)/P(A\cap B)$. Now $P(A\cap B|T)=P(A|T\cap B)P(B|T)$, and by the independence assumption $P(A|T\cap B)=P(A|T)$ (the friends are independent given $T$) and $P(A\cap B)=P(A)P(B)$ (the friends are independent unconditionally), so $P(T|A\cap B)=\frac{P(A|T)P(B|T)P(T)}{P(A)P(B)}$. Now apply Bayes' theorem in the other direction: $P(A|T)=P(T|A)P(A)/P(T)$ and similarly for $B$, so $P(T|A\cap B)=\frac{P(T|A)P(T|B)P(A)P(B)}{P(T)P(A)P(B)}=\frac{P(T|A)P(T|B)}{P(T)}$.
More generally, the same computation shows $$P(T|A_1\cap A_2\cap\cdots\cap A_n)=\frac{P(T|A_1)P(T|A_2)\cdots P(T|A_n)}{P(T)^{n-1}}$$
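For a numerical feel, here is a small Python sketch of the general formula. It assumes the same independence conditions as the derivation; the function name `combined_posterior` and the example numbers are illustrative, not from the question. A result above 1 signals that the inputs are inconsistent with those independence assumptions.

```python
from math import prod

def combined_posterior(single_posteriors, prior):
    """P(T | A_1 and ... and A_n) = prod(P(T | A_i)) / P(T)**(n - 1).

    Valid only under the independence assumptions of the derivation.
    The raw value is returned unclamped, so it can exceed 1 when the
    inputs violate those assumptions.
    """
    if not 0.0 < prior <= 1.0:
        raise ValueError("prior must be in (0, 1]")
    n = len(single_posteriors)
    return prod(single_posteriors) / prior ** (n - 1)

# Two friends, each with P(T | A_i) = 0.6, and a prior P(T) = 0.5.
print(combined_posterior([0.6, 0.6], prior=0.5))  # approximately 0.72

# The watermelon numbers with an assumed prior of 0.5 give a value above 1.
print(combined_posterior([0.8, 0.7], prior=0.5))  # approximately 1.12
```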
This formula gives quantitative support for Hagen's point that your friends cannot simultaneously be expert predictors of implausible phenomena and make independent predictions. Indeed, the left-hand side is a probability and so is at most 1; if $P(T)=\epsilon$, the product of the $P(T|A_i)$ is therefore at most $\epsilon^{n-1}$, and their geometric mean can be no more than $\epsilon^{(n-1)/n}$. You might protest that you only insisted on no collusion between friends, but probability doesn't differentiate (at least not directly) between different causes of a correlation: if the $P(T|A_i)$ are high, then the friends' opinions are correlated, and you can't update on each piece of information independently, regardless of whether the friends have ever met. How exactly you should update when the friends are dependent is, indeed, underspecified.
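To see how restrictive the bound above is, the following Python sketch evaluates $\epsilon^{(n-1)/n}$ for a chosen prior and number of friends. The function name `max_geometric_mean` and the example values ($\epsilon=0.01$, $n=5$) are assumptions made for illustration.

```python
def max_geometric_mean(prior, n):
    """Upper bound on the geometric mean of the P(T | A_i).

    Follows from requiring prod(P(T | A_i)) / prior**(n - 1) <= 1,
    under the independence assumptions used in the derivation.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return prior ** ((n - 1) / n)

# With a prior of 0.01 and five independent friends, the friends'
# individual P(T | A_i) can average (geometrically) only about 0.025.
print(round(max_geometric_mean(0.01, 5), 4))  # 0.0251
```

So with a prior of 1% and five truly independent friends, each friend's word alone could raise your belief to only a few percent on average, far from the high per-friend trust the question assumes.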