The following question comes from the probability section of the Titan Test*.
* I will sidestep the debate over whether this test accurately measures what it aims to, or whether such aims are meaningful. The test does seem to contain some interesting mathematical puzzles, and that's what counts!
Suppose you are truthfully told that ten marbles were inserted into a box, all of them identical except that their colors were determined by the toss of an unbiased coin. When heads came up, a white marble was inserted, and when tails came up, a black one. You reach into the box, draw out a marble, inspect its color, then return it to the box. You shake the box to mix the marbles randomly, and then reach in and again select a marble at random. If you inspect ten marbles in succession in this manner and all turn out to be white, what is the probability [likelihood] to the nearest whole percent that all ten marbles in the box are white?
I am not sure how to approach this through the framework of frequentist probability. It looks like a Bayesian approach is warranted.
The crucial starting point: what prior distribution should be used?
My first instinct would be to base the prior on a binomial distribution, as we are told that the box was filled through a Bernoulli process. Articles such as http://www.amstat.org/publications/jse/v12n2/zhu.pdf address cases where "there is no strong prior opinion on what p is"; that does not appear to apply here.
So define
$b_n = {\displaystyle{\binom{10}{n}}}\dfrac{1}{2^{10}} \tag{1}$
for the prior distribution of black marbles.
Let:
- $n$ be the number of black marbles in the box
- $A_{0}$ be the event that 10 white marbles were drawn
- $A_{0,n}$ be the event that 10 white marbles were drawn and the box contained $n$ black marbles
- $B_n$ be the prior event that the box contained $n$ black marbles
Then
$P(A_0) = \sum\limits_{n=0}^{10}{P(A_{0,n})} \tag{2}$
and by the multiplication rule for conditional probability
$P(A_{0,n}) = P(A_0 \cap B_n) = P(B_n)P(A_0|B_n) \tag{3}$
where
$P(B_n) = b_n = {\displaystyle{\binom{10}{n}}}\dfrac{1}{2^{10}} \tag{4}$ and $P(A_0|B_n) = \Big(1-\dfrac{n}{10}\Big)^{10} \tag{5}$
The answer we are looking for is $P(B_0 \mid A_0) = \dfrac{P(A_{0,0})}{P(A_{0})}$, which can be computed using (2) to (5).
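For what it's worth, here is a minimal Python sketch of that computation, using exact rational arithmetic (the variable names are mine, chosen to mirror the notation above):

```python
from fractions import Fraction
from math import comb

# Prior (4): P(B_n) = C(10, n) / 2^10, the chance the box holds n black marbles.
# Likelihood (5): P(A_0 | B_n) = (1 - n/10)^10, ten white draws with replacement.
terms = [
    Fraction(comb(10, n), 2**10) * Fraction(10 - n, 10) ** 10
    for n in range(11)
]
p_A0 = sum(terms)         # equation (2): total probability of ten white draws
answer = terms[0] / p_A0  # posterior P(B_0 | A_0) = P(A_{0,0}) / P(A_0)
print(answer, float(answer))  # ≈ 0.0702, i.e. 7% to the nearest whole percent
```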
Does the approach look correct?
If so, how could I get rid of the nasty normalisation in the last step? Notice that I did not (and could not) claim that $P(A_0) = 1$.
We consider the general case of $m$ marbles placed in the bag, $n$ independent draws with replacement, and an observed number $x$ of white marbles; substituting $m = n = x = 10$ then gives the desired probability.
Let $X$ be the random variable that counts the number of observed white marbles among $n$ independent draws. Because the marbles are drawn with replacement after the distribution of marbles is fixed (but unknown), the conditional random variable $X \mid p$ follows a binomial distribution with parameters $n$ and probability of drawing white $p$, where $p$ is itself a random variable.
Here $p = H/m$, where $H$ is the number of heads obtained in $m$ flips of the fair coin, and therefore also the number of white marbles placed into the bag. Thus $H \sim \operatorname{Binomial}(m,0.5)$, and $p$ is simply a scaling transformation of $H$. This completes the specification of the hierarchical model.
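As a sanity check on this hierarchical model, one can simulate it directly. The following Monte Carlo sketch is my own (the function name, trial count, and seed are arbitrary choices, not part of the derivation):

```python
import random

def simulate(m=10, n=10, trials=200_000, seed=1):
    """Estimate Pr[H = m | X = n]: among runs where all n draws were white,
    the fraction in which the bag actually held m white marbles."""
    all_white_box = 0    # qualifying runs whose bag was entirely white
    all_white_draws = 0  # runs in which all n draws came up white
    rng = random.Random(seed)
    for _ in range(trials):
        h = sum(rng.random() < 0.5 for _ in range(m))  # heads -> white marbles
        p = h / m
        if all(rng.random() < p for _ in range(n)):    # all n draws white?
            all_white_draws += 1
            if h == m:
                all_white_box += 1
    return all_white_box / all_white_draws

print(simulate())  # should land near 0.07
```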
The key is to compute the unconditional probability $$\Pr[X = x].$$ By the law of total probability, $$\begin{align*} \Pr[X = x] &= \sum_{h=0}^m \Pr[X = x \mid p = h/m]\Pr[H = h] \\ &= \sum_{h=0}^m \binom{n}{x} (h/m)^x (1-h/m)^{n-x} \binom{m}{h} 2^{-m} \\ &= 2^{-m} \binom{n}{x} \sum_{h=0}^m \binom{m}{h} (h/m)^x (1-h/m)^{n-x}.\end{align*}$$ Then by Bayes' theorem, $$\Pr[H = h \mid X = x] = \frac{\Pr[X = x \mid p = h/m]\Pr[H = h]}{\Pr[X = x]},$$ which gives the posterior distribution of the number of white marbles in the bag, given the observed number of white draws $x$. For the specific case $m = n = x = 10$ we obtain $$\Pr[X = 10] = 2^{-10} \cdot \frac{111304237}{7812500},$$ and for $h = 10$ the numerator is simply $\Pr[H = 10] = 2^{-10}$, since the likelihood $\Pr[X = 10 \mid p = 1]$ equals $1$; hence the desired probability is $$\Pr[H = 10 \mid X = 10] = \frac{7812500}{111304237} \approx 0.0701905.$$
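The posterior above can be evaluated exactly with rational arithmetic. A short sketch in Python (`posterior` is my own helper name; the common factor $\binom{n}{x}2^{-m}$ cancels in the ratio, so it is omitted):

```python
from fractions import Fraction
from math import comb

def posterior(m, n, x, h):
    """Pr[H = h | X = x]: m marbles, n draws with replacement, x observed white."""
    def joint(k):
        # Pr[X = x, H = k] up to the common factor C(n, x) * 2^{-m},
        # which cancels between numerator and denominator.
        return Fraction(comb(m, k)) * Fraction(k, m)**x * Fraction(m - k, m)**(n - x)
    return joint(h) / sum(joint(k) for k in range(m + 1))

print(posterior(10, 10, 10, 10))  # 7812500/111304237, about 0.0701905
```

Setting other values of $m$, $n$, $x$ answers the generalized versions of the question as well.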
This question clearly invites further generalizations, such as: what if the coin is biased? Or, given the other parameters in the model, what is the minimum number of draws required for the posterior probability that the bag contains a certain number of white marbles to exceed some critical threshold $\pi$? What are the posterior expectation and variance of the number of white marbles in the bag given the sample?