The problem I am referring to is as follows:
A crime has been committed by a solitary individual, who left some DNA at the scene of the crime. Forensic scientists who studied the recovered DNA noted that only five strands could be identified and that each innocent person, independently, would have a probability of $10^{-5}$ of having his or her DNA match on all five strands. The district attorney supposes that the perpetrator of the crime could be any of the one million residents of the town. Ten thousand of these residents have been released from prison within the past 10 years; consequently, a sample of their DNA is on file. Before any checking of the DNA file, the district attorney thinks that each of the 10,000 ex-criminals has probability α of being guilty of the new crime, while each of the remaining 990,000 residents has probability β, where α = cβ. (That is, the district attorney supposes that each recently released convict is c times as likely to be the crime’s perpetrator as is each town member who is not a recently released convict.) When the DNA that is analyzed is compared against the database of the ten thousand ex-convicts, it turns out that A. J. Jones is the only one whose DNA matches the profile. Assuming that the district attorney’s estimate of the relationship between α and β is accurate, what is the probability that A. J. is guilty?
Problematic part of author's solution:
The way Ross tackles this problem is by evaluating $P(G|M)$, where $G$ is the event that A.J. is guilty and $M$ is the event that A.J. is the only one of the 10,000 on file to have a match.
$$P(G|M) = \frac{P(GM)}{P(M)} = \frac{P(G)P(M|G)}{P(M|G)P(G)+P(M|G^{c})P(G^{c})}$$
To find $P(M|G^{c})$ in the denominator of the above equation, he states:
If A.J. is innocent, then in order for him to be the only match, his DNA must match (which will occur with probability $10^{-5}$), all others in the database must be innocent, and none of these others can have a match. Now, given that A.J. is innocent, the conditional probability that all others in the database are also innocent is
$$P(\text{all others innocent} \mid \text{AJ innocent}) = \frac{P(\text{all in database innocent})}{P(\text{AJ innocent})} = \frac{1-10,000\alpha}{1-\alpha}$$
Also, the conditional probability, given their innocence, that none of the others in the database will have a match is $(1-10^{-5})^{9999}$. Therefore,
$$P(M|G^{c}) = 10^{-5}\Bigl(\frac{1-10,000\alpha}{1-\alpha}\Bigr)(1-10^{-5})^{9999}$$
Although I agree with the author on all the conditions required for $M|G^{c}$ to occur, I cannot understand how he arrived at the final equation. What formula allows you to simply multiply the probabilities of three events to arrive at the conditional probability? Even if we assume that the conditional probability is the same as the probability of the intersection of these three events, we can't multiply them without first showing that the events are independent.
Maybe there are some intermediate steps where simplification has been done using a formula and the final form involves multiplying the above three probabilities. In that case could someone show me those steps?
PS: apologies for the long post.
Whether or not you agree with my justification for Ross below, I need to make a disclaimer first.
I've read and learned a lot from the textbooks by Sheldon Ross. Having said that, it is my unwavering opinion that he often sacrifices rigor, and unfortunately often clarity, for what he believes is intuitive and, ironically, good pedagogy.
Okay let's get down to business.
There are actually only two events (with one of them being a combination of sub-events).
Let $E_1$ denote "all 9999 other ex-cons innocent (given AJ innocent)" and $E_2$ denote "only AJ is matched ($\color{magenta}{\text{given all ex-cons are innocent}}$)", so that $P(M|G^c) = P(E_2 \cap E_1)$.
Even though I personally think this is problematic, you have agreed that $$P(E_1) = \frac{1-10,000\alpha}{1-\alpha}$$ Anyway, if you're fine with it then I shall not digress.
As for $E_2$, according to the statement near the beginning: "...each $\color{magenta}{\text{innocent}}$ person, independently, would have a probability of $10^{-5}$ of ... DNA match". The boldface for the independence of the sub-events came from yourself. This gives $P(E_2) = 10^{-5} \cdot (1-10^{-5})^{9999}$.
Now, $P(M|G^c) = P(E_2 \cap E_1) = P(E_2 ~|~ E_1) \cdot P(E_1)$
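The identity used in this step is the multiplication rule, which holds for any events (as long as the conditioning events have positive probability) and needs no independence. As a sketch with three sub-events, where the labels $A$ (A.J.'s DNA matches), $B$ (all others in the database are innocent) and $C$ (none of the others match) are my own and not Ross's:
$$P(A \cap B \cap C \mid G^c) = P(B \mid G^c)\,P(A \mid B \cap G^c)\,P(C \mid A \cap B \cap G^c)$$
Independence only matters when a conditional factor on the right is replaced by a simpler one, for example when $P(A \mid B \cap G^c)$ is read as $10^{-5}$.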
At the same time, $P(E_2 ~|~ E_1) = P(E_2)$ simply by construction (not of the math, but of the English statement describing the event), which I highlighted in magenta.
In other words, by the semantic meaning of the definition of the event $E_2$, we automatically have $P(E_2 ~|~ E_1) \cdot P(E_1) = P(E_2) \cdot P(E_1)$. This technically also "proves" that $E_2$ and $E_1$ are independent, but I personally think viewing it as the conditional-multiplication $P(E_2 ~|~ E_1) \cdot P(E_1)$ is semantically more natural.
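To check the numbers, here is a minimal Python sketch that plugs $P(E_1)$ and $P(E_2)$ into Bayes' formula. It rests on assumptions that are my reading of the problem rather than anything stated in this form above: the priors sum to one ($10{,}000\alpha + 990{,}000\beta = 1$), the guilty person's DNA matches with certainty, and $P(M|G) = (1-10^{-5})^{9999}$. The function names are made up for illustration.

```python
N_DB = 10_000      # ex-convicts with DNA on file
N_REST = 990_000   # remaining residents
P_MATCH = 1e-5     # chance that an innocent person's DNA matches


def prior_alpha(c):
    """Prior guilt probability of one ex-convict.

    Assumes N_DB * alpha + N_REST * beta = 1 with alpha = c * beta.
    """
    return c / (N_DB * c + N_REST)


def posterior_guilty(c):
    """P(G | M) from Bayes' formula, using P(M | G^c) = P(E_2) * P(E_1)."""
    alpha = prior_alpha(c)
    # Probability that none of the 9999 other (innocent) people match.
    no_other_match = (1 - P_MATCH) ** (N_DB - 1)
    # Assumption: if A.J. is guilty, his DNA matches with certainty.
    p_m_given_g = no_other_match
    p_e1 = (1 - N_DB * alpha) / (1 - alpha)   # all others innocent | A.J. innocent
    p_e2 = P_MATCH * no_other_match           # only A.J. matches | all innocent
    p_m_given_not_g = p_e2 * p_e1
    numerator = alpha * p_m_given_g
    return numerator / (numerator + p_m_given_not_g * (1 - alpha))


if __name__ == "__main__":
    for c in (1, 10, 100):
        print(c, posterior_guilty(c))
```

Under these assumptions the common factor $(1-10^{-5})^{9999}$ cancels and the result reduces to $c/(c+9.9)$, so the posterior depends heavily on how much more suspicious the district attorney considers the ex-convicts to be.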