Combined Conditional Probability Problem


My other question was closed due to not meeting guidelines so I'll be careful this time :(

Given: a person typing a document has an error rate of 3 words per 1000 words.

Question: each document has 1000 words. The error rate of each document is independent of the others. What is the probability that 10 or more documents have to be searched in order to find 3 documents that each contain at least 8 errors?

Solution: I think this is one of those "given a, solve for b" problems. I'm given that 3 documents each contain at least 8 errors (A), and I need to solve for the probability that 10 or more documents have to be searched (B).

$P(B|A)=\frac{P(A \cap B)}{P(A)}$

$P(A)$ is easy to calculate (I think?). It's just the Binomial Distribution with $n=3$.

$P(B)$ seems like a Negative Binomial Distribution problem? I'm not sure about this one and I kinda suck at calculating Negative Binomials. All in all, I need some help here.

There is 1 solution below.


I take the liberty of choosing a somewhat arbitrary way of computing $A$ (from scratch), and then using that computation to compute $B$, where $A$ and $B$ are detailed below.


Let $A$ denote the probability that a specific document contains $8$ or more errors.

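Let $B$ denote the probability that, among the first $9$ documents examined, the number of documents with $8$ or more errors is an element of $\{0,1,2\}$.

Needing to search $10$ or more documents to find $3$ documents with at least $8$ errors each is the same as finding at most $2$ such documents among the first $9$. Therefore, the desired (final) probability is $B$, and the problem has been reduced to: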

  • Computing $A$ from the constraints of the problem, and
  • Computing $B$ from $A$.

Computation of $A$

I will presume (perhaps wrongly) that the probability that a specific word is in error within one document is
$\displaystyle p = \frac{3}{1000}$.

Such an assumption is at least consistent with the problem's constraints, given linearity of expectation.

Based on my presumption, the probability that the word is not in error is
$q = (1 - p).$

I am also going to assume (perhaps wrongly) that the number of errors within a document follows a Binomial Distribution with $n = 1000$ trials,
so that the probability of exactly $k$ errors is $\displaystyle \binom{n}{k}p^kq^{(n-k)}.$

Based on this assumed framework,

$$A = ~1 ~- ~\left[ \sum_{k=0}^7 \binom{1000}{k}p^kq^{1000 - k}\right].$$
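The sum for $A$ can be evaluated numerically. The following is only a minimal sketch under the same assumptions as above (independent word errors, $p = 3/1000$, $n = 1000$); the helper name `prob_at_least` is made up for this illustration and is not part of the original answer.

```python
from math import comb


def prob_at_least(n: int, p: float, k_min: int) -> float:
    """P(X >= k_min) for X ~ Binomial(n, p), as 1 minus the lower tail."""
    q = 1.0 - p
    # Lower tail: P(X = 0) + ... + P(X = k_min - 1)
    lower_tail = sum(comb(n, k) * p**k * q**(n - k) for k in range(k_min))
    return 1.0 - lower_tail


# Assumed model: 1000 words per document, per-word error probability 3/1000.
A = prob_at_least(1000, 3 / 1000, 8)
print(f"A = P(at least 8 errors in one document) = {A:.6f}")
```

Under these assumptions $A$ comes out at roughly one percent, so a document with $8$ or more errors is rare.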


Computation of $B$

Using an approach very similar to that in the previous section,
let $R = (1 - A)$.

Then

$$B = \sum_{k=0}^2 \binom{9}{k}A^k R^{9-k}.$$
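As a rough numerical check, the sketch below evaluates $B$ and compares it with the Negative Binomial route mentioned in the question, where the third bad document is found exactly at document $n$ with probability $\binom{n-1}{2}A^3R^{n-3}$. It assumes the same binomial model as above; the names `binom_pmf`, `found_by_9`, and `B_check` are made up for this illustration.

```python
from math import comb


def binom_pmf(n: int, k: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1.0 - p)**(n - k)


# A: probability that one 1000-word document has 8 or more errors
# (assumed per-word error probability 3/1000).
A = 1.0 - sum(binom_pmf(1000, k, 3 / 1000) for k in range(8))
R = 1.0 - A

# B: at most 2 bad documents among the first 9 examined.
B = sum(binom_pmf(9, k, A) for k in range(3))

# Cross-check: probability that the 3rd bad document turns up at
# document n, summed over n = 3..9, then take the complement.
found_by_9 = sum(comb(n - 1, 2) * A**3 * R**(n - 3) for n in range(3, 10))
B_check = 1.0 - found_by_9

print(f"B (binomial)          = {B:.8f}")
print(f"B (negative binomial) = {B_check:.8f}")
```

The two values should agree up to floating-point rounding, which is why the Binomial sum over the first $9$ documents is enough here.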


Addendum
Responding to the comment question of Krrr:

...But what would $P(A \cap B)$ be? If they're independent events I can simply multiply the probabilities.

Describe a document as bad, rather than good, if the document contains at least 8 errors. I am using event B in the same way that you are. That is, event B represents the event that there are fewer than 3 bad documents among the first 9 documents examined. However, I am using event A in a manner totally different from how you are using it. I am using event A to represent the event that one specific document is bad.

You attempted to describe an event A to signify that there are 3 bad documents. This is an unworkable definition, because you are not specifying how many documents are being searched to uncover $3$ bad documents. Does event A signify that the first 3 documents examined are bad? Alternatively, does event A signify that at least 3 documents out of 1000 documents examined are bad?

There is no point in defining event A to represent that there are 3 or more bad documents among the first 9 documents examined, because that is simply the complement of the already defined event B.

So, you took a wrong turn in your analysis by using an unworkable definition for event A, and then took another wrong turn by thinking that Bayes' Theorem should be involved in the answer.

Assuming that this problem was assigned to you from a book or class, the reason that you took a wrong turn is that the book or class inadequately trained you before assigning such a problem. Developed intuition is disproportionately important in math problems that involve Probability or Combinatorics. The only way that your intuition can develop is to be exposed to worked examples or (much) easier solved problems, so that you can get a feel for which tools (e.g. Binomial Distribution, Bayes' Theorem, ...) to use against which problems.

So, if this problem is from a book or class, then I advise reviewing the previously solved problems, worked examples, or theorems that led up to this problem.

If this problem was not assigned to you from a book or class, but was instead simply a problem that you saw on your own, and decided to attack, then you jumped into the deep water before learning how to swim. This is actually a complicated problem to solve. I had to split my analysis into 3 parts, and then attack the problem one part at a time.

Assuming that this problem was not assigned to you from a book or class, a better approach for you is to try to find the right book (for you) on Probability Theory, and open it to page 1, and begin learning Probability Theory (and begin developing your intuition) from scratch.

Beyond that, in answer to the question that you posed:

  • Your definition of event A seems unworkable for the problem.

  • For my definition of event A, the idea of considering the event $(A\cap B)$ or computing the probability of the event $(A\cap B)$ doesn't make sense, per my definition of event A.

I welcome any other questions that you might have.