Chance of three same consecutive answers in A/B/C/D test

222 Views Asked by At

I am facing the following problem:

What is the probability that there will be at least three same consecutive right answers in a $n$-question A/B/C/D test?

I have started with $10$ questions while trying to solve this, so there are $4^{10}$ possible sequences. Now our three same results can begin at the first through the eighth place ($n-2$) because at the ninth place it would be $9-10-11$. For each of these possibilities there can be $4^{10-3} = 4^7$ combinations of $7$ remaining positions. So that's $8 \cdot 4^7$, and now we have to multiply by $4$ because there can be three or more As, Bs, Cs or Ds. That gives us $$\frac{8 \cdot 4^7 \cdot 4}{4^{10}}$$ which simplifies to $8/4^2 = 1/2 = 50\%$.

I have checked the result practically with Python code with $100,000$ attempts, and I got about $38514/100000 = 0.38514 = 38.514\%$, which is not even remotely close to my result.

Do you have any idea where I went wrong?

4

There are 4 best solutions below

1
On BEST ANSWER

I think a better approach to this problem is to count the number of strings that do not have three consecutive answers the same. Rather than trying to compute them directly, which will involve us in double-counting problems of the kind you encountered, let's try to develop a recurrence relation. Call a string of characters from {A,B,C,D} "good" if no three consecutive characters are the same, and let $a_n$ be the number of good strings with $n$ characters. If we know $a_n$ can we compute $a_{n+1}?$

Since a the first $n$ characters of a good string length $n+1$ must form a good string of length $n$, we can just add another character to a good string of length $n$ to a get a good string of length $n+1$ provided the last two characters are different. So, let $s_n$ be the number of good strings of length $n$ that end in a single character (that is, the last two character are not the same), and let $d_n$ be the number of good strings of length $n$ that end in a double character. Clearly, $$a_n=s_n+d_n\tag 1$$

Now, to get a good string of length $n+1$ without a doubled character from a good string of length $n$, we simply append any character different from the last one: $$s_n=3a_{n-1}=3(s_{n-1}+d_{n-1})\tag 2$$ To get a doubled string of length $n$ we must add a character different from the last one to a single string of length $n-1$ so that $$ d_n=s_{n-1}\implies s_n=3a_{n-1}=3(s_{n-1}+s_{n-2})\tag 3$$

We know that $s_1=4$ and $s_2=12,$ so we can repeatedly apply $(3)$ to compute the values of $s_n$. I got $s_9=141264, s_{10}=535572,$ from which I computed the probability of three consecutive identical answers as $.3545188903808594\approx 35.45$%

I don't guarantee the correctness of these calculations, as I have not checked them carefully, but they are certainly a lot closer to your experimental results.

I meant to mention that it is possible to solve the recurrence for $s_n$ in closed form, but it's hardly worth the effort for this problem.

0
On

You are counting tests where this happens multiple times. For example, a test where all the answers are $A$ gets counted $8$ times.

1
On

As others have pointed out, you are counting tests multiple times. One way to approach this problem is the following. Imagine the first answer being $x \in \{A, B, C, D\}$. Then either the second answer equals $y \in \{A, B, C, D\} \setminus x$, or equals $x$ as well. In the former case, the probability that the third answer does not result in a sequence of three times $x$ equals $1$, while in the latter case, this probability equals $3/4$. We thus get, for a sequence of length $n$, that the probability $P(n)$ of the sequence not containing three same answers in a row equals:

$$P(n) = \frac{3}{4} \cdot 1 \cdot P(n-1) + \frac{1}{4} \cdot \frac{3}{4} \cdot P(n-2) = \frac{3}{4} P(n-1) + \frac{3}{16} P(n-2)$$

We know that $P(1) = P(2) = 1$, and we thus get:

$$P(3) = 0.9375$$ $$P(4) \approx 0.8906$$ $$P(5) \approx 0.8438$$ $$P(6) \approx 0.7998$$ $$P(7) \approx 0.7581$$ $$P(8) \approx 0.7185$$ $$P(9) \approx 0.6810$$ $$P(10) \approx 0.6455$$

The probability that there will be at least three consecutive right answers in a test containing ten questions with four possible answers thus equals:

$$1 - 0.6455 = 0.3545$$

Edit: The following Python code indeed results in a value of $0.3545$:

import itertools

c = 0
for test in itertools.product({1, 2, 3, 4}, repeat = 10):
    for i in range(8):
        if test[i] == test[i + 1] == test[i + 2]:
            c += 1
            break
print(round(c/(4 ** 10), 4))
0
On

Alternatively, the question can be formulated as a problem of computing the probability of absorption in a Markov chain.

We can assign a state $i \in \{1, 2, 3\}$ to any sequence of characters from $\{A, C, B, D\}$ , which represents the fact that the last $i$ consecutive characters are the same. Once the chain reaches the state $3$, the sequence is classified as 'wrong' and the chain remains in this state. Such a chain has the transition matrix $\mathbf{P} = \{p_{ij}\}$, where $p_{ij}$ is the transition probability from $i$ to $j$,

$$ \mathbf{P} = \begin{pmatrix} 3/4 & 1/4 & 0 \\ 3/4 & 0 & 1/4 \\ 0 & 0 & 1 \\ \end{pmatrix} $$

and the initial distribution $p_1^T = (1, 0 , 0)$. Then $$p_{10}^T = p_1^T \, \mathbf{P}^{9} \approx (0.5108, 0.1347, 0.3545)^T. $$