There is a problem I have to solve that I don't really understand the math behind it. Here's the problem:
A kanji has a reading and a meaning. A review session where you review $N$ kanjis has $2N$ questions (one for the reading and one for the meaning). There is a counter that tracks the number of kanjis for which you've answered both meaning and reading questions, but you can't know how many questions you have left. You are in the middle of a review, and the counter says $K$. You don't remember how many questions you've answered so far, but you'd like to know how many remain. What's the expected number of questions left in this review session, assuming that all questions come in a uniformly random order?
The problem shows some sample input/outputs:
- Input: $N = 1$, $K = 0$. Output: 1.5
This means there is $1$ kanji, so we have 4 options:
- You didn't solve any fo the kanji's quesions (2 questions left)
- You only solved the reading question (1 question left)
- You only solved the meaning question (1 question left)
- You solved both questions (0 questions left)
I don't consider the $4^{th}$ case because if we solved both questions of the kanji the counter would be $1$. So by this logic the probability of having $2$ questions left is $1/3$ and the probability of having $1$ question left is $2/3$. The weighted average would be: \begin{equation} 2 \ \frac{1}{3} + 1 \ \frac{2}{3} = \frac{4}{3} \end{equation} However, the explanation of the sample says "Having 1 kanji and a counter of 0 means either zero or one questions have been asked so far which means there could be 2 or 1 questions left, this gives us a expected value of:" \begin{equation} 2 \ \frac{1}{2} + 1 \ \frac{1}{2} = \frac{3}{2} \end{equation} I don't know why the weights are $1/2$ and not the ones I used. The probability of having $1$ left is not the same as the probability of having $2$ left.
- Input: $N = 2$, $K = 0$. Output: 3.125
- Input: $N = 2$, $K = 1$. Output: 1.25
I've been thinking about it for hours and I just want to know where I'm wrong. Thank you!
It's equally likely that you've answered 0 or 1 questions. Let's call these probabilities $p_0$ and $p_1$. We know $p_0 = p_1 = \frac{1}{2}$
If you've answered 1 question, it's equally likely you've answered a reading question or a meaning question. Both reading and meaning questions are equally likely. Let's call those probabilities $p_r$ and $p_m$. We know $p_r = p_m = \frac{1}{2}$
So, your three options are:
I don't think the distinction between the two types of questions actually matters. Each remaining kanji has either 1 or 2 remaining questions.
Now, for the case of $N=2$, things get a bit more interesting. You have four questions. I stand by my assertion that we don't really care which are reading and which are meaning questions; each kanji has only 2 questions, and it doesn't matter which order they're asked.
Let's say our set of questions is $\{ a, a, b, b \}$. What's the likelihood $K=0$ after $n < 2N$ questions?
So how do we turn that around so that given $K=0$, we can determine the probability for each possibility $n$?
The three possibilities, $n \in \left\{ 0, 1, 2 \right\}$, have relative likelihood weights of $\left\{1, 1, \frac{2}{3}\right\}.$ If we scale those so that they add up to 1 and act as probabilities, then they become $\left\{\frac{3}{8}, \frac{3}{8}, \frac{1}{4}\right\}.$
So, our expected value for $n$ is $0\cdot \frac{3}{8} + 1 \cdot\frac{3}{8} + 2\cdot\frac{1}{4} = \frac{7}{8}.$ The expected value for remaining questions, therefore, is $4 - \frac{7}{8} = 3\frac{1}{8} = 3.125$
You can apply similar reasoning to the $K=1$ case.
This problem revolves heavily around conditional probability.
(Note: I also have posted this answer to Quora.)