Building a Model of Probability Progression in Mastermind


Consider the famous game Mastermind. In the simplified, non-adversarial version I'll consider, each of $4$ ordered slots is independently assigned a color uniformly at random (Red, Blue, Yellow, Green, White, or Black). The single player then repeatedly guesses the assigned colors (order of arrangement matters) and receives a clue after each guess (as described by Rule #$3$ in the linked PDF). I will assume the player's strategy is to guess uniformly at random among all sequences that have not been ruled out by prior clues. I want to model how the probability of the player correctly guessing the color sequence within some fixed number of guesses updates in response to each guess. I will use the notation $W_n$ to mean "winning (correctly guessing the color sequence) within $n$ guesses," and $C_m$ to mean "the clue returned in response to the $m$-th guess."

The easiest case is when the goal is to win within just $1$ guess: since there are $6^4 = 1296$ equally likely color sequences, $\Pr(W_1) = \frac{1}{1296}$.
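To make the numbers concrete, here is a quick Python sketch of the outcome space and the clue function (the color labels and the black-peg/white-peg feedback convention are assumptions on my part, intended to match the usual Mastermind rules):

```python
from collections import Counter
from itertools import product

COLORS = "RBYGWK"  # Red, Blue, Yellow, Green, White, blacK (labels assumed)
SLOTS = 4

def clue(secret, guess):
    # Standard Mastermind feedback (assumed to match Rule #3):
    # black pegs = right color in the right slot,
    # white pegs = right color, wrong slot.
    black = sum(s == g for s, g in zip(secret, guess))
    overlap = sum((Counter(secret) & Counter(guess)).values())
    return black, overlap - black

codes = list(product(COLORS, repeat=SLOTS))
print(len(codes))  # 6^4 = 1296, so Pr(W_1) = 1/1296
```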

If the goal is to win within $2$ guesses, then the probability of winning at the start of the game would be $\Pr(W_2)$, and after the player guesses once and receives a clue it would become $\Pr(W_2\ |\ C_1)$. I can find at least two approaches to relate $\Pr(W_2\ |\ C_1)$ back to $\Pr(W_2)$. The first is to apply Bayes' Theorem to obtain $\boldsymbol{\color{blueviolet}{\Pr(W_2\ |\ C_1)}} = \frac{\boldsymbol{\color{blueviolet}{\Pr(W_2)}}\Pr(C_1\ |\ W_2)}{\Pr(C_1)}$, and the second is to apply the conditional probability formula, followed by inclusion-exclusion in the numerator, to obtain $\boldsymbol{\color{blueviolet}{\Pr(W_2\ |\ C_1)}} = \frac{\boldsymbol{\color{blueviolet}{\Pr(W_2)}}\ +\ \Pr(C_1)\ -\ \Pr(W_2\ \cup\ C_1)}{\Pr(C_1)}$. The first approach seems like a more promising road to insight, since Bayes' Theorem is advertised as the mathematical description of belief updating.
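As a sanity check, both expressions reduce to $\frac{\Pr(W_2\ \cap\ C_1)}{\Pr(C_1)}$, which can be verified numerically on a toy uniform sample space (the events below are arbitrary stand-ins, not the actual Mastermind events):

```python
from fractions import Fraction
from itertools import product

# Toy uniform sample space; W and C are arbitrary stand-in events.
space = list(product("ABC", repeat=2))
P = lambda ev: Fraction(sum(ev(w) for w in space), len(space))

W = lambda w: w[0] == "A"   # stand-in for W_2
C = lambda w: w[1] != "C"   # stand-in for C_1

# Bayes' Theorem form: Pr(W) Pr(C | W) / Pr(C)
bayes = P(W) * (P(lambda w: W(w) and C(w)) / P(W)) / P(C)
# Inclusion-exclusion form: (Pr(W) + Pr(C) - Pr(W ∪ C)) / Pr(C)
incl_excl = (P(W) + P(C) - P(lambda w: W(w) or C(w))) / P(C)

assert bayes == incl_excl  # both equal Pr(W ∩ C) / Pr(C)
print(bayes)  # 1/3
```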

However, if the goal is to win within $3$ guesses, the relations get trickier. The game progression is now $\Pr(W_3) \to \Pr(W_3\ |\ C_1) \to \Pr(W_3\ |\ (C_1\ \cap\ C_2))$. The first two probabilities are amenable to both approaches from the previous paragraph, but the only way I can find to relate $\Pr(W_3\ |\ (C_1\ \cap\ C_2))$ back to $\Pr(W_3\ |\ C_1)$ is analogous to the second approach, and produces $\boldsymbol{\color{blueviolet}{\Pr(W_3\ |\ (C_1\ \cap\ C_2))}} = \frac{\boldsymbol{\color{blueviolet}{\Pr(W_3\ |\ C_1)}}\Pr(C_1)\ +\ \Pr(W_3\ |\ C_2)\Pr(C_2)\ -\ \Pr(W_3\ |\ (C_1\ \cup\ C_2))\Pr(C_1\ \cup\ C_2)}{\Pr(C_1)\ +\ \Pr(C_2)\ -\ \Pr(C_1\ \cup\ C_2)}$. The first approach (Bayes' Theorem) produces $\boldsymbol{\color{blueviolet}{\Pr(W_3\ |\ (C_1\ \cap\ C_2))}} = \frac{\Pr(W_3)\Pr((C_1\ \cap\ C_2)\ |\ W_3)}{\Pr(C_1\ \cap\ C_2)}$, and there is no $\Pr(W_3\ |\ C_1)$ to be found in it. The strange thing is that the prior in this application of Bayes' Theorem is the probability of winning at the start of the game, $\Pr(W_3)$, rather than the probability of winning as of the previous guess, $\Pr(W_3\ |\ C_1)$. Under the belief-updating interpretation, this appears to describe relearning all known facts from scratch at each update, instead of incorporating only the most recent data.
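The inclusion-exclusion relation above can likewise be checked numerically on a toy uniform sample space with arbitrary stand-in events (using $\Pr(W_3\ |\ C_1)\Pr(C_1) = \Pr(W_3\ \cap\ C_1)$, and so on for each term):

```python
from fractions import Fraction
from itertools import product

# Toy uniform sample space; W, C1, C2 are arbitrary stand-in events.
space = list(product("ABC", repeat=3))
P = lambda ev: Fraction(sum(ev(w) for w in space), len(space))

W  = lambda w: w[0] == "A"                   # stand-in for W_3
C1 = lambda w: w[1] != "C"                   # stand-in for C_1
C2 = lambda w: w[2] in "AB" or w[0] == "B"   # stand-in for C_2

# Left-hand side: Pr(W | C1 ∩ C2) computed directly.
lhs = P(lambda w: W(w) and C1(w) and C2(w)) / P(lambda w: C1(w) and C2(w))

# Right-hand side: the inclusion-exclusion form, term by term.
num = (P(lambda w: W(w) and C1(w))                  # Pr(W | C1) Pr(C1)
       + P(lambda w: W(w) and C2(w))                # Pr(W | C2) Pr(C2)
       - P(lambda w: W(w) and (C1(w) or C2(w))))    # Pr(W | C1 ∪ C2) Pr(C1 ∪ C2)
den = P(C1) + P(C2) - P(lambda w: C1(w) or C2(w))   # Pr(C1 ∩ C2)

assert lhs == num / den
print(lhs)  # 2/7
```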

I have also tried converting all probabilities to odds using $\Pr(A) = \frac{O(A)}{O(A)\ +\ 1}$, as inspired by this video, and (separately) to information using $\Pr(A) = .5^{I(A)}$, as inspired by this video, at various stages of the computations, but the algebra becomes unwieldy and shows little sign of simplifying. To further clarify what I'm searching for: my understanding of belief updating is that odds update rather cleanly by multiplication (via a Bayes factor), information (or perhaps entropy) updates rather cleanly by addition, and probabilities update as something else (possibly less cleanly). My goal is to really internalize applying and switching between these three frameworks in this type of problem, in which the incoming data signals (clues) are not, in general, independent, but every other aspect of the problem is made as modeling-friendly as possible. I may even work with a simpler version of Mastermind, e.g., with $4$ colors and $3$ slots, if brute-forcing the outcome space for numeric examples would be insightful.
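Here is a sketch of that brute-force computation for the simplified $4$-color, $3$-slot game (assuming the standard black-peg/white-peg feedback convention); it computes $\Pr(W_n)$ exactly by averaging the uniform-random strategy over all possible secrets:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

COLORS, SLOTS = "ABCD", 3  # simplified game: 4 colors, 3 ordered slots

def clue(secret, guess):
    # Assumed standard feedback: (black, white) peg counts.
    black = sum(s == g for s, g in zip(secret, guess))
    overlap = sum((Counter(secret) & Counter(guess)).values())
    return black, overlap - black

def p_win(secret, candidates, n):
    """Pr(win within n guesses), guessing uniformly among candidates."""
    if n == 0:
        return Fraction(0)
    total = Fraction(0)
    for g in candidates:
        if g == secret:
            total += 1  # this guess wins outright
        else:
            # Keep only codes consistent with the clue this guess produced.
            fb = clue(secret, g)
            nxt = [c for c in candidates if clue(c, g) == fb]
            total += p_win(secret, nxt, n - 1)
    return total / len(candidates)

codes = [''.join(p) for p in product(COLORS, repeat=SLOTS)]
pw1 = sum(p_win(s, codes, 1) for s in codes) / len(codes)
pw2 = sum(p_win(s, codes, 2) for s in codes) / len(codes)
print(pw1)  # 1/64
print(pw2)
```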

For some known constant goal (maximum number of allowed guesses), I wonder whether it is possible to establish a recurrence relation expressing the probability of winning after the $n$-th guess as a function of the probability of winning after the $(n-1)$-th guess, and similarly for the odds and for the information (or entropy) at each guess (in fact, I would expect the three recurrence relations to be transformable into one another). While this is not my primary aim, I also suspect that some relation between $W_1$, $W_2$, $W_3$, etc., may emerge when the goal is viewed instead as a variable.

How should I manipulate the expressions I'm working with to make these connections clear?