Game of draughts, expected value of first move advantage, part 2


This is a follow up to my question here: Game of draughts, expected value of first move advantage

Here's a question from my probability textbook:

$A$ and $B$ play at draughts and bet $\$1$ even on every game, the odds are $\mu : 1$ in favor of the player who has the first move. If it be agreed that the winner of each game have the first move in the next, show that the advantage of having the first move in the first game is $\$\mu-1$.

Here's the answer in the back of my book:

It is assumed that they go on playing indefinitely.

Let $E$ be the expectation of the first player; then since what one gains the other loses, the second player's expectation is $-E$.

At the first game the first player either wins $\$1$ and remains first player or he loses $\$1$ and becomes second player. And the chances of these two events are as $\mu:1$, so$$E = {{\mu(1 + E) + (-1 - E)}\over{\mu + 1}} = {{\mu - 1}\over{\mu + 1}}(1 + E);$$whence$$E = {{\mu - 1}\over2}, \quad -E = -{{\mu - 1}\over2};$$and the first player's advantage is $\mu - 1$ dollars.
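The fixed-point equation $E = \frac{\mu-1}{\mu+1}(1+E)$ can be checked numerically. Below is a minimal sketch (the choice $\mu = 2$ is illustrative, not from the book) that iterates $E \leftarrow k(1+E)$ and confirms convergence to $(\mu-1)/2$:

```python
# Numerical check of E = ((mu - 1)/(mu + 1)) * (1 + E) by fixed-point iteration.
# mu = 2 is an illustrative choice; the book's formula gives E = (mu - 1)/2 = 0.5.
mu = 2.0
k = (mu - 1) / (mu + 1)   # k = 1/3, the contraction factor
E = 0.0
for _ in range(200):      # iterate E <- k * (1 + E); error shrinks by factor k each step
    E = k * (1 + E)
print(E)                  # converges to k / (1 - k) = (mu - 1)/2 = 0.5
```

Since $0 < k < 1$, the iteration is a contraction, so the fixed point is unique and the starting value does not matter.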

I understand this first part which I typed out above. Here's the second part of the answer given in the back of my book, assuming a different interpretation of the question which assumes playing only a limited number of games:

But if the number of games be limited, let $E_n$ be the first player's expectation when $n$ more games are to be played. Then$$E_0 = 0 \text{ and } E_n = {{\mu - 1}\over{\mu + 1}}(1 + E_{n-1}) = k(1 + E_{n-1}).$$Whence we find$$E_1 = k, \quad E_2 = k + k^2, \quad E_3 = k + k^2 + k^3,$$and so on;$$E_n = {k\over{1 - k}}(1 - k^n) = {{\mu-1}\over2}\left(1 - \left({{\mu-1}\over{\mu+1}}\right)^n\right).$$And the first player's advantage is the double of this$$ = (\mu - 1)\left(1 - \left(1 - {2\over{\mu + 1}}\right)^n\right) = (\mu - 1)\left({{2n}\over{\mu + 1}} - \text{etc.}\right).$$
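The recursion and the closed form can be compared directly. A minimal sketch (with $\mu = 3$, an illustrative choice so that $k = \tfrac12$) verifying $E_n = \frac{k}{1-k}(1-k^n)$ against $E_n = k(1+E_{n-1})$, $E_0 = 0$:

```python
# Check the recursion E_n = k*(1 + E_{n-1}) against the closed form
# E_n = (k/(1-k)) * (1 - k**n), for an illustrative mu = 3 (so k = 1/2).
mu = 3.0
k = (mu - 1) / (mu + 1)

E = 0.0                              # boundary condition E_0 = 0
for n in range(1, 11):
    E = k * (1 + E)                  # the book's recursion
    closed = k / (1 - k) * (1 - k**n)
    assert abs(E - closed) < 1e-12   # the two expressions agree for every n
print(E)  # E_10 = 1 - (1/2)**10 = 0.9990234375 for mu = 3
```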

I don't understand how the book came up with looking at $E_0 = 0$, $E_n = {{\mu - 1}\over{\mu + 1}}(1 + E_{n-1})$ as the correct expression(s) to set up. Can anybody explain this in depth, why this is the correct setup? Once you assume that of course then solving the problem is straightforward, I understand everything that follows.


On BEST ANSWER

The technique is the same as in the first part: backward recursion, where the recursion comes from a first-step analysis, i.e. the law of total probability.

Recall the definition: $E_n$ is the expected gain of the first player when $n$ more games are to be played.

WLOG, call the player who has the first move when there are still $n$ games to go player $A$.

Denote $$ R_i = \begin{cases} 1 & \text{ if } A \text{ is the first player} \\ 2 & \text{ if } A \text{ is the second player} \end{cases}, i = n, n-1, \ldots, 1$$

Define the payoff of $A$ when there are $i$ games to go to be $$ X_i = \begin{cases} 1 & \text{ if } A \text{ wins} \\ -1 & \text{ if } A \text{ loses} \end{cases}, i = n, n-1, \ldots, 1 $$

such that $$\Pr\{X_i = 1|R_i = 1\} = \frac {\mu} {1 + \mu}, \quad \Pr\{X_i = 1|R_i = 2\} = \frac {1} {1 + \mu},$$ following the definition of odds.

Define the gain of player $A$ to be the sum of the payoffs of the $i$ games not yet played:

$$ G_i = \sum_{k=1}^i X_k, i = n, n - 1, \ldots, 1$$

In this notation, $E_n = E[G_n \mid R_n = 1]$.

Therefore, when all the games have been played, the expected remaining gain is $E_0 = 0$: there are no more games to play, and the final wealth of the two players is already fixed. This is the boundary condition of the recursion.

Using the first step analysis - law of total probability,

$$ \begin{align} E_n &= E[G_n|R_n = 1] \\ &= E[G_n|R_n = 1, X_n = 1]\Pr\{X_n = 1|R_n = 1\} + E[G_n|R_n = 1, X_n = -1]\Pr\{X_n = -1|R_n = 1\} \\ &= E[G_{n-1}+1|R_{n-1} = 1] \frac {\mu} {\mu + 1} + E[G_{n-1}-1|R_{n-1} = 2] \frac {1} {\mu + 1} \\ &= (E_{n-1} + 1)\frac {\mu} {\mu + 1} + (-E_{n-1}-1) \frac {1} {\mu + 1} \\ &= \frac {\mu - 1} {\mu + 1}(E_{n-1} + 1)\end{align} $$ Here the second line follows from the law of total probability. The third line follows from $G_n = X_n + G_{n-1}$, together with the fact that the distribution of $G_{n-1}$ depends only on the playing order $R_{n-1}$ (Markov property), which in turn carries the same information as the payoff $X_n$ of the game just played: the winner of that game moves first in the next.
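The whole $n$-game process described by this first-step analysis can also be checked by Monte Carlo simulation. A minimal sketch, with illustrative choices $\mu = 2$ and $n = 5$ (not from the book), simulating the rule that the winner of each game has the first move in the next:

```python
import random

# Monte Carlo check of E_n = E[G_n | R_n = 1] for illustrative mu = 2, n = 5.
mu, n, trials = 2.0, 5, 200_000
p_first = mu / (mu + 1)          # win probability for the player moving first

total = 0.0
rng = random.Random(0)           # fixed seed for reproducibility
for _ in range(trials):
    a_first, gain = True, 0      # A has the first move in the first game
    for _ in range(n):
        p = p_first if a_first else 1 - p_first
        if rng.random() < p:     # A wins this game
            gain += 1
            a_first = True       # winner keeps the first move
        else:
            gain -= 1
            a_first = False      # loser becomes second player
    total += gain

k = (mu - 1) / (mu + 1)
exact = k / (1 - k) * (1 - k**n)   # closed-form E_n from the book
print(total / trials, exact)       # the two values agree to Monte Carlo error
```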

If the game is played indefinitely, the $E_n$ converge (since $0 < k < 1$), and passing to the limit on both sides of the recursion recovers the equation for the long-run gain in the first part of the answer.

Due to the Markov property, the long-run proportion of games in which a given player has the first move converges, regardless of the initial playing order. Consequently the expected gain $E_n$ does not grow indefinitely but converges to the answer of the first part.
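The "who moves first" process is a two-state Markov chain, and its stationary distribution can be computed directly. A minimal sketch (with an illustrative $\mu = 4$) using power iteration on the transition matrix:

```python
# Stationary distribution of the two-state "who moves first" chain, for an
# illustrative mu = 4. States: index 0 = A moves first, index 1 = B moves first.
mu = 4.0
p = mu / (mu + 1)                # the current first mover wins with probability p
# Transition matrix: the winner keeps the first move, the loser becomes second.
P = [[p, 1 - p],
     [1 - p, p]]
pi = [1.0, 0.0]                  # initial distribution: A moves first
for _ in range(500):             # power iteration pi <- pi P
    pi = [pi[0]*P[0][0] + pi[1]*P[1][0],
          pi[0]*P[0][1] + pi[1]*P[1][1]]
print(pi)                        # converges to [0.5, 0.5]
```

Because the matrix is doubly stochastic, the stationary distribution is uniform: in the long run each player moves first in half the games, whatever the initial order, which is why the long-run advantage stops growing.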