Question:
Let $S$ be the strategy that starts with $C$ and continues to play $C$ until the opponent plays $D$ in the previous game. In that case, the strategy plays $C$ with probability 1/3 and $D$ with probability 2/3. Find the transition matrix when Player 1 uses $S$ and Player 2 uses $TFT$.
Based on the description of $S$, we can treat $S$ as a $TFT$ strategy. In this case, one possible sequence of play is the following (where $C$ represents cooperate):
Player 1: C C C ...
Player 2: C C C ...
I know that $\left< TFT, TFT \right>$ is a SE when $\beta$ is large enough, or more specifically, when $$ \beta > (T-R)/(R-S) $$
where the bimatrix game can be represented as $$ \begin{pmatrix} R,R & S,T \\ T,S & P,P \end{pmatrix} $$
I am having trouble understanding what the "transition matrix" represents. Any help?
Different types of strategy:
- All $D$: defect at all times
- $PR$, Permanent Retaliation: cooperate until, if ever, the opponent defects, then defect forever
- $TFT$, Tit-for-Tat: cooperate first, then repeat the opponent's previous move
- $AltDC$: alternate between defecting and cooperating, starting with $D$ and then alternately playing $C$ and $D$
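As a rough illustration, the four strategies above can be written as functions of the two players' histories. This is only a sketch: the function names, the `play` helper, and the encoding of moves as the strings `"C"` and `"D"` are assumptions made for the example, not part of the exercise.

```python
# Each strategy takes (own_history, opponent_history) as lists of "C"/"D"
# and returns the move for the current round.

def all_d(own, opp):
    # All D: defect in every round.
    return "D"

def pr(own, opp):
    # Permanent Retaliation: cooperate until the opponent has ever defected.
    return "D" if "D" in opp else "C"

def tft(own, opp):
    # Tit-for-Tat: cooperate first, then copy the opponent's previous move.
    return opp[-1] if opp else "C"

def alt_dc(own, opp):
    # AltDC: D in rounds 1, 3, 5, ... and C in rounds 2, 4, 6, ...
    return "D" if len(own) % 2 == 0 else "C"

def play(s1, s2, rounds):
    # Hypothetical helper: run two strategies against each other.
    h1, h2 = [], []
    for _ in range(rounds):
        a1, a2 = s1(h1, h2), s2(h2, h1)  # both moves chosen simultaneously
        h1.append(a1)
        h2.append(a2)
    return "".join(h1), "".join(h2)

if __name__ == "__main__":
    print(play(tft, tft, 5))
    print(play(tft, alt_dc, 4))
```

For example, `play(tft, tft, 5)` reproduces the all-$C$ path, since neither player is ever the first to defect.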
Different types of strategy characteristics:
- nice - it starts by cooperating and is never the first to defect
- retaliatory - it should reliably punish defection by its opponent
- forgiving - having punished defection, it should be willing to try to cooperate again
- clear - its pattern of play should be consistent and easy to predict
According to the literature (e.g., here), the transition matrix gives the probability of playing each action in this round conditional on the action the opponent played in the previous round. Since the prisoner's dilemma has only two pure strategies, $C$ and $D$, the matrix is 2x2. Note that the transition matrix describes not only on-equilibrium-path play but all contingencies inherent in the repeated-game strategies.
Let $m_{ij}$ be the cell of the matrix with the probability that $S$ plays $i$ in this round if $TFT$ played $j$ last round. Then $$\begin{aligned} m_{CC} &= 1, \\ m_{DC} &= 0, \\ m_{CD} &= 1/3, \\ m_{DD} &= 2/3. \end{aligned}$$
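To make the indexing concrete, here is a small sketch that stores these four entries and checks that they form valid conditional distributions. The dictionary layout and the helper names `column_sums` and `next_distribution` are assumptions for illustration only. With the convention above (first index is the current move of $S$, second index is the previous move of $TFT$), each column should sum to 1.

```python
from fractions import Fraction

ACTIONS = ("C", "D")

# m[i][j]: probability that S plays i this round given TFT played j last round.
m = {
    "C": {"C": Fraction(1), "D": Fraction(1, 3)},
    "D": {"C": Fraction(0), "D": Fraction(2, 3)},
}

def column_sums(matrix):
    # For each previous move j of TFT, the probabilities over S's move i
    # should add up to 1.
    return {j: sum(matrix[i][j] for i in ACTIONS) for j in ACTIONS}

def next_distribution(matrix, prev):
    # Distribution of S's current move, given a distribution `prev`
    # over TFT's previous move.
    return {i: sum(matrix[i][j] * prev[j] for j in ACTIONS) for i in ACTIONS}

if __name__ == "__main__":
    print(column_sums(m))
    print(next_distribution(m, {"C": Fraction(1, 2), "D": Fraction(1, 2)}))
```

If $TFT$ played $C$ or $D$ with equal probability last round, `next_distribution` gives $S$ playing $C$ with probability $1 \cdot 1/2 + 1/3 \cdot 1/2 = 2/3$.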
Note that (non-Markov) strategies that have a "longer memory" than just the previous round cannot be represented in such a simple 2x2 matrix.