Discount factor and deviating from strategy - Game Theory

I have an exercise from Steven Tadelis's book *Game Theory: An Introduction* (Exercise 10.2):

Grim Trigger: Consider the infinitely repeated game with discount factor $δ < 1$ of the following variant of the Prisoner’s Dilemma:

[Image: payoff matrix of the stage game]

a) For which values of the discount factor δ can the players support the pair of actions (M, C) played in every period?

My attempt is:

First, I find the Nash equilibrium of the game (so we know where the player would deviate if not following the proposed strategy):

For the row player, rows T and M are dominated by B, so we keep row B and delete the other two rows. Then, for the column player, columns L and C are dominated by R, so we keep R and delete the other two columns. So our Nash equilibrium is $(B,R)$, with payoffs $(0,0)$.

By a definition in my textbook:

[Image: definition from the textbook]

So the discounted value of staying with the strategy $(M,C)$, which pays $(4,4)$ each period, is:

$4+4\delta+4\delta^2+\dots=4+4\sum^{\infty}_{t=1}\delta^{t}=4+4\delta/(1-\delta)$
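As a quick sanity check on the closed form $4+4\delta/(1-\delta)$, the following minimal Python sketch compares it with a long truncated sum. The function names and the choice $\delta = 0.5$ are illustrative assumptions, not part of the exercise.

```python
def discounted_sum(payoff, delta, periods):
    """Truncated sum of payoff * delta**t for t = 0 .. periods - 1."""
    return sum(payoff * delta**t for t in range(periods))


def closed_form(payoff, delta):
    """payoff + payoff * delta / (1 - delta), which equals payoff / (1 - delta)."""
    return payoff + payoff * delta / (1 - delta)


delta = 0.5  # illustrative value, any 0 <= delta < 1 works
approx = discounted_sum(4, delta, 200)  # 200 periods approximates the infinite sum
exact = closed_form(4, delta)
print(approx, exact)
```

The two numbers should agree up to floating-point error, since the truncated tail is of order $\delta^{200}$.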

Now, if the players deviate to $(0,0)$, then they would get $5$ instead of $4$ in the immediate stage of the deviation, followed by the continuation payoff:

$v_i'=5+0\cdot\delta+0\cdot\delta^2+\dots=5$

For the player to stay and not deviate, the payoff from the first strategy should be at least as high as the payoff from the latter strategy (where they deviate):

$$4+4\delta/(1-\delta)\geq 5 \Leftrightarrow \delta \geq 1/5$$

So, for $\delta \geq 1/5$, the players would not deviate.
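The inequality can be spot-checked numerically. This is only a sketch of the attempt's own numbers ($4$ per period when staying, $5$ once and then $0$ forever when deviating); the names and the sample values of $\delta$ are assumptions chosen for illustration.

```python
DEVIATE_PAYOFF = 5  # 5 in the deviation period, then 0 forever (the attempt's assumption)


def stay_payoff(delta):
    """Discounted payoff from receiving 4 in every period."""
    return 4 + 4 * delta / (1 - delta)


def prefers_to_stay(delta):
    """True when staying is at least as good as the one-shot deviation."""
    return stay_payoff(delta) >= DEVIATE_PAYOFF


# Sample values on both sides of the threshold 1/5
results = {d: prefers_to_stay(d) for d in (0.1, 0.3, 0.5, 0.9)}
for d, stays in results.items():
    print(d, round(stay_payoff(d), 3), stays)
```

Values of $\delta$ below $1/5$ should come out as preferring to deviate, and values above it as preferring to stay.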

Would this reasoning/solution be correct?

Best answer

No.

First, deviations from NE are unilateral, so only one player deviates (they don't deviate together to $(0,0)$).

Second, as the term "grim trigger" suggests, the punishments in case of deviations should be "forever".

For example, suppose the players play $(M,C)$ and receive a payoff of $(4,4)$. Player 1 can deviate to $B$ and gain $1$ unit of utility ($5$ instead of $4$). This provokes the punishment, so from the next period onward Player 2 will punish him forever (by using his minimax strategy and lowering the stage payoff of Player 1 as much as possible, to something less than $4$). If $\delta$ is close to $0$, Player 1 is impatient: the gain of $1$ today is worth much more than any future loss due to the punishment. At some critical $\delta$ the two are equal, and above it the punishment reduces the payoff by more than the possible gain from the deviation, so Player 1 continues to play $M$. Find this critical $\delta$.

Repeat this for Player 2; the larger of the two critical $\delta$s is the required one.
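The procedure above can be sketched in Python. With a per-period cooperation payoff $c$, a one-shot deviation payoff $d$, and a per-period punishment payoff $p$ (where $d > c > p$), the no-deviation condition $c/(1-\delta) \geq d + \delta p/(1-\delta)$ rearranges to $\delta \geq (d-c)/(d-p)$. The helper `critical_delta` is a hypothetical name. The punishment payoff of $0$ is an assumption taken from the question's attempt (the payoff matrix is not reproduced here), as is treating Player 2's payoffs as symmetric to Player 1's.

```python
from fractions import Fraction


def critical_delta(coop, deviation, punishment):
    """Smallest delta satisfying coop/(1-delta) >= deviation + delta*punishment/(1-delta).

    Rearranging gives delta >= (deviation - coop) / (deviation - punishment).
    """
    if not (deviation > coop > punishment):
        raise ValueError("expected deviation > coop > punishment")
    return Fraction(deviation - coop, deviation - punishment)


# Player 1: 4 when cooperating, 5 when deviating to B (both from the text);
# punishment payoff 0 is an assumption taken from the question's attempt.
player1 = critical_delta(coop=4, deviation=5, punishment=0)

# Player 2: assumed symmetric to Player 1 (an assumption, not stated in the text).
player2 = critical_delta(coop=4, deviation=5, punishment=0)

required = max(player1, player2)
print(required)
```

Using `Fraction` keeps the threshold exact rather than a rounded float. If the actual punishment payoffs differ from the assumed ones, only the arguments need to change.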