Equilibrium paths of Nash equilibria of finitely often repeated prisoner's dilemma

Question

Equilibrium paths of Nash equilibria of finitely often repeated prisoner's dilemma

161 Views Asked by Bumbble Comm At 14 May 2026 - 5:17

The specific payoff matrix is not so relevant, but consider a prisoner's dilemma with payoff matrix $$ \begin{matrix} & \text{C} & \text{D} \\ \text{C} & 0,0 & -2,1 \\ \text{D} & 1,-2 & -1,-1 \\ \end{matrix} $$ and the following claim:

Claim: In the finitely often repeated prisoner's dilemma (without discounting), only $(D,D)$ is played on the equilibrium path of any Nash Equilibrium (NE).

Note that this claim is concerned with any NE, not necessarily SPNE!

Attempted proof: By induction from the last period.

Consider any Nash equilibrium. Suppose there is a player, who on the equilibrium path of that equilibrium plays $C$ in the last period. Then that player would have a profitable deviation in playing $D$ instead, which is a contradiction. Therefore $(D,D)$ must be played on the equilibrium path.

Now consider a history $h$ on the equilibrium path such that for all longer histories $h'$ on the equilibrium path, both players prescribe $D$ as their action at $h'$. Suppose one player plays $C$ after $h$. Then, since only $(D,D)$ is played on the equilibrium path after $h$, that player has a profitable deviation in playing $D$ instead.

Question: Is this proof correct? I'm mostly unsure about the last sentence. How do I know that I stay on the equilibrium path while changing my action?

Thank you!

Original Q&A

There are 2 best solutions below

Bumbble Comm On 15 Sep 2017 - 9:42

Start in the last period, at time $T$. In the last period, $D$ is a dominant strategy regardless of the history of the game. So the subgame starting at $T$ has a dominant strategy equilibrium: $(D,D)$.

Then move to stage $T−1$. By backward induction, we know that at $T$, no matter what, the play will be $(D,D)$. Then given this, the subgame starting at $T−1$ (again regardless of history) also has a dominant strategy equilibrium.

With this argument, we have that there exists a unique subgame perfect Nash equilibrium $(D,D)$ at each date.

**Bumbble Comm** · Accepted Answer

I found an answer to my question in Proposition 155.1 of A Course in Game Theory by Osborne and Rubinstein. Their result is more general and says that

If the payoff profile in every Nash equilibrium of the strategic game $G$ is the profile $(v_i)$ of minmax payoffs in $G$ then for any value of $T$ the outcome $(a^1 ,\ldots,a^T)$ of every Nash equilibrium of the $T$-period repeated game of $G$ has the property that $a^t$ is a Nash equilibrium of $G$ for all $t = 1,\ldots,T$.

Here's the proof for my version of the question.

Suppose the game $G$ with payoff matrix given in my question is $T$ times repeated. Let $s$ be a NE of $G^T$ and let $(a^1,\ldots,a^T)$ be its outcome path.

Let $t\in\{1,\ldots,T\}$ be maximal such that $a^t \neq (D,D)$. Suppose that player $i$ plays $U$ after $(a^1,\ldots, a^{t-1})$. Consider a modification $\tilde s_i$ of $s_i$ in which $i$ plays $D$ after $(a^1,\ldots, a^{t-1})$ and after any longer history he plays something which guarantees him at least his min-max value for the period following that history (namely $-1$). Then $i$'s payoff from $\tilde s_i$ after $(a^1,\ldots, a^{t-1})$ is bounded from below by $$ 1 + (-1)(T-t), $$ which is greater than $(-1 + (-1)(T-t))$, the payoff from $s_i$ after $(a^1,\ldots, a^{t-1})$. Since the payoffs from $s_i$ and $\tilde s_i$ in all periods before $(a^1,\ldots, a^{t-1})$ are equal, this means that $\tilde s_i$ is a profitable deviation.

Equilibrium paths of Nash equilibria of finitely often repeated prisoner's dilemma

There are 2 best solutions below

Related Questions in PROOF-VERIFICATION

Related Questions in PROOF-WRITING

Related Questions in GAME-THEORY

Related Questions in ECONOMICS

Related Questions in NASH-EQUILIBRIUM

Trending Questions

Popular # Hahtags

Popular Questions