I was working through the example presented in this video, https://youtu.be/EZlzB0ujoP8?t=141, discussing a two-stage game consisting of a prisoner's dilemma in the first stage and a free money game in the second stage.
Stage 1 game:

               P2
            c       d
        -----------------
      c |  3,3  |  1,4  |
    P1  -----------------
      d |  4,1  |  2,2  |
        -----------------

Stage 2 game:

               P2
            r       p
        -----------------
      r |  3,3  |  0,0  |
    P1  -----------------
      p |  0,0  |  0,0  |
        -----------------

Note that the unique NE in the stage 1 game is (defect, defect), resulting in a payoff of (2,2). The stage 2 game has two pure-strategy NE: (reward, reward), resulting in a payoff of (3,3), and (punish, punish), resulting in a payoff of (0,0). The author of the video illustrates that one subgame perfect equilibrium of the two-stage game is playing the NE (d,d) in the first stage and then playing (r,r) in the second stage, resulting in a total payoff of (2,2)+(3,3)=(5,5).
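As a sanity check on the equilibria listed above, here is a minimal Python sketch that enumerates the pure-strategy NE of each stage game by brute force. The payoff numbers are copied from the tables in this question; the names `STAGE1`, `STAGE2` and `pure_nash_equilibria` are made up for illustration and do not come from the video.

```python
from itertools import product

# Payoff tables from the question: (P1 action, P2 action) -> (P1 payoff, P2 payoff).
STAGE1 = {
    ("c", "c"): (3, 3), ("c", "d"): (1, 4),
    ("d", "c"): (4, 1), ("d", "d"): (2, 2),
}
STAGE2 = {
    ("r", "r"): (3, 3), ("r", "p"): (0, 0),
    ("p", "r"): (0, 0), ("p", "p"): (0, 0),
}

def pure_nash_equilibria(game):
    """Return every pure profile where no player gains by deviating alone."""
    rows = sorted({a for a, _ in game})  # P1's actions
    cols = sorted({b for _, b in game})  # P2's actions
    equilibria = []
    for a, b in product(rows, cols):
        u1, u2 = game[(a, b)]
        # P1 holds b fixed and tries every alternative row action.
        p1_ok = all(game[(a2, b)][0] <= u1 for a2 in rows)
        # P2 holds a fixed and tries every alternative column action.
        p2_ok = all(game[(a, b2)][1] <= u2 for b2 in cols)
        if p1_ok and p2_ok:
            equilibria.append((a, b))
    return equilibria

print(pure_nash_equilibria(STAGE1))  # [('d', 'd')]
print(pure_nash_equilibria(STAGE2))  # [('p', 'p'), ('r', 'r')]
```

Note that (p,p) passes the check only weakly: a unilateral switch from p to r still yields 0, so neither player strictly loses by deviating, but neither gains either.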
The author then states that it is possible to do strictly better by using the punishment NE in the second stage to enforce cooperation in the first stage. In the proposed subgame perfect equilibrium, the players reward each other (that is, play (r,r)) only if both cooperated in the first stage; otherwise they play (p,p). On the equilibrium path this results in a payoff of (3,3)+(3,3)=(6,6).
My issue is: how is this a credible threat? Wouldn't the players know that, even if a player defected in the first stage, they are both better off choosing (r,r) instead of (p,p)?
Your objection reads almost like an objection to (Pareto) dominated equilibria.
In the one-shot prisoner's dilemma (imagine players play only the stage $1$ game once), players are better off choosing $(C,C)$ than the NE $(D,D)$. Similarly, in the stage $2$ game, players are better off choosing $(R,R)$ than $(P,P)$, but $(P,P)$ is still a NE.
Recall that we've fixed the strategies of both players to be "play $C$ in stage $1$, then play $R$ in stage $2$ if the moves in stage $1$ were $(C,C)$, and play $P$ otherwise".
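Playing $P$ in stage $2$ when the stage $1$ outcome was not $(C,C)$ is credible because it is a best response to the strategy described above. In particular, a player is willing to play $P$ because they expect their opponent to also play $P$, and against $P$ both $R$ and $P$ yield $0$. Subgame perfection only requires that the prescribed stage $2$ play be a NE of the stage $2$ game after every stage $1$ outcome, and $(P,P)$ is one.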
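The argument can be checked numerically. The following is a minimal Python sketch, assuming the payoffs from the question's tables (with lowercase action labels) and no discounting between stages; the helper names (`continuation`, `is_stage2_nash`, `total_payoff_p1`, `always_reward`) are hypothetical and chosen only for this illustration. Because the game is symmetric, it checks deviations for player 1 only.

```python
# Payoff tables from the question: (P1 action, P2 action) -> (P1 payoff, P2 payoff).
STAGE1 = {
    ("c", "c"): (3, 3), ("c", "d"): (1, 4),
    ("d", "c"): (4, 1), ("d", "d"): (2, 2),
}
STAGE2 = {
    ("r", "r"): (3, 3), ("r", "p"): (0, 0),
    ("p", "r"): (0, 0), ("p", "p"): (0, 0),
}

def continuation(stage1_outcome):
    """Stage 2 play prescribed by the fixed strategies: reward only after (c, c)."""
    return ("r", "r") if stage1_outcome == ("c", "c") else ("p", "p")

def is_stage2_nash(profile):
    """True if neither player can gain by deviating alone in the stage 2 game."""
    a, b = profile
    u1, u2 = STAGE2[profile]
    return (all(STAGE2[(a2, b)][0] <= u1 for a2 in "rp")
            and all(STAGE2[(a, b2)][1] <= u2 for b2 in "rp"))

def total_payoff_p1(stage1_outcome, plan=continuation):
    """Player 1's undiscounted two-stage payoff under a given stage 2 plan."""
    return STAGE1[stage1_outcome][0] + STAGE2[plan(stage1_outcome)][0]

# 1) After every stage 1 outcome, the prescribed stage 2 play is a NE.
subgames_ok = all(is_stage2_nash(continuation(o)) for o in STAGE1)

# 2) Given that continuation, defecting in stage 1 does not pay.
on_path = total_payoff_p1(("c", "c"))    # 3 + 3
deviation = total_payoff_p1(("d", "c"))  # 4 + 0

# For contrast: if (r, r) were played no matter what, defecting would pay.
def always_reward(stage1_outcome):
    return ("r", "r")

deviation_if_forgiven = total_payoff_p1(("d", "c"), always_reward)  # 4 + 3

print(subgames_ok, on_path, deviation, deviation_if_forgiven)  # True 6 4 7
```

The last number shows why the punishment matters: under an "always reward" continuation, defecting earns 7 instead of 6, so cooperation in stage $1$ could not be part of an equilibrium. That is a different strategy profile from the one fixed above, not a refutation of it.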
They would be better off if they both chose $R$ instead, but this is a non-cooperative game: players choose their strategies individually, taking the other player's strategy as given. (Hence the emergence of dominated equilibria that I mention in the second paragraph above.)