The two strategies playing the game are Tit-for-Tat (TFT) and Psycho (Psy). I am asked to show that if TFT and Psy play the iterated prisoner's dilemma (IPD) for a random number of rounds n, then the expected difference in payoffs is positive for Psy and negative for TFT. I have included strategy descriptions below.
In the previous part of the question, I showed by induction that TFT can do no better than tie with Psy. I am assuming that the expected difference in payoffs is just the difference in expected payoffs over n rounds, but I am not quite sure how to calculate it. I know $$ E(X) = \sum_{x} x \, P(X = x) $$ but how do you extend it to account for the varying payoffs of two strategies in a game? Also, would the probabilities be 1/4, or would they be strategy-dependent (i.e., because we know, for example, that Psy will do the opposite of TFT, would the probability be 1/2 instead of 1/4)?
I've looked online and in the textbook, but I cannot seem to find a formula.
TFT strategy:
- Nice: always cooperates on first round
- Provocable: always defects if opponent defected in the previous round
- Forgiving: always cooperates if opponent cooperates again
- Simple: other strategies can adapt to it
Psy strategy:
- Always defects on first round
- Does opposite of what opponent did last round
The strategies you list are completely deterministic, so the only randomness is in the number of rounds n. In the first round TFT cooperates and PSY defects; in the second round TFT defects and PSY defects; in the third round TFT defects and PSY cooperates; in the fourth round TFT cooperates and PSY cooperates; in the fifth round TFT cooperates and PSY defects, and from there the cycle repeats.
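If you want to check the cycle by machine, here is a minimal sketch. The payoff values (5 for defecting against a cooperator, 3 for mutual cooperation, 1 for mutual defection, 0 for cooperating against a defector) are an assumption on my part, since the question does not give any; the function names `tft`, `psy`, and `play` are just illustrative.

```python
# Assumed payoffs (not given in the question): T=5, R=3, P=1, S=0.
# Key is (my move, opponent's move); value is my payoff.
PAYOFF = {
    ("C", "C"): 3,
    ("C", "D"): 0,
    ("D", "C"): 5,
    ("D", "D"): 1,
}


def tft(opp_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not opp_history else opp_history[-1]


def psy(opp_history):
    """Defect first, then do the opposite of the opponent's previous move."""
    if not opp_history:
        return "D"
    return "C" if opp_history[-1] == "D" else "D"


def play(n):
    """Play n rounds; return the (TFT, PSY) moves and PSY's running lead."""
    tft_moves, psy_moves, leads = [], [], []
    tft_score = psy_score = 0
    for _ in range(n):
        # Both moves are chosen from the history *before* this round.
        a = tft(psy_moves)
        b = psy(tft_moves)
        tft_moves.append(a)
        psy_moves.append(b)
        tft_score += PAYOFF[(a, b)]
        psy_score += PAYOFF[(b, a)]
        leads.append(psy_score - tft_score)
    return list(zip(tft_moves, psy_moves)), leads


rounds, leads = play(8)
print(rounds)  # (TFT move, PSY move) for each round
print(leads)   # PSY's cumulative payoff minus TFT's after each round
```

With these assumed payoffs, the running lead is never negative, which agrees with your induction result that TFT can do no better than tie.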
So you know by symmetry that the score evens out after each four-round cycle. You are equally likely to stop at any point in one of these four-round sequences. If you stop after the first round, PSY is ahead; after the second round PSY is still ahead; after the third round the score becomes even; and after the fourth round it is still even. So on average, when you stop, PSY is ahead, and TFT is behind by the same amount.
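To put a number on this, here is a sketch using the conventional payoff symbols, which are an assumption since the question does not name them: T is the payoff for defecting against a cooperator and S is the payoff for cooperating against a defector, with T > S. Let $D_n$ be PSY's total payoff minus TFT's total payoff after n rounds. The rounds where both players make the same move contribute nothing to the difference, so

$$ D_n = \begin{cases} T - S, & n \equiv 1, 2 \pmod 4 \\ 0, & n \equiv 3, 0 \pmod 4 \end{cases} $$

If n is equally likely to land on each of the four positions in the cycle, then

$$ E(D_n) = \tfrac{1}{4}(T - S) + \tfrac{1}{4}(T - S) + \tfrac{1}{4} \cdot 0 + \tfrac{1}{4} \cdot 0 = \frac{T - S}{2} > 0 $$

and the expected difference from TFT's side is $-\frac{T - S}{2} < 0$. Note that the 1/4 here is the probability of stopping at a given position in the cycle, not the probability of a pair of moves; the moves themselves are certain.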