The LP formulation of a 2 player 0-sum game

217 Views Asked by At

I am studying Nash equilibria for 2 player 0-sum games from the book "Multiagent Systems" by Shoham and Leyton-Brown. In chapter 4 there is the following Linear Program (LP) formulation for the primal program (describing a minmax strategy of player 2 at equilibrium) :

enter image description here

By $\alpha_i^j$ they denote the action $j$ of player $i$ and by $s_i^j$ the probability that the action is played by the player.

I am trying to understand why it suffices to consider only pure strategies in (4.2) for player 1. (Similarly they consider only pure strategies for player 2 in the dual program)

Could you please assist what is the argument here ?

2

There are 2 best solutions below

0
On BEST ANSWER

Payoff functions are linear in every player's mixed strategies. The payoff player 1 gets from a mixed strategy is simply a weighted average of the payoff of their pure strategies. So the highest payoff player 1 could achieve is necessarily achieved at a pure strategy. If inequality (4.2) holds, it would also hold for all mixed strategies of player $1$. However, it is more convenient to work with pure strategies so that the inequalities are actually linear.

0
On

Player 1's minmax value is determined by player 2 playing the strategy that "punishes player 1 the most" by limiting player 1's best outcome as much as possible, in expectation. The main idea behind using an LP to compute player 2's minmax strategy against player 1 is the following relation: $$ v_1^*=\min_{s_2}\max_{s_1} \mathrm{E}[u_1(a_1,a_2))]=\min_{s_2}\max_{s_1} s_1^\top A s_2 = \min_{s_2}\max_{i=1}^m e_i^\top A s_2\tag{$*$} $$ where $A$ is an $m\times n$ matrix of payoffs for player 1, with rows being player 1 actions and columns being player 2 actions. The interpretation for $(*)$ is that player 2 must choose the "punish" strategy before player 1, and once player 2's strategy is known, player 1 can always play a pure strategy best response (though there may be mixed strategy that yields the same value, i.e., a mixed strategy for player 1 may also be a best response).

From $(*)$, the problem of finding $v_1^*$ can now be phrased as an LP for player 2 by using the epigraph trick to linearize player 1's inner maximization, i.e., $$ v_1^* \equiv \min_{s_2,\,v_1} \Big\{v_1: v\geq\sum_{j=1}^n A_{ij}s_2(j) ,\;i=1,2,\ldots,m,\;\; s_2\geq0, \;\; \mathbf{1}^\top s_2=1\Big\}. $$