Is there always a pure strategy to be the best response to a mixed strategy in a two-player zero-sum game?

385 Views Asked by At

The game is characterized by a matrix M. In this game, the player chooses the row $i$ of $M$, and the adversary chooses the column $j$ of $M$. They make their decisions simultaneously.

Suppose the player chooses a row from some distribution $p$ which is known to the adversary. Then the adversary wants to choose a distribution $q^\*$ so as to maximize its expected reward, denoted as $q^\*=argmax_{q}\mathbb{E}_{i\in p,j\in q}[M(i,j)]$.

In my view: the best response to a mixed strategy should also be a mixed strategy. But I read one book which states that any column $j$ would maximize this reward too. Namely, $max_{q}\mathbb{E}_{i\in p,j\in q}[M(i,j)] = max_{j}\mathbb{E}_{i\in p}[M(i,j)]$.

This is my question: Is there always a pure strategy to be the best response to a mixed strategy in a two-player zero-sum game? Or in any more general setting? If possible, please provide the references.

Thanks!

2

There are 2 best solutions below

4
On BEST ANSWER

Thanks to Mihaild's answer and these helpful comments! Let me try to answer this question.

Given the player's mixed strategy $p$, suppose the adversary's mixed strategy is $q$, then the expected reward he can get is $E_{i\in p,j\in q}(M(i,j))=\sum_{i,j}p_i q_j M(i,j)=\sum_j q_j(\sum_i p_i M(i,j))=\sum_j p_j E_{i \in p}(M(i,j))$. We denote $E_{i \in p}(M(i,j))$ as $M(p,j)$.

There is definitely a relative order on $j$ according to $M(p,j)$. For example, $M(p,j_0)<M(p,j_1)<\dots<M(p,j_m)$. Since the adversary aims to maximize the expected reward, he would better assign probability $q_{j_m}=1$ to $j_m$.

Besides, there always exists a set $J$ such that $\forall j^*\in J,\forall j,M(p,j^*)\geq M(p,j)$, and the size of set $|J|\geq 1$. And $\forall j^*_1,j^*_2 \in J, M(p,j^*_1)=M(p,j^*_2)$. Thus, the best response to $p$ can be arbitrary distribution over the set $J$, no matter a distribution (mix strategy) or a pure strategy. And at least one pure strategy exists.

0
On

If player has a fixed mixed strategy, and adversary has a best response to it, then any active (having non-zero probability) pure strategy should be the best response to it.

It follows from simple fact that expected reward from a mixed strategy is convex combination of expected rewards from active pure strategies with weights equal to probabilities. And if some of active pure strategies has lower expected reward than other, then, by dropping this pure strategy in favour of other with greater expected reward, adversary increases their reward.

In formulas, $E_{i \in p, j \in q} M(i, j) = \sum_j p_q(j) \cdot E_{i \in p} M(i, j)$ (where $p_q(j)$ is probability $q$ assigns to $j$), and, if we have $p_q(j_0) > 0$, $E_{i \in p} M(i, j_0) < E_{i \in p, j \in q} M(i, j)$, then for some $j_1$ we necessary have $E_{i \in p} M(i, j_1) > E_{i \in p, j \in q} M(i, j)$ and thus strategy $q'$ s.t. $$p_{q'}(j) = \begin{cases} 0,& j = j_0\\ p_q(j_0) + p_q(j_1),&j = j_1\\ p_q(j) \end{cases}$$ gives reward higher than $q$ by value $p_q(j_0) \cdot (E_{i \in p} M(i, j_1) - E_{i \in p} M(i, j_0))$.

So, as no active pure strategy gives reward lower than expected reward for mixed strategy, all active pure strategies give the same reward as the mixed (if it's best response), and so are best responses themselves.

Note, however, that if player playing $p$ and adversary playing $q$ is Nash equilibria, it's not necessary that player playing $p$ and adversary playing some pure strategy active in $q$ is Nash equilibria.