How does best response imply indifference?


I recently read Leyton-Brown and Shoham's Essentials of Game Theory, and I keep coming back to this passage (Sec. 2.2):

When the support of a best response $s^*$ includes two or more actions, the agent must be indifferent among them; otherwise, the agent would prefer to reduce the probability of playing at least one of the actions to zero. But thus any mixture of these actions must also be a best response, not only the particular mixture in $s^*$. Similarly, if there are two pure strategies that are individually best responses, any mixture of the two is necessarily also a best response.

This is followed by an example on the next page of using this observation as the basis of computing a mixed-strategy Nash equilibrium.

I get it, but I don't get it "in my gut", which is troubling. It seems highly counterintuitive that having a "best response" implies that we are "indifferent" in our response. I've tried to make this feel natural, but I haven't succeeded.

Does anyone have a better explanation/analogy/picture by which this can be made memorable and intuitive?


There are 3 best solutions below


Sure. Let's take a simple example. Suppose that, on the count of three, you and I each put a penny down on the table. If the faces match, I pay you one dollar, and if they don't, you pay me one dollar. Moreover, suppose you know I am randomizing 50-50 between heads and tails. Given that, your expected payoffs are: $$U_{\textrm{You}}(\textrm{Heads}) = \underbrace{1\cdot (0.5)}_{\textrm{Payoff if I end up heads}} -\underbrace{1\cdot (0.5)}_{\textrm{Payoff if I end up tails}} = 0$$

$$U_{\textrm{You}}(\textrm{Tails}) = \underbrace{1\cdot (0.5)}_{\textrm{Payoff if I end up tails}} -\underbrace{1\cdot (0.5)}_{\textrm{Payoff if I end up heads}} = 0$$

Moreover, any mixture for you also gets you zero expected profit: playing Heads with probability $q$ and Tails with probability $1-q$ yields $q\cdot 0+(1-q)\cdot 0=0$.
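The calculation above can be checked numerically. The following is a minimal Python sketch, not part of the original answer; the names `PAYOFF` and `expected_payoff` are made up for illustration, and exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction

# Payoffs to "you" in the penny game: +1 if the faces match, -1 otherwise.
# Keys are (your face, my face).
PAYOFF = {("H", "H"): 1, ("H", "T"): -1, ("T", "H"): -1, ("T", "T"): 1}


def expected_payoff(q_heads, p_heads=Fraction(1, 2)):
    """Your expected payoff when you play Heads with probability q_heads
    and I play Heads with probability p_heads (50-50 by default)."""
    you = {"H": q_heads, "T": 1 - q_heads}
    me = {"H": p_heads, "T": 1 - p_heads}
    return sum(you[a] * me[b] * PAYOFF[(a, b)] for a in you for b in me)


# Against a 50-50 opponent, every mixture of yours earns the same amount.
for q in (Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(9, 10), Fraction(1)):
    print(f"P(Heads) = {q}: expected payoff = {expected_payoff(q)}")
```

If I instead played Heads with probability 3/5, the indifference would break: `expected_payoff(Fraction(1), Fraction(3, 5))` is strictly larger than `expected_payoff(Fraction(0), Fraction(3, 5))`, so your only best response would be pure Heads.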

Edit: I should say that this critically depends on the expected-utility assumption of classical game theory (loosely, risk neutrality with respect to the utility values themselves). In particular, it is assumed that one has utility values for outcomes (a vector of actions, i.e. pure strategies), and that this utility function is extended linearly to the space of vectors of randomizations (this is known as a von Neumann-Morgenstern utility function). It is precisely this linearity in the probabilities that means that if two outcomes yield identical utilities, then any convex combination of them (i.e. randomization between them) yields the same utility. Should the agent's preferences over randomizations not be linear in the probabilities, say because the agent is averse to risk in the utility values themselves, this need not be true.


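Assume, to the contrary, that the agent is not indifferent between two pure strategies $S_1$ and $S_2$ in the support of their best response. Say $S_1\succ S_2$, which implies $u(S_1)>u(S_2)$, where $u(S_i)$ is the agent's expected utility from playing $S_i$ against the other players' given strategies. But then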

$$p_1u(S_1)+p_2u(S_2)<1u(S_1)+0u(S_2)$$

for any $0<p_1<1$ (it follows that $p_2=1-p_1\neq0$). Under any reasonable definition of best response, $p_2$ must therefore be $0$, so $S_2$ is not in the support of the agent's best response, a contradiction.
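As a quick numerical illustration of this inequality, here is a small Python sketch. The utility values 5 and 1 and the mixture 0.7/0.3 are arbitrary assumptions chosen for the example, and `expected_utility` is a hypothetical helper name.

```python
from fractions import Fraction


def expected_utility(p, u):
    """Expected utility of playing strategy i with probability p[i]."""
    return sum(p_i * u_i for p_i, u_i in zip(p, u))


# Assumed utilities with u(S_1) > u(S_2); any such pair would do.
u = [Fraction(5), Fraction(1)]

mixed = [Fraction(7, 10), Fraction(3, 10)]  # puts positive weight on S_2
pure = [Fraction(1), Fraction(0)]           # all weight on S_1

gap = expected_utility(pure, u) - expected_utility(mixed, u)

# The gain from dropping S_2 equals p_2 * (u(S_1) - u(S_2)), which is
# strictly positive whenever p_2 > 0 and u(S_1) > u(S_2).
print(gap, mixed[1] * (u[0] - u[1]))
```

The two printed numbers coincide, which is just the identity $u(S_1)-\big(p_1u(S_1)+p_2u(S_2)\big)=p_2\big(u(S_1)-u(S_2)\big)$ behind the strict inequality.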

We have established that the agent is indifferent among all pure strategies $S_i$, $i\in I$, in the support of their best response, i.e. $S_i\sim S_j$ for all $i,j\in I$, which implies $u(S_i)=u(S_j)$ for all $i,j\in I$. Now take any two probability distributions $p$ and $q$ over the $S_i$. It follows immediately that

$$\sum_{i\in I}p_iu(S_i)=\sum_{i\in I}q_iu(S_i)$$

since all the $u(S_i)$ are equal and $\sum_{i\in I}p_i=\sum_{i\in I}q_i=1$, so both sides reduce to the common utility value. So the agent is indifferent between any two probability distributions over the $S_i$.
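The equality can also be checked by brute force. The sketch below is only an illustration under assumed values: four strategies sharing a common utility of 3, and a hypothetical helper `random_distribution` that draws arbitrary exact probability vectors.

```python
import random
from fractions import Fraction


def expected_utility(p, u):
    """Expected utility of playing strategy i with probability p[i]."""
    return sum(p_i * u_i for p_i, u_i in zip(p, u))


def random_distribution(n, rng):
    """Draw an arbitrary probability vector of length n with exact fractions."""
    weights = [rng.randint(1, 100) for _ in range(n)]
    total = sum(weights)
    return [Fraction(w, total) for w in weights]


rng = random.Random(0)  # fixed seed so the run is reproducible
common = Fraction(3)    # assumed common utility of every S_i in the support
u = [common] * 4

# Collect the expected utility of many different mixtures over the support.
values = {expected_utility(random_distribution(4, rng), u) for _ in range(1000)}
print(values)  # a set with a single element: the common utility
```

Every mixture lands on the same number, because the common utility factors out of the sum and the probabilities add up to 1.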

More concretely: why would the agent want a (non-degenerate) lottery between \$5 and \$1 if they could just receive a guaranteed \$5? Now suppose the outcomes $O_i$, $i\in I$, are amounts in different currencies that the agent believes are worth exactly the same (by exchange rates or some other means). Then the agent wouldn't care how their money is split across the currencies, as long as they believe the total value is the same.


Here is an intuitive answer. Imagine a box of chocolate pralines: some are dark chocolate, some are white chocolate, some have a nougat filling, and so on.

Your "best response" is picking the flavour you like the most; say, the nougat-filled ones.

Because you are indifferent among the nougat-filled pralines, your best response is to pick any one of them. If randomised strategies are allowed, any probability distribution over the nougat-filled pralines is also a best response. Choosing randomly among your preferred pralines works as well as choosing one of them. (This may even be what happens in your brain when your hand picks one of the nougat-filled pralines.)