How do I show that this game played on a Markov chain has a unique Nash equilibrium?


There are $k$ stages and $n$ players in this game, and each stage is worth one unit of utility to the player who claims it. Each player $i$ finishes stages at a rate $\lambda_i$ (in a continuous-time Markov chain). Thus, writing $\lambda = \lambda_1 +\ldots+\lambda_n$, player $i$ is the first to finish the first stage with probability $\lambda_i/\lambda$. If utility were awarded immediately, player $i$ would get her unit of utility and everyone would move on to stage 2. Each player would then simply have an expected utility of $k\cdot \lambda_i/\lambda$.
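The race between exponential clocks described above can be checked numerically. The following is a minimal sketch, not part of the original question: the function names, the trial count, and the seed are all illustrative assumptions, and it only covers the baseline case where utility is claimed immediately.

```python
import random


def first_finisher_frequencies(rates, trials=20000, seed=0):
    """Estimate how often each player wins a race of independent exponential clocks.

    `rates` holds the lambda_i; `trials` and `seed` are arbitrary choices.
    """
    rng = random.Random(seed)
    wins = [0] * len(rates)
    for _ in range(trials):
        # Each player's completion time for one stage is exponential with her rate.
        times = [rng.expovariate(rate) for rate in rates]
        wins[times.index(min(times))] += 1
    return [w / trials for w in wins]


def expected_utility_if_always_claimed(rates, k):
    """Closed form k * lambda_i / lambda for the case where utility is claimed at once."""
    total = sum(rates)
    return [k * rate / total for rate in rates]
```

For rates such as `[1.0, 2.0, 3.0]`, the estimated frequencies should land close to $\lambda_i/\lambda$, up to sampling noise.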

Here is the complication. Player $i$ can choose not to claim her utility. If she does this, she risks another player finishing the first stage and claiming the utility instead of her, but the potential benefit is that she can finish one or more subsequent stages before the other players catch up.

I assume utility is always claimed when stage $k$ is finished. So the strategies available to each player are, for each stage $1,\ldots,k-1$, whether to claim utility upon finishing that stage (strategy $E$) or not (strategy $H$). Note that strategies may vary by stage, but they may not depend on how much utility other players have already claimed or on how many stages they have finished. Here is a picture of the game tree for $n = 2$ and $k = 2$ (where $N$ is Nature).

[Figure: game tree for $n = 2$ and $k = 2$]

In this example with $n = 2$ and $k = 2$ the payoff matrix looks like this.

\begin{equation*}
\begin{array}{rcc}
 & E & H \\
E & \left(2\frac{\lambda_1}{\lambda},\; 2\frac{\lambda_2}{\lambda}\right) & \left(2\frac{\lambda_1}{\lambda} + \frac{\lambda_1^2}{\lambda^2}\frac{\lambda_2}{\lambda},\; 2\frac{\lambda_2}{\lambda} - \frac{\lambda_1^2}{\lambda^2}\frac{\lambda_2}{\lambda}\right) \\
H & \left(2\frac{\lambda_1}{\lambda} - \frac{\lambda_1}{\lambda}\frac{\lambda_2^2}{\lambda^2},\; 2\frac{\lambda_2}{\lambda} + \frac{\lambda_1}{\lambda}\frac{\lambda_2^2}{\lambda^2}\right) & \left(2\frac{\lambda_1^2}{\lambda^2} + 4\frac{\lambda_1^2}{\lambda^2}\frac{\lambda_2}{\lambda},\; 2\frac{\lambda_2^2}{\lambda^2} + 4\frac{\lambda_1}{\lambda}\frac{\lambda_2^2}{\lambda^2}\right)
\end{array}
\end{equation*}

In the example the unique Nash equilibrium is for both players to play strategy $E$. The intuition behind this result (the way I see it) is that due to the probabilistic independence between stage completions, the probability of "runs" where you finish multiple stages in a row is too small to make it worth the risk that someone scoops you for the utility you have "in hand".
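For concrete rates, the claim about the $n = 2$, $k = 2$ example can be checked by brute force. The sketch below transcribes the four cells of the payoff matrix using exact fractions and enumerates the pure-strategy profiles. The function names are illustrative, the rates are assumed to be positive integers so that `Fraction` stays exact, and it only tests pure strategies (it says nothing about mixed equilibria).

```python
from fractions import Fraction
from itertools import product


def payoff_matrix(lam1, lam2):
    """Payoffs (player 1, player 2) for n = 2, k = 2, keyed by (s1, s2).

    Assumes positive integer rates so that the fractions are exact.
    """
    lam = lam1 + lam2
    p, q = Fraction(lam1, lam), Fraction(lam2, lam)
    return {
        ("E", "E"): (2 * p, 2 * q),
        ("E", "H"): (2 * p + p * p * q, 2 * q - p * p * q),
        ("H", "E"): (2 * p - p * q * q, 2 * q + p * q * q),
        ("H", "H"): (2 * p * p + 4 * p * p * q, 2 * q * q + 4 * p * q * q),
    }


def pure_nash(payoffs):
    """Return the pure profiles from which no player gains by a unilateral switch."""
    flip = {"E": "H", "H": "E"}
    equilibria = []
    for s1, s2 in product("EH", repeat=2):
        u1, u2 = payoffs[(s1, s2)]
        stays1 = u1 >= payoffs[(flip[s1], s2)][0]  # player 1 cannot improve
        stays2 = u2 >= payoffs[(s1, flip[s2])][1]  # player 2 cannot improve
        if stays1 and stays2:
            equilibria.append((s1, s2))
    return equilibria
```

With `payoff_matrix(3, 1)`, every cell sums to 2 and `pure_nash` returns only `("E", "E")`. In that instance the faster player has no incentive to leave $(H, H)$, but the slower one does.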

I'm trying to show that this generalizes for all $n$ and $k$. It's fairly easy to show that all players playing strategy $E$ at all stages is a Nash equilibrium. What I want to show is that it's the unique Nash equilibrium, i.e., that any profile in which at least one player plays strategy $H$ at at least one stage is not a Nash equilibrium (I'm hopeful that the correct answer to that question rules out mixed-strategy equilibria as well).

What makes it tricky is that it's not just always better for any player to move to the "always $E$" strategy. For example, when everyone is playing "always $H$", in general only the slowest player (lowest value of $\lambda_i$) has an incentive to switch (I did manage to prove this in general).
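For the $n = 2$, $k = 2$ payoff matrix this can be made concrete. As a sketch, write $p = \lambda_1/\lambda$ and $q = \lambda_2/\lambda = 1 - p$ (shorthand introduced here for illustration). Player 1's gain from switching from $H$ to $E$ while player 2 keeps playing $H$ is

\begin{equation*}
\left(2p + p^2 q\right) - \left(2p^2 + 4p^2 q\right) = 2p(1-p) - 3p^2 q = pq\,(2 - 3p).
\end{equation*}

So player 1 gains from switching exactly when $p < 2/3$, and symmetrically player 2 gains exactly when $q < 2/3$. Since $p + q = 1$, the slower player (whose share is at most $1/2$) always gains, while a sufficiently fast player does not.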

So something like the following might work: given a profile, the slowest player that is not already playing "always $E$" has an incentive to switch. I've tried to show that the slowest player always has an incentive to switch to "always $E$", or that the slowest player always has an incentive to at least play $E$ on the first stage, or that the slowest player always has an incentive to at least play $E$ on the last stage ($k-1$). In each case I get bogged down in the details.

I'm looking for a new approach or a new way of looking at it. Even just a way of writing down the problem with fewer words and more mathematical notation might give new insight. I'm happy to say more about what I've tried, but this is already quite long, so I'll stop here. Thanks in advance for your help!

Best answer

So I finally figured this out. Note the following:

  • The game is constant-sum (zero-sum up to a constant): regardless of the strategies played by each player, at the end of the game the sum of all players' payoffs is $k$.
  • As noted in the question, if all players play strategy $E$ at all their information sets (the equilibrium), the expected payoff to player $i$ is $k\cdot\lambda_i/\lambda$.
  • In any strategy profile in which player $i$ plays $E$ at all her information sets, but at least one player does not, player $i$'s payoff is strictly greater than $k\cdot\lambda_i/\lambda$. This is because, by playing $E$ at every information set, she expects to get $k\cdot\lambda_i/\lambda$ during the first $k$ stage completions (regardless of whether other players are claiming their utility). But because, with positive probability, the game is not over after the first $k$ stage completions, she gets some extra utility with positive probability. This is the crucial step, and I'm happy to elaborate.
  • Now consider any strategy profile in which at least one player plays a non-equilibrium strategy with positive probability. By the previous bullet point, each player who plays the equilibrium strategy gets a strictly greater expected payoff than she would in equilibrium. Because the payoffs always sum to $k$, at least one of the deviating players gets no more than her equilibrium payoff, and strictly less unless every player deviates. By the previous two bullet points, switching to the equilibrium strategy guarantees her at least her equilibrium payoff, and strictly more if another player still deviates, so the switch strictly improves her expected payoff. So the strategy profile under consideration is not a Nash equilibrium.
  • Hence, the strategy profile in which every player plays the pure strategy of "$E$ at every information set" is the unique Nash equilibrium of this game.
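
The counting step in the last two bullets can be written out as follows. This is a sketch in notation introduced here for illustration: $u_i(s)$ is player $i$'s expected payoff under profile $s$, $e_i = k\lambda_i/\lambda$ is her payoff when everyone always plays $E$, $s_i^E$ is her "always $E$" strategy, and $D \neq \emptyset$ is the set of players who deviate from "always $E$" with positive probability. It takes the third bullet as given and assumes $n \ge 2$. Since the payoffs sum to $k = \sum_i e_i$,

\begin{equation*}
\sum_{j \in D} u_j(s) = k - \sum_{i \notin D} u_i(s) \le k - \sum_{i \notin D} e_i = \sum_{j \in D} e_j ,
\end{equation*}

with strict inequality whenever some player is outside $D$. Hence some $j \in D$ has $u_j(s) \le e_j$, and strictly so if $D$ does not contain every player. If this player switches to $s_j^E$, then

\begin{equation*}
u_j(s_j^E, s_{-j}) \ge e_j \ge u_j(s),
\end{equation*}

and at least one of the two inequalities is strict: the first if another player still deviates, the second if $j$ was the only deviator. Either way $j$ has a profitable deviation, so $s$ is not a Nash equilibrium.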