What's Being Returned Here?

I'm working my way through this paper, and I'm having a bit of trouble understanding what it's telling me to do. Here's the specific excerpt that's tripping me up:

A (finite) one-shot game is a tuple $\Gamma = \langle N, A, R\rangle$ in which $N$ is a finite set of $n$ players; $A = \Pi_{i\in N}A_i$, where $A_i$ is player $i$’s finite set of pure actions; and $R : A \to \mathbb{R}^n$, where $R_i(a)$ is player $i$’s reward at action profile $a \in A$.

Once again, imagine a referee who selects an action profile $a$ according to some policy $\pi \in \Delta(A)$. The referee advises player $i$ to follow action $a_i$. Define $A_{-i} = \Pi_{j\neq i}A_j$. Define $\pi(a_i) = \sum_{a_{-i}\in A_{-i}}\pi(a_{-i},a_i)$ and $\pi(a_{-i} \mid a_i) = \frac{\pi(a_{-i},a_i)}{\pi(a_i)}$ whenever $\pi(a_i) > 0$.

For all $i \in N$ and for all $a_i,a_i^\prime \in A_i,$

$$\sum_{a_{-i}\in A_{-i}}\pi(a_{-i},a_i)R_i(a_{-i},a_i) \geq \sum_{a_{-i}\in A_{-i}}\pi(a_{-i},a_i)R_i(a_{-i},a_i^\prime)$$

What is being returned by $\pi(a_{-i},a_i)$? I can't tell whether it's supposed to be a single value or a series of values. Maybe that's because I'm not sure I understand what $(a_{-i},a_i)$ means. Is it just a way of singling out agent $i$? And if it is, why is that done?

EDIT: I figured out that $\pi$ is supposed to be probability distributions for each action, so now I understand at least generally what is supposed to be returned, but I'm still struggling with the $(a_{-i},a_i)$ part.

There is 1 best solution below


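$\pi$ is a probability distribution over action profiles, so strictly speaking it takes in an event: an element of the power set of $A$, i.e. a set of action profiles. A slightly less abusive notation would be $\pi(\{a_i\})$ and $\pi(\{a_i, a_{-i}\})$. Here $a_{-i} \in A_{-i}$ is the profile of actions taken by all of $i$'s opponents, so $(a_{-i}, a_i)$ is just the full action profile with player $i$'s own action singled out. It is written that way so that $i$'s action can be varied while the opponents' actions are held fixed, as in $R_i(a_{-i}, a_i^\prime)$ on the right-hand side of the inequality.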

My intuition on how that power set works is that $\pi(\{a_i\})$ is the probability of $a_i$ being taken, regardless of the opponents' actions, and $\pi(\{a_i, a_{-i}\})$ is the probability of $a_i$ being taken and $a_{-i}$ being taken as well, so $\pi(\{a_i\}) \geq \pi(\{a_i,a_{-i}\})$. In particular, for one fixed profile $(a_{-i}, a_i)$, $\pi(a_{-i}, a_i)$ is a single number, not a series of values.
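
To make this concrete, here is a minimal Python sketch of the definitions quoted in the question. Everything specific in it is an assumption for illustration and not taken from the paper: the two-player game of Chicken, its payoff numbers, the example distribution `pi`, and the helper names `join`, `opponent_profiles`, `marginal`, and `satisfies_inequality`. Players are indexed from 0 in the code.

```python
from itertools import product

# Hypothetical 2-player game of Chicken: "D" = dare, "C" = chicken out.
ACTIONS = [["D", "C"], ["D", "C"]]  # A_0 and A_1

# R[a] = (R_0(a), R_1(a)) for every action profile a = (a_0, a_1).
R = {
    ("D", "D"): (0, 0),
    ("D", "C"): (7, 2),
    ("C", "D"): (2, 7),
    ("C", "C"): (6, 6),
}

# pi is ONE joint distribution over whole action profiles,
# so pi[a] is a single probability for each profile a.
pi = {
    ("D", "D"): 0.0,
    ("D", "C"): 1 / 3,
    ("C", "D"): 1 / 3,
    ("C", "C"): 1 / 3,
}


def join(i, a_i, a_minus_i):
    """Rebuild the full profile (a_{-i}, a_i) by putting a_i back at position i."""
    return a_minus_i[:i] + (a_i,) + a_minus_i[i:]


def opponent_profiles(i):
    """Enumerate A_{-i}: every combination of the other players' actions."""
    others = ACTIONS[:i] + ACTIONS[i + 1:]
    return list(product(*others))


def marginal(pi, i, a_i):
    """pi(a_i): sum of pi(a_{-i}, a_i) over all a_{-i} in A_{-i}."""
    return sum(pi[join(i, a_i, m)] for m in opponent_profiles(i))


def satisfies_inequality(pi, tol=1e-12):
    """Check the displayed inequality for all i and all a_i, a_i'."""
    for i in range(len(ACTIONS)):
        for a_i in ACTIONS[i]:
            for alt in ACTIONS[i]:
                profiles = opponent_profiles(i)
                # Left side: player i follows the recommended action a_i.
                follow = sum(pi[join(i, a_i, m)] * R[join(i, a_i, m)][i]
                             for m in profiles)
                # Right side: same weights, but player i plays alt instead.
                deviate = sum(pi[join(i, a_i, m)] * R[join(i, alt, m)][i]
                              for m in profiles)
                if follow < deviate - tol:
                    return False
    return True


print(pi[("C", "C")])                    # one number: pi(a_{-i}, a_i)
print(marginal(pi, 0, "C"))              # pi(a_0 = "C"), summed over opponents
print(satisfies_inequality(pi))
```

Note that the weights on both sides of the inequality are the same $\pi(a_{-i}, a_i)$; only the reward changes from $R_i(a_{-i}, a_i)$ to $R_i(a_{-i}, a_i^\prime)$. That is exactly why the notation separates $a_i$ from $a_{-i}$.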