Showing the value functions for an N-player differential game solve a coupled system of parabolic PDE


I'm interested in deriving this system: $$\begin{cases}-\partial_t v^{N,i} - \sum_{j} \Delta_{x_j}v^{N,i} + \sum_{j \neq i} D_{x_j}v^{N,j} \cdot D_{x_j}v^{N,i} + \frac12 |D_{x_i}v^{N,i}|^2 = F^i(\textbf{x}) \\ v^{N,i}(\textbf{x}, T) = G^i(\textbf{x}).\end{cases}$$

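Let me set up the $N$-player game. For $j\neq i$, we assume that player $j$'s state evolves according to the dynamics $$\begin{cases} dX_s^j = \alpha^{j, \ast}_s \ dt + \sqrt{2} \ dB_s^j, & s \ge t\\ X^j_t = x_j, \end{cases}$$ while player $i$'s state follows the same dynamics with an arbitrary control $\alpha^i$ in place of $\alpha^{i, \ast}$.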

Here we have assumed that there exists a unique Nash equilibrium $(\alpha^{j, \ast})_{1\le j \le N}$.

Let $\mathcal{A}_{t,\tau}$ be the set of $L^2([t,\tau])$ controls mapping into $\mathbb{R}^d$. The value function for player $i$ is given by $$v^i(\textbf{x},t) = \inf_{\alpha^i \in\mathcal{A}_{t, T}} \mathbb{E} \left[\int_t^{T} \left(\frac12 |\alpha_s^i|^2 + F^i(\textbf{X}_s)\right) \ ds + G^i(\textbf{X}_T)\right].$$

Here $\textbf{x} = (x_1, \ldots, x_N) \in \mathbb{R}^{dN}$.

Recall the stochastic dynamic programming principle: $$v^i(\textbf{x},t) =\inf_{\alpha^i \in \mathcal{A}_{t, t+h}} \mathbb{E} \left[\int_t^{t+h} \left(\frac12 |\alpha_s^i|^2 + F^i(\textbf{X}_s)\right) \ ds + v^i(\textbf{X}_{t+h}, t+h)\right].$$

Itô's chain rule yields $$d v^i(\textbf{X}_t, t) = \partial_t v^i \ dt + \nabla_{\textbf{x}} v^i \cdot d \textbf{X}_t + \sum_{j=1}^N \Delta_{x_j}v^i \ dt$$ $$= \left(\partial_t v^i + \sum_{j \neq i} \nabla_{x_j}v^i \cdot \alpha^{j, \ast}_t +\nabla_{x_i}v^i \cdot \alpha_t^i+ \sum_{j=1}^N \Delta_{x_j}v^i \right) \ dt + \sqrt{2} \sum_{j=1}^N \nabla_{x_j}v^i \cdot dB_t^j.$$ This allows us to express $v^i(\textbf{X}_{t+h}, t+h)$ in terms of integrals from $t$ to $t+h$.

Plugging this into the stochastic dynamic programming principle, using that the stochastic integral against Brownian motion is a martingale (so it has zero expectation), and subtracting $v^i(\textbf{x}, t)$ from both sides, we have $$0 = \inf_{\alpha^i \in \mathcal{A}_{t, t+h}} \mathbb{E}\left[ \int_t^{t+h} \left(\frac12 |\alpha_s^i|^2 + F^i(\textbf{X}_s) + \partial_t v^i + \sum_{j \neq i} \nabla_{x_j}v^i \cdot \alpha^{j, \ast}_s +\nabla_{x_i}v^i \cdot \alpha_s^i+ \sum_{j=1}^N \Delta_{x_j}v^i\right) \ ds\right].$$

Dividing by $h$ and sending $h \to 0$, we derive $$0 = F^i(\textbf{x}) + \partial_t v^i(\textbf{x},t) + \sum_{j=1}^N \Delta_{x_j}v^i(\textbf{x}, t) + \sum_{j \neq i} \nabla_{x_j}v^i \cdot \alpha^{j,\ast} + \inf_{y \in \mathbb{R}^d} \left\{\frac12 |y|^2 + \nabla_{x_i}v^i(\textbf{x},t) \cdot y\right\}.$$ This infimum can be computed directly: it equals $$-\frac12 |\nabla_{x_i} v^i(\textbf{x},t)|^2.$$
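
As a check on the last step, the infimum can be worked out by completing the square. The symbol $p$ is shorthand introduced here (it is not notation from the derivation) for a fixed vector in $\mathbb{R}^d$: $$\frac12 |y|^2 + p \cdot y = \frac12 |y + p|^2 - \frac12 |p|^2 \ \ge\ -\frac12 |p|^2, \quad \text{with equality if and only if } y = -p.$$ The value $-\frac12 |p|^2$ is unchanged if $p$ is replaced by $-p$; only the minimizer changes sign. Taking $p = \nabla_{x_i}v^i(\textbf{x},t)$, which is how the control enters through the drift term $\nabla_{x_i}v^i \cdot \alpha^i_s$, the minimizer is $y = -\nabla_{x_i}v^i(\textbf{x},t)$. Under the additional assumption that the equilibrium controls have the same feedback form, $\alpha^{j,\ast} = -\nabla_{x_j}v^j$, the limiting equation rearranges to $$-\partial_t v^i - \sum_{j=1}^N \Delta_{x_j}v^i + \sum_{j \neq i} \nabla_{x_j}v^j \cdot \nabla_{x_j}v^i + \frac12 |\nabla_{x_i}v^i|^2 = F^i(\textbf{x}),$$ which can be compared term by term with the target system (with $v^i$ in place of $v^{N,i}$ and $\nabla$ in place of $D$).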

Here is my question: how can I make the limit step rigorous? I am not really sure where to begin. The problem seems to be that the set we are taking an infimum over here is changing with $h$. We can assume all the regularity on $v$ that we like.