I am looking at the dynamic programming principle in optimal control problems, as presented in the book *Non-Cooperative Stochastic Differential Game Theory of Generalized Markov Jump Linear Systems* by Cheng-Ke Zhang et al. This is the statement of the problem and the approach.
The dynamics are given by
$\dot{x} = f(t, x, u), \ x(0) = x_0$
We are trying to minimize a performance index,
$\underset{u}{\min} \left[ \int_0^T g(s, x(s), u(s)) \, ds + q(x(T)) \right]$
where $u$ are admissible controls.
The dynamic programming principle (Chapter 2, Section 2.1.1, Dynamic Programming) is stated as follows (pg. 18).
"A set of controls $u^*(t) = \phi^*(t,x)$ constitutes an optimal control solution to the control problem stated above if there exists a continuously differentiable function $V(t,x) : [0,T] \times \mathbb{R}^n \to \mathbb{R}$ satisfying the Bellman equation,
$-V_t(t,x) = \underset{u}{\min} \left[ g(t, x, u) + V_x(t,x) f(t, x, u) \right] = g(t, x, \phi^*(t,x)) + V_x(t,x) f(t, x, \phi^*(t,x)),$ $V(T,x) = q(x)$"
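To make the Bellman equation concrete, here is a minimal numerical sanity check on a scalar linear-quadratic problem (my own illustrative choice, not from the book): dynamics $\dot{x} = u$, running cost $g = x^2 + u^2$, terminal cost $q = 0$, for which $V(t,x) = \tanh(T-t)\,x^2$ solves the Bellman equation.

```python
import numpy as np

# Illustrative scalar LQ problem (my own choice, not from the book):
# dynamics x' = u, running cost g(t, x, u) = x^2 + u^2, terminal cost q = 0.
# Candidate value function: V(t, x) = P(t) x^2 with P(t) = tanh(T - t),
# which solves the Riccati equation -P' = 1 - P^2, P(T) = 0.
T = 1.0

def P(t):
    return np.tanh(T - t)

def bellman_residual(t, x):
    # The minimizer of g + V_x * f over u is u* = -V_x / 2 = -P(t) x.
    u_star = -P(t) * x
    V_t = -x**2 / np.cosh(T - t)**2                      # V_t = P'(t) x^2, P' = -sech^2
    min_term = x**2 + u_star**2 + 2 * P(t) * x * u_star  # g + V_x * f evaluated at u*
    return V_t + min_term                                # Bellman: -V_t = min_u [...] => residual 0

for t, x in [(0.0, 1.0), (0.3, -2.0), (0.9, 0.5)]:
    print(f"t={t:.1f}, x={x:+.1f}, residual={bellman_residual(t, x):.2e}")
```

The residual is zero (up to floating-point error) at every $(t,x)$, which is exactly the statement that $-V_t$ equals the minimized Hamiltonian.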
The proof goes as follows. Define
$V(t_0, x_0) = \underset{u}{\min} \left[ \int_0^T g(s, x(s), u(s)) \, ds + q(x(T)) \right],$ satisfying the boundary condition $V(T, x^*(T)) = q(x^*(T))$, with $\dot{x}^*(s) = f(s, x^*(s), \phi^*(s, x^*(s))), \ x^*(0) = x_0$. Consider another set of strategies $u(s) \in \mathcal{U}_m$ with corresponding trajectories $x(s)$; then from the Bellman condition we have,
$g(t, x, u) + V_x(t, x) f(t, x, u) + V_t(t, x) \ge g(t, x^*, u^*) + V_x(t, x^*) f(t, x^*, u^*) + V_t(t, x^*)$
Now the book claims that integrating produces the following result: $\int_0^T g(s, x(s), u(s)) \, ds + V(T, x(T)) - V(0, x_0) \ge \int_0^T g(s, x^*(s), u^*(s)) \, ds + V(T, x^*(T)) - V(0, x_0)$
The question is: how did we get to this result? Integrating the $V_x$ term, we have $\int_0^T V_x(s, x) f(s, x, u) \, ds = \int_0^T V_x(s, x) \frac{dx}{ds} \, ds = \int_0^T V_x(s, x) \, dx = V(T, x(T)) - V(0, x_0)$
We get a similar term from the integral $\int_0^T V_s(s, x) \, ds = V(T, x(T)) - V(0, x_0)$. What am I missing? When you put it all together, we end up with,
$\int_0^T g(s, x(s), u(s)) \, ds + 2V(T, x(T)) - 2V(0, x_0) \ge \int_0^T g(s, x^*(s), u^*(s)) \, ds + 2V(T, x^*(T)) - 2V(0, x_0)$
Thank you in advance for all the responses.
I believe I figured out the answer. It is pretty straightforward. Here it goes.
$g(t, x, u) + V_x(t,x) f(t,x,u) + V_t(t,x) = g(t,x,u) + V_x(t,x) \dot{x} + V_t(t,x) = g(t,x,u) + \frac{d}{dt} V(t, x(t))$
where
$\frac{d}{dt} V(t, x(t)) = V_x \dot{x} + V_t$
is the total derivative of $V$ along the trajectory. The two integrals $\int_0^T V_s \, ds$ and $\int_0^T V_x \dot{x} \, ds$ are the two pieces of this single total derivative (neither piece alone equals $V(T, x(T)) - V(0, x_0)$, since each ignores part of the $s$-dependence), so together they contribute $V(T, x(T)) - V(0, x_0)$ exactly once, not twice.
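A quick numerical sanity check of this resolution (a sketch with illustrative choices of $V$ and the dynamics, not from the book): integrating $V_s + V_x \dot{x}$ along a trajectory recovers $V(T, x(T)) - V(0, x_0)$ once.

```python
import numpy as np

# Numerical check: V_s and V_x * xdot are the two pieces of ONE total
# derivative d/ds V(s, x(s)), so integrating their sum gives
# V(T, x(T)) - V(0, x_0) exactly once, not twice.
# Illustrative choices (not from the book): V(t, x) = t x^2, xdot = u, u = 0.5.
u, x0, T, N = 0.5, 1.0, 2.0, 200_001
s = np.linspace(0.0, T, N)
x = x0 + u * s                        # trajectory of xdot = u, x(0) = x0
V_s = x**2                            # partial derivative V_t at (s, x(s))
V_x_xdot = 2 * s * x * u              # partial derivative V_x times xdot
f = V_s + V_x_xdot                    # total derivative along the trajectory
ds = s[1] - s[0]
total = np.sum(f[:-1] + f[1:]) * ds / 2   # trapezoid rule for the integral
exact = T * x[-1]**2 - 0.0 * x0**2        # V(T, x(T)) - V(0, x_0), V(0, .) = 0
print(total, exact)
```

The two values agree, confirming that the partial-derivative integrals sum to a single copy of $V(T, x(T)) - V(0, x_0)$.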