Let $A$ be a compact subset of $\mathbb{R}^m$, and let $f:\mathbb{R}^n\times A\to \mathbb{R}^n$ and $l:\mathbb{R}^n\times A\to \mathbb{R}$ be such that
$f$ is continuous, bounded, and there exists $L_f>0$ such that $$\left\lVert f(x_1,a)-f(x_2,a)\right\rVert \leq L_f\left\lVert x_1-x_2\right\rVert$$ for all $x_1,x_2 \in \mathbb{R}^n$ and $a\in A$.
$l$ is continuous, bounded, and there exists a modulus of continuity $w_l$ such that $$|l(x_1,a)-l(x_2,a)| \leq w_l(\left\lVert x_1-x_2\right\rVert)$$ for all $x_1,x_2 \in \mathbb{R}^n$ and $a\in A$.
Under the hypotheses on $f$, if we set $$\mathcal{A}:=\{\alpha:[0,+\infty)\to A\,:\,\,\,\alpha \,\text{is Lebesgue measurable}\},$$ then for every initial state $x\in \mathbb{R}^n$ and every $\alpha \in \mathcal{A}$ there exists a unique function $y_x(\cdot;\alpha):[0,+\infty)\to\mathbb{R}^n$ such that $$y_x(t;\alpha)=x+\int_{0}^{t}f(y_x(s;\alpha),\alpha(s))ds\qquad t\geq 0.$$
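As a side remark, for a merely measurable (e.g. bang-bang) control the trajectory can still be computed numerically. Here is a minimal sketch, assuming toy data of my own choosing that satisfy the hypotheses: $f(x,a)=a$, $A=[-1,1]$, and a control that switches from $+1$ to $-1$ at $s=1$.

```python
import numpy as np

def euler_trajectory(x, f, alpha, T, n=20000):
    # Forward-Euler approximation of y'(s) = f(y(s), alpha(s)), y(0) = x.
    ts = np.linspace(0.0, T, n + 1)
    h = T / n
    y = np.empty(n + 1)
    y[0] = x
    for k in range(n):
        y[k + 1] = y[k] + h * f(y[k], alpha(ts[k]))
    return ts, y

# Toy data (my own choice): f(x, a) = a, A = [-1, 1], and a bang-bang
# control that switches from +1 to -1 at s = 1.
f = lambda x, a: a
alpha = lambda s: 1.0 if s < 1.0 else -1.0

ts, y = euler_trajectory(0.0, f, alpha, T=2.0)
# Exact solution: y(t) = t for t <= 1, then y(t) = 2 - t, so y(1) = 1, y(2) = 0.
print(y[len(y) // 2], y[-1])
```

Since $f$ does not depend on $y$ here, the only discretization error comes from the switching time falling between grid points, so the approximation matches the exact trajectory up to $O(h)$.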
Now fix $\lambda>0$ and consider the Cost functional $$J(x,\alpha)=\int_{0}^{+\infty}l(y_x(s;\alpha),\alpha(s))e^{-\lambda s}ds\qquad x\in\mathbb{R}^n,\alpha \in \mathcal{A}.$$
Finally, the Value function is defined as $$V(x):=\inf_{\alpha \in \mathcal{A}}J(x,\alpha)\qquad x \in \mathbb{R}^n.$$
Under the hypotheses on $f$ and $l$, it is straightforward to show that $V$ is bounded and uniformly continuous on $\mathbb{R}^n$.
$V$ also satisfies the so-called Dynamic Programming Principle, i.e.
Theorem. For every initial state $x \in \mathbb{R}^n$ and any time $t > 0$ we have $$V(x)=\inf_{\alpha \in \mathcal{A}}\left\{\int_{0}^{t}l(y_x(s;\alpha),\alpha(s))e^{-\lambda s}ds+e^{-\lambda t}V(y_x(t;\alpha)) \right\}.$$
I have to prove the following:
Lemma. If $V\in C^1(\mathbb{R}^n)$, then $$\lambda V(x)+\sup_{a \in A}\left\{-DV(x)\cdot f(x,a)-l(x,a)\right\}=0\qquad x \in \mathbb{R}^n,$$ where $DV$ is the gradient of $V$ and $\cdot$ is the usual Euclidean scalar product.
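Before the proof, a quick numerical sanity check of the Lemma on a toy instance of my own (not part of the problem data): take $n=1$, $A=[-1,1]$, $f(x,a)=a$, $l(x,a)=|x|$ (unbounded, so only a local illustration). The optimal strategy steers to the origin at unit speed, giving the explicit value function $V(x)=\frac{|x|}{\lambda}-\frac{1-e^{-\lambda|x|}}{\lambda^2}$, which is $C^1$. The sketch below evaluates the HJB residual with the supremum taken over a grid of controls.

```python
import numpy as np

lam = 0.5  # discount rate (arbitrary choice)

# Explicit value function for the toy instance f(x,a)=a, A=[-1,1],
# l(x,a)=|x|: steer to the origin at unit speed, then stay there.
def V(x):
    return abs(x) / lam - (1 - np.exp(-lam * abs(x))) / lam**2

def DV(x):
    # Derivative of V (V is C^1, with DV(0) = 0).
    return np.sign(x) * (1 - np.exp(-lam * abs(x))) / lam

def hjb_residual(x, n_controls=2001):
    # lam*V(x) + sup_{a in [-1,1]} { -DV(x)*a - |x| }, sup over a grid of a.
    a = np.linspace(-1.0, 1.0, n_controls)
    return lam * V(x) + np.max(-DV(x) * a - abs(x))

for x in [-2.0, -0.3, 0.7, 1.5]:
    print(f"x = {x:+.2f}  residual = {hjb_residual(x):+.2e}")
```

The residual vanishes up to floating-point error, since $\sup_{|a|\leq 1}\{-DV(x)\,a\}=|DV(x)|$ and one checks directly that $\lambda V(x)+|DV(x)|-|x|=0$.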
That's what I understand so far:
Take any constant control $\alpha(\cdot) \equiv a \in A$. By the Dynamic Programming Principle we have $$V(x)\leq\int_{0}^{t}l(y_x(s;a),a)e^{-\lambda s}ds+e^{-\lambda t}V(y_x(t;a)), $$ for all $t>0$. Adding and subtracting $e^{-\lambda t}V(x)$, $$\left(1-e^{-\lambda t}\right)V(x)+e^{-\lambda t}\left(V(x)-V(y_x(t;a))\right)- \int_{0}^{t}l(y_x(s;a),a)e^{-\lambda s}ds\leq 0.\qquad [1]$$ Since $\alpha\equiv a$ is continuous, the corresponding trajectory $y_x(\cdot;a)$ is $C^1([0,+\infty))$ and $$y'_x(t;a)=f(y_x(t;a),a)\qquad t\geq 0 .$$ Then $$V(y_x(t;a))-V(x)=\int_{0}^{t}\frac{d}{ds}V(y_x(s;a))\,ds=\int_{0}^{t}DV(y_x(s;a))\cdot y'_x(s;a)\,ds=\int_{0}^{t}DV(y_x(s;a))\cdot f(y_x(s;a),a)\,ds.$$ Substituting into $[1]$ and dividing by $t>0$, we get $$\left(\frac{1-e^{-\lambda t}}{t}\right)V(x)+\frac{1}{t}\int_{0}^{t}\left(-e^{-\lambda t}DV(y_x(s;a))\cdot f(y_x(s;a),a)-l(y_x(s;a),a)e^{-\lambda s}\right)ds\leq 0.$$
Letting $t \to 0^+$, thanks to the regularity assumptions on $f$ and $l$ we obtain, for each $a\in A$, $$\lambda V(x) - DV(x) \cdot f(x, a) - l(x, a)\leq 0,$$ and therefore, taking the supremum over the constant controls $a\in A$, $$\lambda V(x) + \sup_{a\in A}\{-DV(x) \cdot f(x, a) - l(x, a)\} \leq 0.$$
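To see this inequality for constant controls in action, here is a hedged numerical sketch on a toy instance of my own ($f(x,a)=a$, $A=[-1,1]$, $l(x,a)=|x|$, with explicit $V(x)=\frac{|x|}{\lambda}-\frac{1-e^{-\lambda|x|}}{\lambda^2}$): every constant control makes the right-hand side of the DPP at least $V(x)$, with equality for the optimal choice $a=-\operatorname{sign}(x)$.

```python
import numpy as np

lam = 0.5  # discount rate (arbitrary choice)

# Explicit value function for f(x,a)=a, A=[-1,1], l(x,a)=|x|.
def V(x):
    return abs(x) / lam - (1 - np.exp(-lam * abs(x))) / lam**2

def dpp_rhs(x, a, t, n=20000):
    # int_0^t |y(s)| e^{-lam s} ds + e^{-lam t} V(y(t)), with y(s) = x + a*s
    # (trajectory of the constant control alpha = a).
    s = np.linspace(0.0, t, n)
    g = np.abs(x + a * s) * np.exp(-lam * s)
    integral = (s[1] - s[0]) * (g.sum() - 0.5 * (g[0] + g[-1]))  # trapezoid rule
    return integral + np.exp(-lam * t) * V(x + a * t)

x, t = 1.0, 0.8
for a in [-1.0, -0.5, 0.0, 1.0]:
    print(f"a = {a:+.1f}  rhs = {dpp_rhs(x, a, t):.6f}  V(x) = {V(x):.6f}")
```

Equality at $a=-1$ reflects the fact that, starting from $x=1$, steering toward the origin at full speed is optimal, so the infimum in the DPP is attained there.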
Now comes my trouble with the reverse inequality. My teacher says that a similar calculation based on the Dynamic Programming Principle yields the reverse inequality, completing the proof of the Lemma, but I really don't understand how to proceed.
Can anyone give me some help? It would be really appreciated!!!
Here is a hint for you.
Fix $t > 0$ and let $\epsilon > 0$. By the Dynamic Programming Principle there exists $\alpha \in \mathcal{A}$ (depending on $t$ and $\epsilon$) such that:
$$ V(x) + \epsilon t \geq \left\{\int_{0}^{t}l(y_x(s;\alpha),\alpha(s))e^{-\lambda s}ds+e^{-\lambda t}V(y_x(t;\alpha)) \right\}. $$
Equivalently,
$$ V(x) - e^{-\lambda t}V(y_x(t;\alpha))- \int_{0}^{t}l(y_x(s;\alpha),\alpha(s))e^{-\lambda s}ds \geq -\epsilon t. $$
Since $y_x(\cdot;\alpha)$ is Lipschitz and $V\in C^1$, the chain rule still holds almost everywhere, so: $$ V(y_x(t;\alpha))-V(x)=\int_{0}^{t}\frac{d}{ds}V(y_x(s;\alpha))\,ds=\int_{0}^{t}DV(y_x(s;\alpha))\cdot y'_x(s;\alpha)\,ds=\int_{0}^{t}DV(y_x(s;\alpha))\cdot f(y_x(s;\alpha),\alpha(s))\,ds.$$
Hence,
$$ (1 - e^{-\lambda t})V(y_x(t;\alpha))+ \int_{0}^{t}l(y_x(s;\alpha),\alpha(s))(1-e^{-\lambda s})\, ds - \int_{0}^{t}\left( l(y_x(s;\alpha),\alpha(s)) + DV(y_x(s;\alpha))\cdot f(y_x(s;\alpha),\alpha(s))\right)ds \geq -\epsilon t. $$
Divide by $t>0$ and look at the three terms separately. For the first, since $V$ is continuous and $f$ is bounded: $$ \frac{1}{t}(1 - e^{-\lambda t})V(y_x(t;\alpha)) = \lambda V(x) + o(1)\qquad \text{as } t\to 0^+.$$
Boundedness of $l$ gives, since $0\leq 1-e^{-\lambda s}\leq \lambda t$ for $s\in[0,t]$:
$$\frac{1}{t}\int_{0}^{t}l(y_x(s;\alpha),\alpha(s))(1-e^{-\lambda s})\, ds = O(t) = o(1).$$
For the last term: $$ -\frac{1}{t} \int_{0}^{t}\left( l(y_x(s;\alpha),\alpha(s)) + DV(y_x(s;\alpha))\cdot f(y_x(s;\alpha),\alpha(s))\right)ds = -\frac{1}{t}\int_{0}^{t}\left( l(x,\alpha(s)) + DV(x)\cdot f(x,\alpha(s))\right)ds + o(1) \leq \sup_{a\in A}\{-DV(x) \cdot f(x, a) - l(x, a)\} + o(1).$$
This requires a bit more work, but it follows from the Lipschitz condition on $f$, the modulus of continuity of $l$, and the continuity of $DV$, since $\lVert y_x(s;\alpha)-x\rVert \leq Ms$ for $s\in[0,t]$, where $M$ is a bound on $f$.
Finally,
$$ \lambda V(x) + \sup_{a\in A}\{-DV(x) \cdot f(x, a) - l(x, a)\} + o(1) \geq -\epsilon , $$
from which it is easy to conclude: let $t \to 0^+$ and then $\epsilon \to 0$ to obtain $\lambda V(x) + \sup_{a\in A}\{-DV(x)\cdot f(x,a) - l(x,a)\} \geq 0$, which together with your first inequality proves the Lemma.
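To make the $t\to 0^+$ limits concrete, here is a numerical sketch on a toy instance of my own ($f(x,a)=a$, $A=[-1,1]$, $l(x,a)=|x|$, with explicit $V(x)=\frac{|x|}{\lambda}-\frac{1-e^{-\lambda|x|}}{\lambda^2}$): for each constant control $a$, the DPP quotient approaches $\lambda V(x)-DV(x)\,a-l(x,a)$, whose supremum over $a$ is $0$, matching the Lemma.

```python
import numpy as np

lam = 0.5  # discount rate (arbitrary choice)

# Explicit value function for f(x,a)=a, A=[-1,1], l(x,a)=|x|, and its derivative.
def V(x):
    return abs(x) / lam - (1 - np.exp(-lam * abs(x))) / lam**2

def DV(x):
    return np.sign(x) * (1 - np.exp(-lam * abs(x))) / lam

def quotient(x, a, t, n=40000):
    # (1/t) [ V(x) - e^{-lam t} V(y(t)) - int_0^t |y(s)| e^{-lam s} ds ],
    # with y(s) = x + a*s (constant control).
    s = np.linspace(0.0, t, n)
    g = np.abs(x + a * s) * np.exp(-lam * s)
    integral = (s[1] - s[0]) * (g.sum() - 0.5 * (g[0] + g[-1]))  # trapezoid rule
    return (V(x) - np.exp(-lam * t) * V(x + a * t) - integral) / t

x = 1.0
for a in [-1.0, 0.0, 1.0]:
    expected = lam * V(x) - DV(x) * a - abs(x)  # predicted limit as t -> 0+
    print(f"a = {a:+.1f}  quotient = {quotient(x, a, 1e-3):+.6f}  limit = {expected:+.6f}")
```

At $a=-1$ (the optimal control from $x=1$) the quotient and its limit are both $0$, which is exactly where the supremum in the HJB equation is attained.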
Let me know if you need more details.