Let $A$ be a compact subset of $\mathbb{R}^m$, and let $f:\mathbb{R}^n\times A\to \mathbb{R}^n$ and $l:\mathbb{R}^n\times A\to \mathbb{R}$ be such that
$f$ is continuous, bounded, and there exists $L_f>0$ such that $$\left\lVert f(x_1,a)-f(x_2,a)\right\rVert \leq L_f\left\lVert x_1-x_2\right\rVert$$ for all $x_1,x_2 \in \mathbb{R}^n$ and $a\in A$.
$l$ is continuous, bounded, and there exists a modulus of continuity $w_l$ such that $$|l(x_1,a)-l(x_2,a)| \leq w_l(\left\lVert x_1-x_2\right\rVert)$$ for all $x_1,x_2 \in \mathbb{R}^n$ and $a\in A$.
Under the hypotheses on $f$, if we set $$\mathcal{A}:=\{\alpha:[0,+\infty)\to A\,:\,\,\,\alpha \,\text{is Lebesgue measurable}\},$$ then for every initial state $x\in \mathbb{R}^n$ and every $\alpha \in \mathcal{A}$ there exists a unique function $y_x(\cdot;\alpha):[0,+\infty)\to\mathbb{R}^n$ such that $$y_x(t;\alpha)=x+\int_{0}^{t}f(y_x(s;\alpha),\alpha(s))ds\qquad t\geq 0.$$
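As a side remark, for a merely measurable (e.g. bang-bang) control the trajectory can still be computed numerically. Here is a minimal sketch, assuming toy data of my own choosing that satisfy the hypotheses: $f(x,a)=a$, $A=[-1,1]$, and a control that switches from $+1$ to $-1$ at $s=1$.

```python
import numpy as np

def euler_trajectory(x, f, alpha, T, n=20000):
    # Forward-Euler approximation of y'(s) = f(y(s), alpha(s)), y(0) = x.
    ts = np.linspace(0.0, T, n + 1)
    h = T / n
    y = np.empty(n + 1)
    y[0] = x
    for k in range(n):
        y[k + 1] = y[k] + h * f(y[k], alpha(ts[k]))
    return ts, y

# Toy data (my own choice): f(x, a) = a, A = [-1, 1], and a bang-bang
# control that switches from +1 to -1 at s = 1.
f = lambda x, a: a
alpha = lambda s: 1.0 if s < 1.0 else -1.0

ts, y = euler_trajectory(0.0, f, alpha, T=2.0)
# Exact solution: y(t) = t for t <= 1, then y(t) = 2 - t, so y(1) = 1, y(2) = 0.
print(y[len(y) // 2], y[-1])
```

Since $f$ does not depend on $y$ here, the only discretization error comes from the switching time falling between grid points, so the approximation matches the exact trajectory up to $O(h)$.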
Now fix $\lambda>0$ and consider the Cost functional $$J(x,\alpha)=\int_{0}^{+\infty}l(y_x(s;\alpha),\alpha(s))e^{-\lambda s}ds\qquad x\in\mathbb{R}^n,\alpha \in \mathcal{A}.$$
Finally, the Value function is defined as $$V(x):=\inf_{\alpha \in \mathcal{A}}J(x,\alpha)\qquad x \in \mathbb{R}^n.$$
Under the hypotheses on $f$ and $l$, it is straightforward to show that $V$ is bounded and uniformly continuous on $\mathbb{R}^n$.
$V$ also satisfies the so-called Dynamic Programming Principle, i.e.
Theorem. For every initial state $x \in \mathbb{R}^n$ and any time $t > 0$ we have $$V(x)=\inf_{\alpha \in \mathcal{A}}\left\{\int_{0}^{t}l(y_x(s;\alpha),\alpha(s))e^{-\lambda s}ds+e^{-\lambda t}V(y_x(t;\alpha)) \right\}.$$
I have to prove the following:
Lemma. If $V\in C^1(\mathbb{R}^n)$, then $$\lambda V(x)+\sup_{a \in A}\left\{-DV(x)\cdot f(x,a)-l(x,a)\right\}=0\qquad x \in \mathbb{R}^n,$$ where $DV$ is the gradient of $V$ and $\cdot$ is the usual Euclidean scalar product.
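Before the proof, a quick numerical sanity check of the Lemma on a toy instance of my own (not part of the problem data): take $n=1$, $A=[-1,1]$, $f(x,a)=a$, $l(x,a)=|x|$ (unbounded, so only a local illustration). The optimal strategy steers to the origin at unit speed, giving the explicit value function $V(x)=\frac{|x|}{\lambda}-\frac{1-e^{-\lambda|x|}}{\lambda^2}$, which is $C^1$. The sketch below evaluates the HJB residual with the supremum taken over a grid of controls.

```python
import numpy as np

lam = 0.5  # discount rate (arbitrary choice)

# Explicit value function for the toy instance f(x,a)=a, A=[-1,1],
# l(x,a)=|x|: steer to the origin at unit speed, then stay there.
def V(x):
    return abs(x) / lam - (1 - np.exp(-lam * abs(x))) / lam**2

def DV(x):
    # Derivative of V (V is C^1, with DV(0) = 0).
    return np.sign(x) * (1 - np.exp(-lam * abs(x))) / lam

def hjb_residual(x, n_controls=2001):
    # lam*V(x) + sup_{a in [-1,1]} { -DV(x)*a - |x| }, sup over a grid of a.
    a = np.linspace(-1.0, 1.0, n_controls)
    return lam * V(x) + np.max(-DV(x) * a - abs(x))

for x in [-2.0, -0.3, 0.7, 1.5]:
    print(f"x = {x:+.2f}  residual = {hjb_residual(x):+.2e}")
```

The residual vanishes up to floating-point error, since $\sup_{|a|\leq 1}\{-DV(x)\,a\}=|DV(x)|$ and one checks directly that $\lambda V(x)+|DV(x)|-|x|=0$.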
That's what I understand so far:
Take any constant control $\alpha(\cdot) \equiv a \in A$. By the Dynamic Programming Principle we have $$V(x)\leq\int_{0}^{t}l(y_x(s;a),a)e^{-\lambda s}ds+e^{-\lambda t}V(y_x(t;a)), $$ for all $t>0$. Adding and subtracting $e^{-\lambda t}V(x)$, $$\left(1-e^{-\lambda t}\right)V(x)+e^{-\lambda t}\left(V(x)-V(y_x(t;a))\right)- \int_{0}^{t}l(y_x(s;a),a)e^{-\lambda s}ds\leq 0.\qquad [1]$$ Since $\alpha\equiv a$ is continuous, the corresponding trajectory $y_x(\cdot;a)$ is $C^1([0,+\infty))$ and $$y'_x(t;a)=f(y_x(t;a),a)\qquad t\geq 0 .$$ Then $$V(y_x(t;a))-V(x)=\int_{0}^{t}\frac{d}{ds}V(y_x(s;a))\,ds=\int_{0}^{t}DV(y_x(s;a))\cdot y'_x(s;a)\,ds=\int_{0}^{t}DV(y_x(s;a))\cdot f(y_x(s;a),a)\,ds.$$ Substituting into $[1]$ and dividing by $t>0$, we get $$\left(\frac{1-e^{-\lambda t}}{t}\right)V(x)+\frac{1}{t}\int_{0}^{t}\left(-e^{-\lambda t}DV(y_x(s;a))\cdot f(y_x(s;a),a)-l(y_x(s;a),a)e^{-\lambda s}\right)ds\leq 0.$$
Letting $t \to 0^+$, thanks to the regularity assumptions on $f$ and $l$ we obtain, for each $a\in A$, $$\lambda V(x) - DV(x) \cdot f(x, a) - l(x, a)\leq 0,$$ and therefore, taking the supremum over the constant controls $a\in A$, $$\lambda V(x) + \sup_{a\in A}\{-DV(x) \cdot f(x, a) - l(x, a)\} \leq 0.$$
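To see this inequality for constant controls in action, here is a hedged numerical sketch on a toy instance of my own ($f(x,a)=a$, $A=[-1,1]$, $l(x,a)=|x|$, with explicit $V(x)=\frac{|x|}{\lambda}-\frac{1-e^{-\lambda|x|}}{\lambda^2}$): every constant control makes the right-hand side of the DPP at least $V(x)$, with equality for the optimal choice $a=-\operatorname{sign}(x)$.

```python
import numpy as np

lam = 0.5  # discount rate (arbitrary choice)

# Explicit value function for f(x,a)=a, A=[-1,1], l(x,a)=|x|.
def V(x):
    return abs(x) / lam - (1 - np.exp(-lam * abs(x))) / lam**2

def dpp_rhs(x, a, t, n=20000):
    # int_0^t |y(s)| e^{-lam s} ds + e^{-lam t} V(y(t)), with y(s) = x + a*s
    # (trajectory of the constant control alpha = a).
    s = np.linspace(0.0, t, n)
    g = np.abs(x + a * s) * np.exp(-lam * s)
    integral = (s[1] - s[0]) * (g.sum() - 0.5 * (g[0] + g[-1]))  # trapezoid rule
    return integral + np.exp(-lam * t) * V(x + a * t)

x, t = 1.0, 0.8
for a in [-1.0, -0.5, 0.0, 1.0]:
    print(f"a = {a:+.1f}  rhs = {dpp_rhs(x, a, t):.6f}  V(x) = {V(x):.6f}")
```

Equality at $a=-1$ reflects the fact that, starting from $x=1$, steering toward the origin at full speed is optimal, so the infimum in the DPP is attained there.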
Now comes my trouble with the reverse inequality. My teacher says that a similar calculation based on the Dynamic Programming Principle yields the reverse inequality, completing the proof of the Lemma, but I really don't understand how to proceed.
Can anyone give me some help? It would be really appreciated!!!
Here is a hint for you.
Fix $t > 0$ and let $\epsilon > 0$. By the Dynamic Programming Principle there exists $\alpha \in \mathcal{A}$ (depending on $t$ and $\epsilon$) such that:
$$ V(x) + \epsilon t \geq \left\{\int_{0}^{t}l(y_x(s;\alpha),\alpha(s))e^{-\lambda s}ds+e^{-\lambda t}V(y_x(t;\alpha)) \right\}. $$
Equivalently,
$$ V(x) - e^{-\lambda t}V(y_x(t;\alpha))- \int_{0}^{t}l(y_x(s;\alpha),\alpha(s))e^{-\lambda s}ds \geq -\epsilon t. $$
Since $y_x(\cdot;\alpha)$ is Lipschitz and $V\in C^1$, the chain rule still holds almost everywhere, so: $$ V(y_x(t;\alpha))-V(x)=\int_{0}^{t}\frac{d}{ds}V(y_x(s;\alpha))\,ds=\int_{0}^{t}DV(y_x(s;\alpha))\cdot y'_x(s;\alpha)\,ds=\int_{0}^{t}DV(y_x(s;\alpha))\cdot f(y_x(s;\alpha),\alpha(s))\,ds.$$
Hence,
$$ (1 - e^{-\lambda t})V(y_x(t;\alpha))+ \int_{0}^{t}l(y_x(s;\alpha),\alpha(s))(1-e^{-\lambda s})\, ds - \int_{0}^{t}\left( l(y_x(s;\alpha),\alpha(s)) + DV(y_x(s;\alpha))\cdot f(y_x(s;\alpha),\alpha(s))\right)ds \geq -\epsilon t. $$
Divide by $t>0$ and look at the three terms separately. For the first, since $V$ is continuous and $f$ is bounded: $$ \frac{1}{t}(1 - e^{-\lambda t})V(y_x(t;\alpha)) = \lambda V(x) + o(1)\qquad \text{as } t\to 0^+.$$
Boundedness of $l$ gives, since $0\leq 1-e^{-\lambda s}\leq \lambda t$ for $s\in[0,t]$:
$$\frac{1}{t}\int_{0}^{t}l(y_x(s;\alpha),\alpha(s))(1-e^{-\lambda s})\, ds = O(t) = o(1).$$
For the last term: $$ -\frac{1}{t} \int_{0}^{t}\left( l(y_x(s;\alpha),\alpha(s)) + DV(y_x(s;\alpha))\cdot f(y_x(s;\alpha),\alpha(s))\right)ds = -\frac{1}{t}\int_{0}^{t}\left( l(x,\alpha(s)) + DV(x)\cdot f(x,\alpha(s))\right)ds + o(1) \leq \sup_{a\in A}\{-DV(x) \cdot f(x, a) - l(x, a)\} + o(1).$$
This requires a bit more work, but it follows from the Lipschitz condition on $f$, the modulus of continuity of $l$, and the continuity of $DV$, since $\lVert y_x(s;\alpha)-x\rVert \leq Ms$ for $s\in[0,t]$, where $M$ is a bound on $f$.
Finally,
$$ \lambda V(x) + \sup_{a\in A}\{-DV(x) \cdot f(x, a) - l(x, a)\} + o(1) \geq -\epsilon , $$
from which it is easy to conclude: let $t \to 0^+$ and then $\epsilon \to 0$ to obtain $\lambda V(x) + \sup_{a\in A}\{-DV(x)\cdot f(x,a) - l(x,a)\} \geq 0$, which together with your first inequality proves the Lemma.
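To make the $t\to 0^+$ limits concrete, here is a numerical sketch on a toy instance of my own ($f(x,a)=a$, $A=[-1,1]$, $l(x,a)=|x|$, with explicit $V(x)=\frac{|x|}{\lambda}-\frac{1-e^{-\lambda|x|}}{\lambda^2}$): for each constant control $a$, the DPP quotient approaches $\lambda V(x)-DV(x)\,a-l(x,a)$, whose supremum over $a$ is $0$, matching the Lemma.

```python
import numpy as np

lam = 0.5  # discount rate (arbitrary choice)

# Explicit value function for f(x,a)=a, A=[-1,1], l(x,a)=|x|, and its derivative.
def V(x):
    return abs(x) / lam - (1 - np.exp(-lam * abs(x))) / lam**2

def DV(x):
    return np.sign(x) * (1 - np.exp(-lam * abs(x))) / lam

def quotient(x, a, t, n=40000):
    # (1/t) [ V(x) - e^{-lam t} V(y(t)) - int_0^t |y(s)| e^{-lam s} ds ],
    # with y(s) = x + a*s (constant control).
    s = np.linspace(0.0, t, n)
    g = np.abs(x + a * s) * np.exp(-lam * s)
    integral = (s[1] - s[0]) * (g.sum() - 0.5 * (g[0] + g[-1]))  # trapezoid rule
    return (V(x) - np.exp(-lam * t) * V(x + a * t) - integral) / t

x = 1.0
for a in [-1.0, 0.0, 1.0]:
    expected = lam * V(x) - DV(x) * a - abs(x)  # predicted limit as t -> 0+
    print(f"a = {a:+.1f}  quotient = {quotient(x, a, 1e-3):+.6f}  limit = {expected:+.6f}")
```

At $a=-1$ (the optimal control from $x=1$) the quotient and its limit are both $0$, which is exactly where the supremum in the HJB equation is attained.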
Let me know if you need more details.