Uniqueness of policy function for optimal control problems


Let $t \in T \subset \mathbb R_+$ denote time, $x \in X \subset \mathbb R$ the state and $u \in U \subset \mathbb R$ the control. Instantaneous payoffs are $F : X \times U \to \mathbb R$. The discounted stream of payoffs over the entire time interval is then given by \begin{align} J(u(s),t) = \int_0^\infty{e^{-rs}F(x(s), u(s))ds} \end{align} with $r > 0$. Further define the value function by \begin{align} &v(x(t)) := \max_{u(s) \in U}J(u(s), t)\\ \text{s.t.}\quad & \dot x(t) = f(x(t),u(t)), \end{align} where $f$ denotes the state equation.

Suppose $v$ is continuously differentiable w.r.t. $x$. The value function then solves the stationary Hamilton-Jacobi-Bellman equation (HJBe) \begin{align} rv(x) = \max_{u \in U}\{F(x,u) + v'(x)f(x,u)\}. \end{align} Suppose that the state-dependent policy $\mu : X \to U$ attains the maximum on the RHS of the HJBe. Note that $\mu$ need not be unique, i.e. there may exist a set of different policy functions which yield the same maximal payoff.
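To make the HJBe concrete, here is a small numerical sketch based on a linear-quadratic example of my own choosing (not from the question): $F(x,u) = -(x^2+u^2)$, $f(x,u) = u$, $r > 0$. The quadratic guess $v(x) = -ax^2$ reduces the HJBe to the scalar equation $a^2 + ra - 1 = 0$, the inner maximand is strictly concave in $u$, and the unique maximizer is the policy $\mu(x) = -ax$:

```python
import math

# Linear-quadratic example (my own illustration, not from the question):
# F(x, u) = -(x**2 + u**2), dynamics f(x, u) = u, discount rate r > 0.
# HJBe: r*v(x) = max_u { -(x**2 + u**2) + v'(x)*u }.
# Guess v(x) = -a*x**2 with a > 0; the maximand is strictly concave in u,
# so the maximizer is unique: mu(x) = v'(x)/2 = -a*x.
# Substituting back yields the scalar Riccati-type equation a**2 + r*a - 1 = 0.

def riccati_coefficient(r: float) -> float:
    """Positive root of a**2 + r*a - 1 = 0."""
    return (-r + math.sqrt(r * r + 4.0)) / 2.0

def hjb_residual(r: float, x: float) -> float:
    """Residual r*v(x) - max_u{...} at state x, evaluated at u = mu(x)."""
    a = riccati_coefficient(r)
    u = -a * x                       # unique maximizer of the concave maximand
    v = -a * x * x
    v_prime = -2.0 * a * x
    rhs = -(x * x + u * u) + v_prime * u
    return r * v - rhs               # ~0 for every x if v solves the HJBe

r = 0.05
for x in (-2.0, 0.5, 3.0):
    assert abs(hjb_residual(r, x)) < 1e-12
```

Here uniqueness of $\mu$ comes from strict concavity of the maximand in $u$; the example is meant only to illustrate the mechanism behind the question.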

  • Can we ensure uniqueness of $\mu$ by conditions on the primitives?

My idea:

We consider open-loop control, i.e. our control is a function of time only, $u : T \to U$. Then we suppose that the (current value) Hamiltonian \begin{align} H(x,u,\lambda) = F(x,u) + \lambda f(x,u) \end{align} is jointly concave in $(x,u)$, so that the necessary conditions \begin{align} &u^* = \arg\max_{u \in U}H(x,u,\lambda)\\ &\dot \lambda = r\lambda - H_x(x,u,\lambda)\\ &\lim_{t \to \infty}e^{-rt}\lambda(t) = 0 \end{align} of the maximum principle are also sufficient for a global maximum, and so that the optimal control trajectory $(u^*(t) : t \in T)$ is unique (no?). We further know that the open-loop control and the optimal policy coincide, i.e. $u^*(t) = \mu(x(t))$. So basically my argument is: if $H$ is jointly concave in $(x,u)$, then $\mu$ is unique.
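The claimed equivalence between the open-loop and feedback solutions can be checked numerically in the linear-quadratic example I use above for illustration ($F(x,u) = -(x^2+u^2)$, $f(x,u)=u$, $v(x) = -ax^2$ with $a^2 + ra - 1 = 0$, all my own assumptions): the candidate costate $\lambda(t) = v'(x(t)) = -2ax(t)$ should satisfy the adjoint equation along the closed loop $\dot x = -ax$, and the maximizer of $H$ over $u$ should reproduce the policy.

```python
import math

# Sketch for the LQ example (my own illustration, not from the question):
# H(x, u, lam) = -(x**2 + u**2) + lam*u is jointly strictly concave in (x, u),
# with unique maximizer u* = lam/2 over u in R.
# Candidate costate: lam(t) = v'(x(t)) = -2*a*x(t) along the closed loop
# x' = -a*x. The adjoint equation is lam' = r*lam - H_x = r*lam + 2*x.

def positive_root(r: float) -> float:
    """Positive root a of a**2 + r*a - 1 = 0 (from the HJBe guess v = -a*x**2)."""
    return (-r + math.sqrt(r * r + 4.0)) / 2.0

def adjoint_residual(r: float, x: float) -> float:
    """lam' - (r*lam + 2*x) with lam = -2*a*x and x' = -a*x; ~0 if consistent."""
    a = positive_root(r)
    lam = -2.0 * a * x               # candidate costate v'(x)
    lam_dot = -2.0 * a * (-a * x)    # chain rule along x' = -a*x
    return lam_dot - (r * lam + 2.0 * x)

r = 0.1
a = positive_root(r)
for x in (-1.0, 0.0, 2.5):
    assert abs(adjoint_residual(r, x)) < 1e-12
    # maximizer of H over u is lam/2 = -a*x, i.e. u*(t) = mu(x(t))
    assert abs((-2.0 * a * x) / 2.0 - (-a * x)) < 1e-12
```

Since $x(t) = x_0 e^{-at}$ decays, $e^{-rt}\lambda(t) = -2ax_0 e^{-(r+a)t} \to 0$, so the transversality condition holds as well; in this strictly concave example the maximum-principle trajectory and the feedback policy agree, which is the mechanism the argument above relies on.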