How can I apply the Hamiltonian function and Pontryagin's maximum principle in the context of Optimal Control Theory?


I am really struggling to grasp how the Hamiltonian function and Pontryagin's maximum principle work in my Optimal Control Theory (Maths for Economics) course. I am given the following conditions:

$$ \max_{u} \mathcal{F} = \int_{t_i}^{t_f} f(x(t), u(t), t) \, dt + S(x_f) \tag{1.1} $$

$$ \begin{cases} \begin{array}{l} \dot x(t) = g(x(t),u(t),t) \\ x(t_i)=x_i \\ x(t_f)=x_f \\ \end{array} \end{cases} \qquad \begin{array}{l} \text{(constraints)} \\ \text{(known)} \\ \text{(unknown)} \\ \end{array} \tag{1.2} $$

$$ \begin{cases} \begin{array}{l} \overline{x} = (x_1,x_2,\dots,x_n) \\ \overline{u} = (u_1,u_2,\dots,u_n) \\ \overline{g} = (g_1,g_2,\dots,g_n) \\ \end{array} \end{cases} \qquad\quad \begin{array}{l} \text{(state variables)} \\ \text{(control variables)} \\ \text{(constraint functions)} \\ \end{array} \tag{1.3} $$

So far I understand that we construct the Hamiltonian $\mathcal{H}$ from the integrand $f$, the constraint function $g$, and the costate function $\lambda$, such that:

$$ \mathcal{H}(x,u,t,\lambda) = f(x,u,t) + \lambda \, g(x,u,t) \tag{2.1} $$
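To make (2.1) concrete for myself, I tried writing out the Hamiltonian for a toy problem (my own example, not from the notes): maximise $\int_0^1 -u^2 \, dt$ subject to $\dot x = u$, so that $f = -u^2$ and $g = u$. Then

$$ \mathcal{H}(x,u,t,\lambda) = -u^2 + \lambda u. $$

Is this the intended construction?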

Then, we find the Hamiltonian equations for the state $x$ and costate $\lambda$ variables:

$$ \begin{cases} \dot{x} = \frac{\partial \mathcal{H}}{\partial \lambda} \\ \dot{x}_i = \frac{\partial \mathcal{H}}{\partial x_i} \quad \text{for} \quad n > 1 \end{cases} \tag{3.1} $$

$$ \begin{cases} \dot{\lambda} = -\frac{\partial \mathcal{H}}{\partial x} \\ \dot{\lambda}_i = -\frac{\partial \mathcal{H}}{\partial x_i} \quad \text{for} \quad n > 1 \\ \lambda_f = \frac{\partial S(x_f)}{\partial x_f} \end{cases} \tag{3.2} $$
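To check the mechanics of the canonical equations, I wrote a small symbolic sketch (my own toy example again, with $f = -u^2$ and $g = u$, so $\mathcal{H} = -u^2 + \lambda u$; the variable names are my own, nothing here comes from the course):

```python
import sympy as sp

# scalar state x, control u, costate lam
x, u, lam = sp.symbols('x u lam')

# toy problem: f = -u**2 (integrand), g = u (dynamics), so H = f + lam*g
H = -u**2 + lam * u

# canonical equations (3.1)/(3.2):
x_dot = sp.diff(H, lam)     # xdot = dH/dlam, which just recovers g(x,u,t)
lam_dot = -sp.diff(H, x)    # lamdot = -dH/dx

print(x_dot)    # u  -> the constraint g, as expected
print(lam_dot)  # 0  -> the costate is constant in this toy problem
```

At least on this example, differentiating with respect to $\lambda$ simply hands back the constraint $\dot x = g$, which is why I am confused about the second line of (3.1).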

We are then instructed to solve using Pontryagin's Maximum Principle:

$$ \text{If } m = 1 \begin{cases} \frac{\partial \mathcal{H}}{\partial u} = 0\\ \frac{\partial^2 \mathcal{H}}{\partial u^2} < 0 \\ \end{cases} \tag{4.1} $$

$$ \text{If } m > 1 \begin{cases} \nabla \mathcal{H} = \overline{0} \\ |M_k| < 0 \quad \text{for odd } k \\ |M_k| > 0 \quad \text{for even } k \\ \end{cases} \qquad \text{($\nabla^2 \mathcal{H}$ negative definite; $M_k$ its leading principal minors)} \tag{4.2} $$
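For the $m = 1$ case, here is the same toy Hamiltonian ($\mathcal{H} = -u^2 + \lambda u$, my own example) pushed through condition (4.1) symbolically:

```python
import sympy as sp

u, lam = sp.symbols('u lam')

# toy Hamiltonian from before: H = -u**2 + lam*u
H = -u**2 + lam * u

# first-order condition dH/du = 0, solved for the candidate control u*
first_order = sp.diff(H, u)           # -2*u + lam
u_star = sp.solve(first_order, u)[0]  # lam/2

# second-order condition: d2H/du2 < 0 confirms a maximum
second_order = sp.diff(H, u, 2)       # -2

print(u_star)        # lam/2
print(second_order)  # -2, negative as (4.1) requires
```

So, if I understand (4.1) correctly, the optimal control is expressed in terms of the costate, $u^* = \lambda/2$, and is then substituted back into the state and costate equations.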

Finally, we are told that if we only have the initial condition $x(t_i) = x_i$, then:

$$ \begin{cases} x(t_i) = x_i \\ \lambda(t_f) = \frac{\partial S(x_f)}{\partial x_f} \quad \text{(transversality condition)} \\ \end{cases} \tag{5.1} $$
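If I read the transversality condition as $\lambda(t_f) = \left.\frac{\partial S}{\partial x}\right|_{x_f}$, then a toy problem I tried (my own, not from the notes) seems to work out: maximise $\int_0^1 -u^2 \, dt + x(1)$, so that $S(x_f) = x_f$, subject to $\dot x = u$ with $x(0) = 0$ and $x(1)$ free. Then

$$ \dot\lambda = -\frac{\partial \mathcal{H}}{\partial x} = 0 \implies \lambda \text{ constant}, \qquad \lambda(1) = \frac{\partial S}{\partial x_f} = 1 \implies \lambda \equiv 1, $$

$$ \frac{\partial \mathcal{H}}{\partial u} = -2u + \lambda = 0 \implies u^* = \tfrac{1}{2}, \qquad x^*(t) = \tfrac{t}{2}. $$

Is this the intended use of the condition?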

While, if we have both initial and terminal conditions, then:

$$ \begin{cases} x(t_i) = x_i \\ x(t_f) = x_f \\ \end{cases} \tag{5.2} $$
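For the fixed-endpoint case (5.2), my toy problem (again my own, with $f = -u^2$, $g = u$, $x(0) = 0$, $x(1) = 1$) can be finished symbolically, which is how I have been checking my understanding:

```python
import sympy as sp

t, lam = sp.symbols('t lam')  # lam: constant costate (lamdot = 0 here)

# toy problem: maximise int_0^1 -u^2 dt subject to xdot = u,
# with x(0) = 0 and x(1) = 1 both given (case 5.2)
# the maximum condition gives u* = lam/2, so xdot = lam/2
x = sp.Function('x')
sol = sp.dsolve(sp.Eq(x(t).diff(t), lam / 2), x(t), ics={x(0): 0})
x_t = sol.rhs  # lam*t/2

# impose the terminal condition x(1) = 1 to pin down the costate
lam_val = sp.solve(sp.Eq(x_t.subs(t, 1), 1), lam)[0]

print(lam_val)                 # 2
print(x_t.subs(lam, lam_val))  # t  -> the optimal path is x(t) = t
```

If this is right, the terminal condition $x(t_f) = x_f$ plays the role that the transversality condition plays in (5.1), namely pinning down the remaining constant.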

I fail to understand the following points:

  • In the Hamiltonian equations for the state variables (3.1), where did the second equation come from, and how did it go from being the derivative of $\mathcal{H}$ with respect to $\lambda$ to being the derivative of $\mathcal{H}$ with respect to $x_i$? Shouldn't it be differentiated with respect to $\lambda_i$ as well?

  • In the Hamiltonian equations for the costate variables 3.2, where does $\lambda_f$ come from? When is it used?

  • In equations 4.1 and 4.2, what does $m$ stand for? I understand from my professor's lectures that it means multiplicity, but in a different sense than when used, for example, to talk about eigenvalues in the context of dynamical systems. What, then, is it alluding to?

  • In equation 5.1, what is the transversality condition, and when/how is it applied? Why is it not necessary when boundary conditions are present, as shown in 5.2?

Overall, I am currently unable to solve this kind of problem. Any help understanding how to approach it would be greatly appreciated. As a quick sidenote, any appropriate criticism of the formatting of this post and my usage of LaTeX would also be appreciated.