I am really struggling to grasp how the Hamiltonian function and Pontryagin's Maximum Principle work in the context of an Optimal Control Theory (Maths for Economics) course. I am given the following problem:
$$ \quad\space\max_{u} \mathcal{F} = \int_{t_i}^{t_f} f(x(t), u(t), t) \, dt + S(x_f) \tag{1.1} $$
$$ \begin{cases} \begin{array}{l} \dot x(t) = g(x(t),u(t),t) \\ x(t_i)=x_i \\ x(t_f)=x_f \\ \end{array} \end{cases} \qquad \begin{array}{l} \text{(constraints)} \\ \text{(known)} \\ \text{(unknown)} \\ \end{array} \tag{1.2} $$
$$ \begin{cases} \begin{array}{l} \overline{x} = (x_1,x_2,\dots,x_n) \\ \overline{u} = (u_1,u_2,\dots,u_n) \\ \overline{g} = (g_1,g_2,\dots,g_n) \\ \end{array} \end{cases} \qquad\quad \begin{array}{l} \text{(state variables)} \\ \text{(control variables)} \\ \text{(constraint functions)} \\ \end{array} \tag{1.3} $$
So far I understand that we construct the Hamiltonian $\mathcal{H}$ from the Hamiltonian Functional $\mathcal{L}$ and the costate functions $\lambda$ such that:
$$ \mathcal{H}(x,u,t,\lambda) = f(x,u,t) + \lambda \, g(x,u,t) \tag{2.1} $$
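As a minimal illustration of (2.1), take a toy problem assumed here purely for illustration (it is not from the course notes): one state, one control, $f = x - \tfrac{1}{2}u^2$, $g = u$, $S = 0$, on $t \in [0,1]$ with $x(0) = 0$. The Hamiltonian would then read:
$$ \mathcal{H}(x,u,t,\lambda) = x - \tfrac{1}{2}u^2 + \lambda u $$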
Then, we find the Hamiltonian equations for the state $x$ and costate $\lambda$ variables:
$$ \begin{cases} \dot{x} = \frac{\partial \mathcal{H}}{\partial \lambda} \\ \dot{x}_i = \frac{\partial \mathcal{H}}{\partial x_i} \quad \text{for} \quad n > 1 \end{cases} \tag{3.1} $$
$$ \begin{cases} \dot{\lambda} = -\frac{\partial \mathcal{H}}{\partial x} \\ \dot{\lambda}_i = -\frac{\partial \mathcal{H}}{\partial x_i} \quad \text{for} \quad n > 1 \\ \lambda_f = \frac{\partial \mathcal{H}}{\partial x_f} S(x_f) \end{cases} \tag{3.2} $$
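For instance, with the toy Hamiltonian $\mathcal{H} = x - \tfrac{1}{2}u^2 + \lambda u$ (a hypothetical example with $f = x - \tfrac{1}{2}u^2$ and $g = u$, not taken from the course notes), the first equations of (3.1) and (3.2) would give:
$$ \dot{x} = \frac{\partial \mathcal{H}}{\partial \lambda} = u, \qquad \dot{\lambda} = -\frac{\partial \mathcal{H}}{\partial x} = -1 $$
In this example the state equation simply reproduces the constraint $\dot{x} = g$, while the costate equation is the genuinely new condition.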
We are then instructed to solve using Pontryagin's Maximum Principle:
$$ \text{If } m = 1 \begin{cases} \frac{\partial \mathcal{H}}{\partial u} = 0\\ \frac{\partial^2 \mathcal{H}}{\partial u^2} < 0 \\ \end{cases} \tag{4.1} $$
$$ \text{If } m > 1 \begin{cases} \nabla \mathcal{H} = \overline{0} \\ \nabla^2 \mathcal{H} < 0 \quad \text{for} \quad \text{odd } |M_H| \\ \nabla^2 \mathcal{H} > 0 \quad \text{for} \quad \text{even } |M_H| \\ \end{cases} \tag{4.2} $$
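For example, applying (4.1) with a single scalar control $u$ to the toy Hamiltonian $\mathcal{H} = x - \tfrac{1}{2}u^2 + \lambda u$ (a hypothetical example with $f = x - \tfrac{1}{2}u^2$ and $g = u$, not taken from the course notes) would give:
$$ \frac{\partial \mathcal{H}}{\partial u} = -u + \lambda = 0 \implies u = \lambda, \qquad \frac{\partial^2 \mathcal{H}}{\partial u^2} = -1 < 0 $$
so in this example the candidate control $u = \lambda$ maximizes $\mathcal{H}$ with respect to $u$.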
Finally, we are told that if we only have the initial condition $x(t_i) = x_i$, then:
$$ \begin{cases} x(t_i) = x_i \\ \lambda(t_f) = \frac{\partial \mathcal{H}}{\partial x_f} S(x_f) \quad \text{(transversality condition)} \\ \end{cases} \tag{5.1} $$
While, if we have both initial and boundary conditions, then:
$$ \begin{cases} x(t_i) = x_i \\ x(t_f) = x_f \\ \end{cases} \tag{5.2} $$
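A sketch of how the two cases might differ, on a toy problem assumed here purely for illustration (not from the course notes): maximize $\int_0^1 \left(x - \tfrac{1}{2}u^2\right) dt$ subject to $\dot{x} = u$, $x(0) = 0$, $S = 0$. With $\mathcal{H} = x - \tfrac{1}{2}u^2 + \lambda u$, the conditions $u = \lambda$ and $\dot{\lambda} = -1$ give $\lambda(t) = c - t$ and $x(t) = ct - \tfrac{1}{2}t^2$, leaving one unknown constant $c$. The free-endpoint row below uses the standard textbook form of the transversality condition, $\lambda(t_f) = \partial S / \partial x_f$, which is an assumption here and may differ from how (5.1) is written in the notes:
$$ \begin{cases} x(1) \text{ free:} & \lambda(1) = 0 \implies c = 1, \quad x(t) = t - \tfrac{1}{2}t^2 \\ x(1) = x_f \text{ given:} & c - \tfrac{1}{2} = x_f \implies c = x_f + \tfrac{1}{2} \end{cases} $$
In this example exactly one extra condition is needed to pin down $c$, and it comes either from the transversality condition or from the prescribed terminal state, which suggests why the two are presented as alternatives.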
I fail to understand the following points:
In the Hamiltonian equations for the state variables (3.1), where does the second equation come from, and why does it switch from the derivative of $\mathcal{H}$ with respect to $\lambda$ to the derivative of $\mathcal{H}$ with respect to $x_i$? Shouldn't it also be differentiated with respect to $\lambda$?
In the Hamiltonian equations for the costate variables (3.2), where does $\lambda_f$ come from? When is it used?
In equations (4.1) and (4.2), what does $m$ stand for? I understand from my professor's lectures that it means multiplicity, but in a different sense than when it is used, for example, for eigenvalues in the context of dynamical systems. What, then, does it refer to?
In equation (5.1), what is the transversality condition, and when and how is it applied? Why is it not necessary when boundary conditions are present, as shown in (5.2)?
Overall, I am currently unable to solve this kind of problem. Any help understanding how to approach it would be greatly appreciated. As a quick side note, any appropriate criticism of the formatting of this post and my usage of LaTeX would also be appreciated.