According to Pontryagin's minimum principle, the optimal control minimizes the Hamiltonian along the optimal trajectory.
The Hamiltonian is given by $\mathcal{H}=p^Tf + \mathcal{L}$ for costate vector $p$, Lagrangian $\mathcal{L}$, and dynamics $f$.
We also have that the costate and the optimal trajectory must satisfy the canonical equations, $\dot{p}=-\mathcal{H}_{x}$ and $\dot{x}=\mathcal{H}_{p}$.
All of the above should hold while evaluating the Hamiltonian along the optimal trajectory, i.e. the one produced by the optimal control.
To the question,
How are we supposed to find a minimum given that both the state and the costate ultimately depend on $u$? What do we plug in for $p$ and $x$ in the Hamiltonian when minimizing over $u$?
I find this hard to see, since both $p$ and $x$ may depend on $u$.
Let us consider the following objective function
$$J = \varphi(\boldsymbol{x}(t_\text{f}),t_\text{f}) + \int_{t_0}^{t_\text{f}}\mathcal{L}(\boldsymbol{x}(t),\boldsymbol{u}(t),t)dt,$$
in which $\varphi(\boldsymbol{x}(t_\text{f}),t_\text{f})$ is the terminal cost, depending on the final state $\boldsymbol{x}(t_\text{f})$ at final time $t_\text{f}$, and $\mathcal{L}(\boldsymbol{x}(t),\boldsymbol{u}(t),t)$ is the Lagrangian, which is also a function of the control input vector $\boldsymbol{u}(t)$.
If we want to determine the control input $\boldsymbol{u}(t)$ that minimizes $J$ subject to the dynamics of the system
$$\dot{\boldsymbol{x}}(t)=\boldsymbol{f}(\boldsymbol{x}(t),\boldsymbol{u}(t),t)$$
with fixed initial condition $\boldsymbol{x}(t_0)=\boldsymbol{x}_0$ and a final time $t_\text{f}$ and final state $\boldsymbol{x}(t_\text{f})$ that may each be fixed or free, then we need to apply the following steps (taken from the book Optimal Control Systems (2003) by Desineni Subbaram Naidu, page 69).
Step 1: Determine the Hamiltonian
$$\mathcal{H}(\boldsymbol{x}(t),\boldsymbol{u}(t),\boldsymbol{\lambda}(t),t)=\mathcal{L}(\boldsymbol{x}(t),\boldsymbol{u}(t),t)+\boldsymbol{\lambda}^T(t)\boldsymbol{f}(\boldsymbol{x}(t),\boldsymbol{u}(t),t).$$
Step 2: Determine the optimal control input $\boldsymbol{u}^*(t)=\boldsymbol{k}(\boldsymbol{x}^*(t),\boldsymbol{\lambda}^*(t),t)$ by minimizing $\mathcal{H}$ with respect to $\boldsymbol{u}(t)$: $$\boldsymbol{u}^*(t)={\displaystyle {\underset {\boldsymbol{u}(t)}{\operatorname {arg\,min} }}} \,\,\mathcal{H}(\boldsymbol{x}(t),\boldsymbol{u}(t),\boldsymbol{\lambda}(t),t)$$ Note that for unconstrained optimization this condition simplifies to $$\dfrac{\partial}{\partial \boldsymbol{u}(t)}\mathcal{H}(\boldsymbol{x}(t),\boldsymbol{u}(t),\boldsymbol{\lambda}(t),t)=0.$$
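As a concrete (hypothetical) illustration of this step, take scalar dynamics $\dot{x}=u$ with Lagrangian $\mathcal{L}=\tfrac{1}{2}u^2$. The stationarity condition then expresses the control in terms of the costate symbol alone, without yet knowing the trajectories:
$$\mathcal{H}=\tfrac{1}{2}u^2+\lambda u,\qquad \dfrac{\partial \mathcal{H}}{\partial u}=u+\lambda=0\quad\Rightarrow\quad \boldsymbol{u}^*(t)=-\lambda^*(t).$$
This is precisely the point of the question above: in Step 2 one treats $\boldsymbol{x}(t)$ and $\boldsymbol{\lambda}(t)$ as free symbols, so $\boldsymbol{u}^*$ comes out as a *function* $\boldsymbol{k}$ of them; their actual time histories are only determined later, in Step 4.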
Step 3: Determine the optimal Hamiltonian by using the optimal control $\boldsymbol{u}^*(t)$.
$$\mathcal{H}^*(\boldsymbol{x}^*(t),\boldsymbol{u}^*(t),\boldsymbol{\lambda}^*(t),t)$$
Step 4: Solve the state and costate equations
$$\dot{\boldsymbol{x}}^*(t)=+\dfrac{\partial \mathcal{H}^*}{\partial \boldsymbol{\lambda}^*}$$ $$\dot{\boldsymbol{\lambda}}^*(t)=-\dfrac{\partial \mathcal{H}^*}{\partial \boldsymbol{x}^*}$$
with the initial condition $\boldsymbol{x}_0$ and the final conditions
$$\left[\mathcal{H}^*+\dfrac{\partial \varphi}{\partial t} \right]^{}_{t_\text{f}}\delta t_\text{f}+\left[ \left(\dfrac{\partial \varphi}{\partial \boldsymbol{x}}\right)^*-\boldsymbol{\lambda}^*(t)\right]^T_{t_\text{f}}\delta \boldsymbol{x}_\text{f}=0.$$
In the last equation, $\delta t_\text{f}$ and $\delta \boldsymbol{x}_\text{f}$ are variations that vanish if the final time or the final state, respectively, is fixed and hence does not permit any variation.
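For example, in the common case of a fixed final time ($\delta t_\text{f}=0$) and a free final state ($\delta \boldsymbol{x}_\text{f}$ arbitrary), the condition above reduces to the terminal costate condition
$$\boldsymbol{\lambda}^*(t_\text{f})=\left(\dfrac{\partial \varphi}{\partial \boldsymbol{x}}\right)^*_{t_\text{f}},$$
which, together with $\boldsymbol{x}^*(t_0)=\boldsymbol{x}_0$, gives the two-point boundary value problem its full set of boundary conditions.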
Step 5: Determine the optimal control $\boldsymbol{u}^*(t)$ by substituting $\boldsymbol{x}^*(t)$ and $\boldsymbol{\lambda}^*(t)$ into
$$\boldsymbol{u}^*(t)=\boldsymbol{k}(\boldsymbol{x}^*(t),\boldsymbol{\lambda}^*(t),t).$$
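To make the five steps concrete, here is a minimal numerical sketch (assuming SciPy is available; the problem itself is a hypothetical example, not from the book) for the scalar problem $\dot{x}=u$, $J=\tfrac{1}{2}x(1)^2+\int_0^1 \tfrac{1}{2}u^2\,dt$, $x(0)=1$, with fixed $t_\text{f}=1$ and free final state. Step 2 gives $u^*=-\lambda$, and the state–costate system from Step 4, with the terminal condition $\lambda(1)=\partial\varphi/\partial x = x(1)$, is solved as a two-point boundary value problem with `scipy.integrate.solve_bvp`:

```python
import numpy as np
from scipy.integrate import solve_bvp

# Hypothetical example: xdot = u, J = 0.5*x(1)^2 + int_0^1 0.5*u^2 dt, x(0) = 1.
# Step 2: dH/du = u + lambda = 0  =>  u* = -lambda.
def odes(t, y):
    x, lam = y
    return np.vstack((-lam,                 # Step 4: xdot = dH/dlam = u* = -lambda
                      np.zeros_like(lam)))  # Step 4: lamdot = -dH/dx = 0

def bc(ya, yb):
    # Boundary conditions: x(0) = 1 and transversality lambda(1) = dphi/dx = x(1).
    return np.array([ya[0] - 1.0, yb[1] - yb[0]])

t = np.linspace(0.0, 1.0, 50)
sol = solve_bvp(odes, bc, t, np.zeros((2, t.size)))
u_opt = -sol.sol(t)[1]  # Step 5: recover u*(t) by substituting lambda*(t)
```

Since $\dot{\lambda}=0$, the costate is constant, and the boundary conditions force $\lambda\equiv\tfrac{1}{2}$, so $u^*\equiv-\tfrac{1}{2}$ and $x^*(1)=\tfrac{1}{2}$; the numerical solution can be checked against this.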