Consider the constrained optimization problem: $\min_x f(x)$ s.t. $g(x)=0$. For simplicity, let $f$ and $g$ be scalar functions.
Under suitable conditions, the Lagrange multiplier theorem gives: $\exists \lambda \in \mathbb{R}$ s.t. $\frac{df(x)}{dx} = \lambda \frac{dg(x)}{dx}$.
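We can thus introduce the Lagrangian $L(x,\lambda) := f(x) + \lambda g(x)$, and $\nabla L(x,\lambda)=0$ (gradient taken with respect to both $x$ and $\lambda$) is a compact way of expressing $\frac{df(x)}{dx} = \lambda \frac{dg(x)}{dx}$ (up to the sign of $\lambda$, which is immaterial since $\lambda$ is an arbitrary real number) together with $g(x)=0$.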
So far, so good: we have a computational tool to solve constrained optimization.
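To make the finite-dimensional case concrete, here is a minimal sketch on a toy problem of my own choosing (not taken from the paper): $f(x) = x_1^2 + x_2^2$ and $g(x) = x_1 + x_2 - 1$, with $x \in \mathbb{R}^2$ so that the constraint does not pin down $x$ completely. The function names and the candidate point are assumptions made for illustration. The check is that the gradient of $L$ with respect to $(x_1, x_2, \lambda)$ vanishes at the constrained minimizer.

```python
# Toy problem (an assumption for illustration, not from the paper):
#   minimize f(x) = x1^2 + x2^2  subject to  g(x) = x1 + x2 - 1 = 0.

def f(x1, x2):
    return x1 ** 2 + x2 ** 2


def g(x1, x2):
    return x1 + x2 - 1.0


def lagrangian(x1, x2, lam):
    # L(x, lambda) = f(x) + lambda * g(x), same sign convention as in the text.
    return f(x1, x2) + lam * g(x1, x2)


def grad_lagrangian(x1, x2, lam, h=1e-6):
    """Central finite-difference gradient of L with respect to (x1, x2, lam)."""
    point = [x1, x2, lam]
    grad = []
    for i in range(3):
        up = list(point)
        down = list(point)
        up[i] += h
        down[i] -= h
        grad.append((lagrangian(*up) - lagrangian(*down)) / (2.0 * h))
    return grad


# Candidate obtained by hand from 2*x1 + lam = 0, 2*x2 + lam = 0, x1 + x2 = 1.
x1_star, x2_star, lam_star = 0.5, 0.5, -1.0
grad_at_star = grad_lagrangian(x1_star, x2_star, lam_star)
print(grad_at_star)  # all three components should be numerically zero
```

The third component of the gradient is just $g(x)$, so setting it to zero recovers the constraint, while the first two components encode the multiplier condition.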
Now, consider the similar setting: $\min_{x} f(z(T,x))$ s.t. $\frac{dz(t,x)}{dt} = \mu(z(t,x),t,x)$ with $z(0,x)=y$ (the dependency of $z$ on $y$ is implicit in this notation).
In the derivation of the adjoint system for sensitivity (see https://epubs.siam.org/doi/epdf/10.1137/S1064827501380630 for instance), the function $g(t,x) = \frac{dz(t,x)}{dt} - \mu(z(t,x),t,x)$ is introduced, so that the ODE evolution is expressed by the constraint $g(t,x) = 0$.
The Lagrangian here is $L(x,\lambda) = f(z(T,x)) + \int_0^T \lambda(t) g(t,x) dt$, which is the infinite-dimensional version of the scalar case considered initially, with $\lambda$ now being a scalar function of $t$.
My confusion stems from the fact that the authors next claim: since $g(t,x)=0$, we have that $\frac{\partial f(z(T,x))}{\partial x} = \frac{\partial L(x,\lambda)}{\partial x}$.
But, if the reasoning from the simple constrained optimization problem carries over, we should instead have $\frac{\partial L(x,\lambda)}{\partial x} = \frac{\partial f(z(T,x))}{\partial x} + \frac{\partial}{\partial x} \int_0^T \lambda(t) g(t,x) dt$.
Where is my misunderstanding coming from?
I report below the key extract from the referenced article:
In the paper's notation, the constraint $F$ defines the ODE evolution and $p$ is the parameter of interest that we want to optimize, while $G(x,p) = \int_0^T g(x,t,p)dt$ is the function(al) to be optimized (note that the paper's $g$ is the integrand of the objective, not the constraint function $g$ I defined above).
Starting from the result indicated in the extract, the adjoint system can be derived by applying integration by parts.
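For concreteness, here is a minimal numerical sketch of the adjoint computation in my own notation, on a toy problem I made up: $\mu(z,t,x) = -xz$, $z(0)=y$, $f(z) = z^2$. Everything in it (the choice of $\mu$ and $f$, the constants, the function names) is an assumption for illustration, and the sign convention follows my Lagrangian above rather than the paper's. Integrating $\int_0^T \lambda \, \partial_x \dot z \, dt$ by parts and choosing $\dot\lambda = -\frac{\partial \mu}{\partial z}\lambda$ with $\lambda(T) = -f'(z(T))$ leaves $\frac{d}{dx} f(z(T,x)) = -\int_0^T \lambda \frac{\partial \mu}{\partial x} dt$, which the code compares against a finite difference.

```python
import math

# Toy problem (assumed for illustration): mu(z, t, x) = -x * z, z(0) = Y0, f(z) = z^2.
Y0, T = 2.0, 1.5


def z(t, x):
    # Closed-form solution of dz/dt = -x * z with z(0) = Y0.
    return Y0 * math.exp(-x * t)


def objective(x):
    # f(z(T, x)) with f(z) = z^2.
    return z(T, x) ** 2


def adjoint(t, x):
    # Solves dlam/dt = -(dmu/dz) * lam = x * lam with lam(T) = -f'(z(T)) = -2 z(T).
    return -2.0 * z(T, x) * math.exp(x * (t - T))


def adjoint_gradient(x, n=1000):
    # d/dx f(z(T, x)) = - integral_0^T lam(t) * dmu/dx dt, where dmu/dx = -z.
    dt = T / n
    vals = [adjoint(i * dt, x) * (-z(i * dt, x)) for i in range(n + 1)]
    integral = dt * (sum(vals) - 0.5 * (vals[0] + vals[-1]))  # trapezoidal rule
    return -integral


def finite_difference_gradient(x, h=1e-6):
    # Reference value obtained by differentiating the objective directly.
    return (objective(x + h) - objective(x - h)) / (2.0 * h)


x0 = 0.7
grad_adj = adjoint_gradient(x0)
grad_fd = finite_difference_gradient(x0)
print(grad_adj, grad_fd)  # the two values should agree closely
```

In this toy case the exact derivative is $-2 T y^2 e^{-2xT}$, so both numbers can also be checked by hand.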
