Let $I=[a,b]\subset \mathbb{R}, G:\mathbb{R}^n\to \mathbb{R}^k$ smooth, $0<k<n, M=G^{-1}(0)$. Assume that $DG(x)$ has full rank for all $x\in M$. Fix $p_1,p_2\in M$ and assume $u\in W^{1,\infty}(I,\mathbb{R}^n)$ minimizes the functional given by
$$J(u)=\int_a^bF(t,u(t),\dot{u}(t))dt $$
on the set $$S := \{u\in W^{1,\infty}(I,\mathbb{R}^n): u(a)=p_1,u(b)=p_2, u(t)\in M\}.$$
Show there exists a function $\lambda \in W^{1,\infty}(I,\mathbb{R}^k)$ such that
$$\frac{d}{dt}F_p(t,u(t),\dot{u}(t)) -F_u(t,u(t),\dot{u}(t))= DG(u(t))^T\lambda(t).$$
I got as $\textbf{hint}$:
Near $p\in M$, there are parameterizations $\psi:V\to U\subset M$ where $V\subset \mathbb{R}^{n-k}$ and $U\subset M$ contains $p$. Assume first $\bigcup_t\{u(t)\}\subset U$. Define $w:I\to\mathbb{R}^{n-k}$ by
$$w(t) = (\psi^{-1}\circ u)(t) $$ and find a suitable functional $\tilde{J}$ (on a suitable space) which corresponds to $J$ and is minimized by $w$. Use the Euler-Lagrange equations for $\tilde{J}$ and the fact that $DG(\psi(z))D_z\psi(z)=0.$
(for the general case, cover $\bigcup_t\{u(t)\}$ with coordinate patches and localize by subdividing the set into pieces that lie within thise patches).
I'm simply trying to prove the simpler case, but I have hard time finding such $\tilde{J}$. I appreciate all the help and suggestions.
It might be useful for you to see how to derive Lagrange multipliers in the finite dimensional setting and then generalize it to the variational setting.
Let's work with a curve in $\mathbb R^3$ for concreteness. Assume the curve is given implicitly by the constraint $G(\vec x) = 0$ where $G: \mathbb R^3 \to \mathbb R^2$ (and the rank of $G$ is 2 along the curve). Let's assume now that the curve can be parametrized as $\psi: \mathbb R \to \mathbb R^3, \psi: x \mapsto (x, g(x), h(x))$ in these coordinates (such coordinates can always be found because of the maximal rank assumption by the implicit function theorem). Then we have that $G(\psi(x)) = G(x, g(x), h(x)) = 0$ for all $x$ and therefore $D(G \circ \psi)^i = G^i_x + G^i_y g' + G^i_z h' = 0$ for $i = 1,2$ (the argument of $G^i$ is $\psi(x)$ but I suppress it for notational convenience).
Now let's get back to finding extremes of the function $F : \mathbb R^3 \to \mathbb R$ along this curve. This is equivalent to finding extremes of $F(\psi(x))$ on $\mathbb R$. But the condition for the extremes is $D(F\circ \psi) = F_x + F_y g' + G_z h' = 0$. We note that this is very similar for the equation of constraints given by the implicit function theorem and therefore it's enough to solve $\nabla F = \lambda_1 \nabla G^1 + \lambda_2 \nabla G^2$ for two unknown constants $\lambda_1, \lambda_2$. We can suggestively rewrite this equation as $DF = DG ^T \lambda$ with $\lambda = (\lambda_1, \lambda_2) \in \mathbb R^2$.
Now, returning to the variational setting, everything works out very similarly except that you replace $D$s above with $\delta$s. Applying standard variational arguments you can then reduce integral equations to time-local equation for every $t$ and deduce the existence of a constant $\lambda(t)$. The function $\lambda$ will be in $W^{1,\infty}(I, \mathbb R^k)$ precisely because it must satisfy the Euler-Lagrange equation and all the other functions figuring there ($F\circ u$, $G\circ u$ and their derivatives) belong to $W^{1,\infty}$ spaces.