Suppose part of my Lagrangian is
$$\mathcal{L}_{\text{part}} = \int_{0}^{T_{\text{max}}} \left( -\dot{\lambda}^T u - \lambda^T F(u) \right) dt$$
where
$$F(u) = \max(0,\theta_1u+\theta_2)$$
Here $\theta_1$ and $\theta_2$ are time-dependent scalar functions and $u$ is the state, with $u \in \Bbb R^d$ and $F: \Bbb R^d \to \Bbb R^d$. Since $\theta_{1,2}$ are scalars, the interpretation of the max function is that it acts componentwise on $\theta_1 u+\theta_2$.
If I take the weak derivative of the scalar max function to be
$$\phi(x) = \max(0,x)' = \begin{cases} 0 & \text{if } x \leq 0 \\ 1 & \text{otherwise} \end{cases}$$
then I'm trying to work out what the weak adjoint equation would look like in this context.
Looking at the variation
$$\mathcal{L}_{\text{part},u}[\tilde{u}] = \frac{d}{d\epsilon} \int_{0}^{T_{\text{max}}} \left( -\dot{\lambda}^T (u+ \epsilon \tilde{u}) - \lambda^T F(u+ \epsilon \tilde{u}) \right) dt \,\bigg|_{\epsilon=0},$$
I'm not sure how to apply the chain rule to $F$ properly. I think the weak derivative $\frac{\partial F}{\partial u}$ must be a $d \times d$ matrix for the dimensions of this problem to make sense.
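For what it's worth, my tentative guess is that applying the chain rule componentwise gives a diagonal matrix, $\frac{\partial F}{\partial u} = \theta_1 \operatorname{diag}\big(\phi(\theta_1 u + \theta_2)\big)$, at least away from points where a component of $\theta_1 u + \theta_2$ is exactly zero. Below is a minimal numerical sketch of that guess at a single fixed time, comparing it against a finite-difference directional derivative. The names `F` and `jacobian_F` and all the numerical values are my own illustrative assumptions, not part of the actual problem.

```python
import numpy as np

def F(u, theta1, theta2):
    # Componentwise max(0, theta1 * u + theta2) at one fixed time.
    return np.maximum(0.0, theta1 * u + theta2)

def jacobian_F(u, theta1, theta2):
    # Guessed weak Jacobian: theta1 * diag(phi(theta1 * u + theta2)),
    # with phi(x) = 1 if x > 0 and 0 otherwise (assumption being tested).
    phi = (theta1 * u + theta2 > 0).astype(float)
    return theta1 * np.diag(phi)

# Illustrative values only (d = 3), chosen away from the kink at zero.
u = np.array([1.0, -2.0, 0.5])
u_tilde = np.array([0.3, -0.7, 1.1])
theta1, theta2 = 2.0, 0.5

J = jacobian_F(u, theta1, theta2)

# Central finite difference for d/d(eps) F(u + eps * u_tilde) at eps = 0.
eps = 1e-6
fd = (F(u + eps * u_tilde, theta1, theta2)
      - F(u - eps * u_tilde, theta1, theta2)) / (2 * eps)

print("J @ u_tilde       :", J @ u_tilde)
print("finite difference :", fd)
```

If this is right, the $F$ term of the variation would be $-\int_0^{T_{\text{max}}} \lambda^T \theta_1 \operatorname{diag}\big(\phi(\theta_1 u + \theta_2)\big)\, \tilde{u} \, dt$, but I would appreciate confirmation that this is the correct way to write it in the weak sense.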
Any suggestions of how to properly do the chain rule/write it down would be extremely helpful.
Thank you