So I'm slightly confused about the gradient.
How is the existence of the gradient related to linearity?
The reason for asking is that I read a paper on shape optimization that states:
If the derivative is linear with respect to the vector field $V$, then there exists a gradient $\nabla J$ and $$dJ(\Omega; V) = \langle \nabla J , V\rangle_{\partial \Omega}=\int_{\partial \Omega} \nabla J(s) V(s) \, ds.$$
Why the assumption of linearity?
Given a Banach space $X$ over a field $\mathbb{K}$ and a function $f : X \to \mathbb{K}$, its directional derivative $df(x,v)$ at a point $x \in X$ in the direction $v \in X$ is defined as $$ df : X \times X \to \mathbb{K}, \quad (x,v) \mapsto \lim_{t \to 0} \frac{f(x + tv) - f(x)}{t}. $$ If this derivative is linear (and continuous) in $v$, we may define the gradient as a family of functionals $\nabla f(x) \in X^*$, parametrised by $x$, through $$ \langle \nabla f(x), v \rangle = df(x,v). $$

The point is that a priori it is not clear whether $df(x,\cdot)$ is linear in $v$, and indeed this assumption fails for some functions. A standard example is $f(x,y) = x^2 y/(x^2+y^2)$ with $f(0,0) = 0$: every directional derivative at the origin exists, namely $df((0,0);(a,b)) = a^2 b/(a^2+b^2)$, but this is not additive in $(a,b)$. In such a case the gradient does not exist, even though all directional derivatives do.
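To make the failure of linearity concrete, here is a small numerical sketch (my own illustration, not taken from the paper) using $f(x,y) = x^2 y/(x^2+y^2)$ on $\mathbb{R}^2$: the directional derivatives at the origin along $e_1$ and $e_2$ both vanish, yet the derivative along $e_1 + e_2$ does not, so $df((0,0);\cdot)$ cannot be linear.

```python
# Numerical check that a directional derivative can exist in every
# direction yet fail to be linear, so no gradient exists at that point.
#   f(x, y) = x^2 * y / (x^2 + y^2),  with  f(0, 0) = 0.

def f(x, y):
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * x * y / (x * x + y * y)

def directional_derivative(a, b, t=1e-6):
    # Forward-difference approximation of df((0,0); (a,b)).
    return (f(t * a, t * b) - f(0.0, 0.0)) / t

d_e1 = directional_derivative(1, 0)   # along (1, 0): equals 0
d_e2 = directional_derivative(0, 1)   # along (0, 1): equals 0
d_sum = directional_derivative(1, 1)  # along (1, 1): equals 1/2

print(d_e1, d_e2, d_sum)
# Additivity fails: df(0; e1 + e2) != df(0; e1) + df(0; e2),
# hence df(0; .) is not linear and no gradient can represent it.
```

Note that $df((0,0);\cdot)$ here is even positively homogeneous; it is specifically additivity that breaks, which is enough to rule out a representing functional $\nabla f(0) \in X^*$.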
See also the difference between the Fréchet derivative and the Gâteaux derivative.