This question is about a naive approach to non-linear hyperbolic systems, with elasticity as the motivating context.
To set up the problem, suppose $\Omega\subset \mathbb{R}^n$ is open and bounded. There is a stored energy function $W$, a $C^2$ function of $n\times n$ matrices, such that $W$ is strictly convex (incorrect for elasticity, but it might simplify things) and bounded/coercive, i.e. for some $C>0$ and $p\geq 2$, $C^{-1}|\mathbf{F}|^p\leq W(\mathbf{F})\leq C(|\mathbf{F}|^p+1)$ and $\left|\frac{\partial W(\mathbf{F})}{\partial \mathbf{F}}\right|\leq C(|\mathbf{F}|^{p-1}+1)$. This determines the potential energy of a deformation $\mathbf{f}\in W^{1,p}_0(\Omega)^n$ as $$ V(\mathbf{f})=\int_\Omega W(\nabla \mathbf{f})dV $$ and suppose the kinetic energy is given, for $\dot{\mathbf{f}}\in L^2(\Omega)^n$, by $$ T(\dot{\mathbf{f}})=\int_\Omega \frac{\rho}{2} |\dot{\mathbf{f}}|^2dV. $$
Now I love the PDE book by Evans, and following his treatment of linear second-order hyperbolic equations suggests to me that we should work in a space like $\mathbf{f}\in L^p(0,T;W_0^{1,p}(\Omega)^n)$ with a weak time derivative $\dot{\mathbf{f}}\in L^2(0,T;L^2(\Omega)^n)$ (here $T$ in the time interval $(0,T)$ is the final time, not the kinetic energy). Then there is a momentum $\mathbf{p}=\rho \dot{\mathbf{f}}$ that has $\mathbf{p}\in L^2(0,T;L^2(\Omega)^n)$ and a weak derivative $\dot{\mathbf{p}}\in L^q(0,T;W^{-1,q}(\Omega)^n)$, where $q=p/(p-1)$ is the conjugate exponent of $p$. The Hamiltonian would be the total energy $$ H(\mathbf{f},\mathbf{p})= \int_\Omega \left[W(\nabla \mathbf{f})+\frac{1}{2\rho}|\mathbf{p}|^2\right]dV $$ and the dynamic equations are \begin{align} \langle\rho\ddot{\mathbf{f}},\mathbf{w}\rangle=\langle \dot{\mathbf{p}},\mathbf{w}\rangle=-\int_\Omega \nabla \mathbf{w}\cdot \frac{\partial W}{\partial \mathbf{F}}(\nabla \mathbf{f}) dV. \end{align} This should be satisfied for almost every $t$ and all $\mathbf{w}\in W^{1,p}_0(\Omega)^n$. Initial conditions might be posed for arbitrary $\mathbf{f}_0\in W^{1,p}_0(\Omega)^n$ and $\dot{\mathbf{f}}_0\in L^2(\Omega)^n$.
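To make the choice of space for $\dot{\mathbf{p}}$ explicit, here is a short sketch of why the growth condition on $\partial W/\partial\mathbf{F}$ puts the right-hand side of the dynamic equation in the dual of $W^{1,p}_0(\Omega)^n$. It assumes $q=p/(p-1)$ is the conjugate exponent of $p$ and that $W^{1,p}_0(\Omega)^n$ is normed by $\|\nabla\mathbf{w}\|_{L^p}$ (equivalent to the full norm by Poincaré's inequality on bounded $\Omega$). Using $(a+b)^q\leq 2^{q-1}(a^q+b^q)$ and $(p-1)q=p$,
$$ \left|\frac{\partial W}{\partial \mathbf{F}}(\nabla\mathbf{f})\right|^q\leq C^q\left(|\nabla\mathbf{f}|^{p-1}+1\right)^q\leq 2^{q-1}C^q\left(|\nabla\mathbf{f}|^{p}+1\right), $$
so that, integrating over $\Omega$ and applying Hölder's inequality,
$$ \left|\int_\Omega \nabla \mathbf{w}\cdot \frac{\partial W}{\partial \mathbf{F}}(\nabla \mathbf{f})\, dV\right|\leq \left\|\frac{\partial W}{\partial \mathbf{F}}(\nabla \mathbf{f})\right\|_{L^q}\|\nabla\mathbf{w}\|_{L^p},\qquad \left\|\frac{\partial W}{\partial \mathbf{F}}(\nabla \mathbf{f})\right\|_{L^q}^q\leq 2^{q-1}C^q\left(\|\nabla\mathbf{f}\|_{L^p}^p+|\Omega|\right). $$
Under these assumptions $\|\dot{\mathbf{p}}(t)\|_{W^{-1,q}}^q$ is controlled by $\|\nabla\mathbf{f}(t)\|_{L^p}^p+|\Omega|$, which is integrable on $(0,T)$ whenever $\mathbf{f}\in L^p(0,T;W_0^{1,p}(\Omega)^n)$.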
On the one hand, it seems that the Galerkin method should still work to construct solutions, since conservation of energy implies uniform bounds on the norms of $\mathbf{f}$ and $\dot{\mathbf{f}}$. On the other hand, it seems absurd to have a velocity field in $L^2(\Omega)^n$ and expect the deformation to live in $W^{1,p}_0(\Omega)^n$.
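The energy bound alluded to here can be sketched as follows. This is only a formal computation at the level of Galerkin approximations, under two assumptions that are not stated in the setup above: $\rho>0$ is constant, and $\mathbf{f}_m$ denotes a (hypothetical) Galerkin approximation that takes values in a finite-dimensional subspace of $W^{1,p}_0(\Omega)^n$ and satisfies the dynamic equation for all $\mathbf{w}$ in that subspace. Then $\mathbf{w}=\dot{\mathbf{f}}_m(t)$ is an admissible test function and
$$ \frac{d}{dt}\int_\Omega \left[W(\nabla \mathbf{f}_m)+\frac{\rho}{2}|\dot{\mathbf{f}}_m|^2\right]dV=\int_\Omega \nabla \dot{\mathbf{f}}_m\cdot \frac{\partial W}{\partial \mathbf{F}}(\nabla \mathbf{f}_m)\, dV+\langle\rho\ddot{\mathbf{f}}_m,\dot{\mathbf{f}}_m\rangle=0. $$
Combining this with the coercivity assumption $C^{-1}|\mathbf{F}|^p\leq W(\mathbf{F})$ gives, for every $t\in[0,T]$,
$$ C^{-1}\|\nabla \mathbf{f}_m(t)\|_{L^p}^p+\frac{\rho}{2}\|\dot{\mathbf{f}}_m(t)\|_{L^2}^2\leq \int_\Omega \left[W(\nabla \mathbf{f}_m(0))+\frac{\rho}{2}|\dot{\mathbf{f}}_m(0)|^2\right]dV. $$
If the approximate initial data are chosen so that the right-hand side stays bounded in $m$, this would give bounds on $\mathbf{f}_m$ in $L^\infty(0,T;W_0^{1,p}(\Omega)^n)$ and on $\dot{\mathbf{f}}_m$ in $L^\infty(0,T;L^2(\Omega)^n)$ that are uniform in $m$, with the two different exponents coming from the two different terms of the energy.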
So does anyone know if this is a reasonable way to go about the non-linear theory? Do problems arise from the mismatch of exponents in these time-dependent spaces? I hope the problem makes sense and that reading it did not waste your time. Thanks!