I am recreating results for an algorithm published in Shutyaev et al. (2018). Essentially, I take optimal solutions I have found for a PDE-constrained minimisation problem and use them to find the Hessian of my original cost function $J$ with respect to a parameter control variable $\beta(x)$. Given the optimal control $\beta$ minimising $J$, the corresponding optimal solutions $(\eta(x,t), u(x,t))$, and an appropriate adjoint solution $(\eta^*(x,t), u^*(x,t))$, the Hessian $\mathcal{H}$ of the cost function $J$ acting on some $v := v(x)$ is defined via the successive solution of
Tangent model with initial-time conditions \begin{align} \frac{\partial\hat{\eta}}{\partial t} + \frac{\partial \hat{u}}{\partial x} + \frac{\partial \big(u\hat{\eta}\big)}{\partial x} + \frac{\partial \big(\hat{u} \eta\big)}{\partial x} - \frac{\partial \big(v u\big)}{\partial x} - \frac{\partial \big(\beta \hat{u}\big)}{\partial x} = 0, & \\ \frac{\partial\hat{u}}{\partial t} + \frac{\partial\big(\hat{u} u\big)}{\partial x} + \frac{\partial \hat{\eta}}{\partial x} = 0, & \\ \hat{\eta}(x,0) = 0, \ \hat{u}(x,0) = 0. & \end{align}
Second-order adjoint system with final-time conditions \begin{align} \frac{\partial \bar{\eta}}{\partial t} + \hat{u} \frac{\partial \eta^*}{\partial x} + u \frac{\partial\bar{\eta}}{\partial x} + \frac{\partial \bar{u}}{\partial x} = - \hat{\eta}(x_i,t; v), & \\ \frac{\partial \bar{u}}{\partial t} + (1 + \eta - \beta) \frac{\partial \bar{\eta}}{\partial x} + (\hat{\eta} - v) \frac{\partial \eta^*}{\partial x} + u \frac{\partial \bar{u}}{\partial x} + \hat{u} \frac{\partial u^*}{\partial x} = 0, & \\ \bar{\eta}(x,T) = 0, \ \bar{u}(x,T) = 0, & \end{align} where $\hat{\eta}(x_i,t; v)$ is the solution $\hat{\eta}$ given $v$, evaluated at a small set of known spatial locations $x_i$. The Hessian acting on $v$ is then defined as \begin{equation} \mathcal{H}v = \int_0^T \Big( \hat{u} \frac{\partial \eta^*}{\partial x} + u \frac{\partial \bar{\eta}}{\partial x} \Big) \, dt. \end{equation}
Now, at a later stage of the paper, I am required to solve $\mathcal{H}v = F$ for $v$, where $\mathcal{H}$ is defined as above and $F := F(x)$ is a forcing term I have already calculated, but I am struggling to find an appropriate way to do so. If I were to use numerical linear algebra methods, I would need to discretise the operator $\mathcal{H}$ so that I could write it as a matrix $H$ applied to a vector $v$, and then solve the matrix equation $Hv = F$; I am struggling with how to discretise the expression in such a way. Alternatively, I tried to formulate this as an optimisation problem, seeking the minimiser $v$ of $\| \mathcal{H}v - F \|_{\infty}$, but given the nonlinearity and nonconvexity of the objective I keep hitting local minima and failing to converge.
Any insights on how to approach this would be extremely helpful!
Assuming you can solve the above PDEs relatively cheaply, you don't actually have to compute the matrix $H$, just the action of $H$ on the discretisation of $v$; you can then use a matrix-free Krylov linear solver, which is typical for these types of problems. All you have to do is write a function that takes in a discretisation of $v$ and returns the corresponding discretisation of $\mathcal{H}v$. You can then use something like GMRES to solve the linear system with this construction.
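A minimal sketch of this matrix-free setup with SciPy: `apply_hessian` below is a hypothetical stand-in for your actual routine, which would (1) solve the tangent model forward in time given $v$, (2) solve the second-order adjoint system backward in time, and (3) evaluate the time integral defining $(\mathcal{H}v)(x)$ on the spatial grid. Here it is replaced by a simple symmetric tridiagonal action so the sketch runs end to end.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

def apply_hessian(v):
    """Stand-in for the Hessian-vector product: in your code this would
    run the tangent solve, the second-order adjoint solve, and the
    time integral. Here: a symmetric tridiagonal operator 4v_i - v_{i-1} - v_{i+1}."""
    Hv = 4.0 * v
    Hv[:-1] -= v[1:]
    Hv[1:] -= v[:-1]
    return Hv

n = 200                                      # number of spatial grid points
H = LinearOperator((n, n), matvec=apply_hessian)

F = np.ones(n)                               # discretised right-hand side F(x)
v, info = gmres(H, F)                        # info == 0 signals convergence
```

Note that the solver only ever calls `matvec`, so the cost per iteration is one tangent/adjoint solve pair. Since a Hessian is symmetric (and positive semidefinite at a minimum), you could also try CG or MINRES instead of GMRES, which halves the storage and exploits the symmetry.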