Differential equations: multiple shooting methods for parameter estimation on a system of delay differential equations

107 Views Asked by At

I am working on a system of delay differential equations with constant lags. Right now I am just working on getting the code pipeline built, so the model is still very simple, but I will build out the full system soon enough. I also have some time series data, and I want to fit the parameters of the model to that data. So let us define the system as:

$$ \Delta x_{n+1} = f(x_{n-\tau}, \theta)\\ \Delta y_{n+1} = f(y_{n-2\tau}, \eta) $$

where $x, y$ are variables representing the state of the system, $n-\tau, n-2\tau$ are fixed time lags, and $\theta, \eta$ are vectors of parameters.

My goal is to use multiple shooting to fit the parameters $\theta, \eta$ on smaller sequential subsets of the full time series data. Hence if I have a set of 30 datapoints, I might break the full data into groups of 3 or 4 points, and optimize the parameter values such that the loss is minimized over all groups. The Loss function for multiple shooting is just the usual squared error loss, plus a smoothness/continuity penalty to ensure that the predicted value of the last point in each group is as close as possible to the first datapoint in the subsequent group.

The question is, what is the algorithm for applying multiple shooting to delay differential equations? In a simple ODE with no delays, the solver will treat the first observation in each group as the initial value, and integrate forward till the end of the group. Then the loss is computed between the data and the predicted values and the parameters are updated to reduce that loss.

However, when solving a delay differential equation, the solver will track a "history" function, that monitors the state variables at each fixed delay. So if I have a group size of 4, and a lag value of 2, then the history function is usually initialized to some value for timesteps -2.0:0, and then the DDE solver will start to use the actual computed state from timestep 0.0:T.

This is where things start to get confusing. So in a multiple shooting scenario, the solver must track the history function for each group. But since the history happens before the initial observation for each group, this creates a dependency between groups. For example if the delay is fixed at 2 and the group size is 4, that implies that the first observation in group 3 is dependent on the 3rd observation in group 2.

So I was just trying to understand the correct way to set up multiple shooting on a delay differential equation system. Would the history function of each group be the trajectory computed for the previous group? Is that a numerically stable way to do it, or will that approach generate singularities and non-unique solutions, etc.