I have to fit data to a system of two differential equations in order to estimate parameter values. The model is on pathogen and cell interaction.
The ODE system is:
${dA\over dt}=r_1 A-p A(1-{B\over k_2})+eB$
${dB \over dt}=r_2 B + p A(1-{B\over k_2}) -e B$
Here A are the pathogens that are outside of cells, B are the pathogens that are within cells.
I want to estimate the values of $r_2$ (replication rate of pathogen inside the cell), e (exit rate of pathogen from the cell) and p (rate in which pathogens are internalized within cells). The other parameters are known constants.
To estimate these, I have found some data from an article. They have measured the pathogen load exiting from the cells (from which I can find the exit rate ‘e’) and they have also plotted the entire pathogen population over time continuously as well (which includes pathogen replication, internalization, exit all).
The data that have measured exit are given as the total that exited over a time period of 2 hours. So for example data are like from 10-12 hours $10^3$ exited, 12-14 hours $10^5$ exited likewise.
So, when fitting these data to estimate the parameters I want to estimate ‘e’ from the data that shows the number of pathogen that exited over two hour time periods and the other parameters $r_2,p$ from the other data which shows the pathogen load over time.
I can’t first fit only to the exited data because then I don’t know $r_2$ and p.
So, how can I fit the ODE system for the two data sets simultaneously to estimate the three parameters.
In Matlab I know how to fit an ODE system to estimate parameters based on single data set, but I am finding it difficult to think how I can implement this kind of scenario.
${dA\over dt}=r_1 A-p A(1-{B\over k_2})+eB$
${dB \over dt}=r_2 B + p A(1-{B\over k_2}) -e B$
Sum of the two equations :
${d(A+B)\over dt}=r_1 A+r_2B$
$$y(t)=\frac{d(A+B)}{ dt}-r_1 A=r_2B$$
${d(A+B)\over dt}$ can be evaluated from numerical differentiation (see below).
So, there is only one unknown $r_2$ to be evaluated thanks to linear regression.
If the data consists in : $\quad\begin{cases} A_1,A_2,...,A_k,...,A_n \\ B_1,B_2,...,B_k,...,B_n \\t_1,t_2,...,t_k,...,t_n\end{cases}$ $$\left(\frac{d(A+B)}{ dt}\right)_k \simeq \frac{A_{k+1}+B_{k+1}-A_{k+1}-B_{k-1} }{t_{k+1}-t_{k-1}}$$
For $\quad k=2\quad\text{to}\quad k=n-1\quad$ compute $\quad y_k=\frac{A_{k+1}+B_{k+1}-A_{k+1}-B_{k-1} }{t_{k+1}-t_{k-1}}-r_1A_k$
Least square fitting to $\quad y(t)=r_2B(t)$ : $$r_2\simeq \frac{\sum_{k=2}^{n-1}y_kB_k}{\sum_{k=2}^{n-1}(B_k)^2}$$
NOTE :
The numerical computation of derivatives is subjected to important deviation if the scatter of the data is large. So, it is not the best method.
The numerical integration is much more robust.
$$Y(t)=A+B-r_1\int Adt=r_2\int B dt$$ Starting from $I_1=0$, compute $$\quad I_k=I_{k-1}+\frac12 (B_k+B_{k+1})(t_{k+1}-t_k)\quad\text{from}\quad k=2\quad\text{to}\quad k=n$$
Starting from $J_1=0$, compute $$\quad J_k=J_{k-1}+\frac12 (B_k+B_{k+1})(t_{k+1}-t_k)\quad\text{from}\quad k=2\quad\text{to}\quad k=n$$
Then compute : $$Y_k=A_k+B_k-r_1J_k$$ Least square fitting to $\quad Y(t)=r_2B(t)$ : $$r_2\simeq \frac{\sum_{k=2}^{n-1}Y_kB_k}{\sum_{k=2}^{n-1}(B_k)^2}$$
Coming back to :
$p (1-{B\over k_2})A-eB=-{dA\over dt}+r_1 A$
$p (1-{B\over k_2})A -e B={dB \over dt}-r_2 B$
where $r_1,r_2, k_2$ are known and the unknowns are $p$ and $e$.
Proceed the mean least square regression from the system of $2n$ equations (because they are $n$ equations for each one of the two above symbolic equations).
The derivatives can be approximated by numerical computation as it was done above, or alternatively the integral computation in interest of robustness:
$p \int (1-{B\over k_2})Adt-e\int Bdt=-A+r_1 \int Adt$
$p \int (1-{B\over k_2})Adt -e \int Bdt=B-r_2 \int Bdt$