Suppose we have the equation:
$$y(t_i) = C_0 + C_1 e^{-\lambda_1 t_i} + C_2 e^{\lambda_2 t_i}$$
where $C_0$, $C_1$, $\lambda_1$, $C_2$, and $\lambda_2$ are all $\ge 0$.
This is the sum of an exponential decay curve and an exponential growth curve.
Thus, one can see that
- $C_0$ is the sum of the asymptotic value of the decay part as $t \rightarrow \infty$ and the asymptotic value of the growth part as $t \rightarrow -\infty$
- $y(0) = C_0 + C_1 + C_2$
If I have a sample of $n$ observations, what are some good starting approximations for the estimates of $C_1$, $\lambda_1$, $C_2$, and $\lambda_2$ which I can then pass into a non-linear optimization algorithm? I've explored ODE solutions to these types of equations, but cannot find a good "least squares"-type function of the observed data.
I am considering partitioning the observations close to the position of the inflection point in the data, and estimating the $\lambda_1, \lambda_2$ from the least squares solution to the log transform of the data points to the left and the right, and then solving for $C_0, C_1,$ and $C_2$ after substituting $\hat{\lambda_1}$ and $\hat{\lambda_2}$ into the above equation.
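In code, the idea I have in mind is roughly the following (a rough sketch only; the baseline guess `c0_guess`, the helper name `initial_estimates`, and the use of `argmin` as a stand-in for the split point are all assumptions of this sketch):

```python
import numpy as np

def initial_estimates(t, y, c0_guess=None):
    """Rough starting values for C1, lambda1, C2, lambda2 via partitioning
    and log-transformed linear least squares on each side of the split."""
    if c0_guess is None:
        # crude baseline assumption: slightly below the smallest observation,
        # so the shifted data stays strictly positive for the log transform
        c0_guess = 0.9 * y.min()
    r = y - c0_guess                      # data with the baseline removed
    k = np.argmin(y)                      # crude stand-in for the split point
    # left of the split: decay dominates, log r ~ log C1 - lambda1 * t
    sl, il = np.polyfit(t[:k + 1], np.log(r[:k + 1]), 1)
    lam1, C1 = -sl, np.exp(il)
    # right of the split: growth dominates, log r ~ log C2 + lambda2 * t
    sr, ir = np.polyfit(t[k:], np.log(r[k:]), 1)
    lam2, C2 = sr, np.exp(ir)
    return C1, lam1, C2, lam2
```

The estimates are only ballpark values (the baseline guess biases both slopes), but that may be all a non-linear optimizer needs.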
Any alternative solutions would be greatly appreciated.
UPDATE:
@JJacquelin gave an excellent answer. I worked out the solution myself to confirm that the equations hold.
Here is an illustration:
However, there is at least one instance where the approach breaks down:
- The expression $p, q = \frac{B \pm \sqrt{B^2 + 4A}}{2}$ cannot be evaluated (over the reals) when $B^2 + 4A < 0$.
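Guarding that computation is straightforward (plain Python; `roots_pq` is a hypothetical helper name, with $p, q$ taken as the roots of $z^2 - Bz - A = 0$):

```python
import math

def roots_pq(A, B):
    """p, q = (B ± sqrt(B^2 + 4A)) / 2, or None when the discriminant
    is negative and the roots would be complex (the breakdown case)."""
    disc = B * B + 4.0 * A
    if disc < 0.0:
        return None
    s = math.sqrt(disc)
    return (B + s) / 2.0, (B - s) / 2.0
```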

There are many variant methods to solve this kind of regression problem. They involve iterative numerical computation, starting from "guessed" values of the parameters to be estimated. The question raised by the OP is how to find sufficiently accurate initial values: poor guesses are often a main cause of low robustness.
There is a non-traditional method to overcome this difficulty of "guesses". The method is not iterative and does not require initial values. The general principle is described in the paper: https://fr.scribd.com/doc/14674814/Regressions-et-equations-integrales .
The case of fitting a two-exponential function is treated on pages 71-72, and a numerical example is given on p. 73. But the function considered there is: $$y(x)=b\:e^{p\:x}+ c\:e^{q\:x}$$ There are four parameters $p,q,b,c$. This is slightly different from the problem raised by the OP, which has five parameters, so the 4×4 matrix must be replaced by a 5×5 matrix. The updated process is shown in the sheet below. To be consistent with the symbols used in the referenced paper, the function considered is: $$y(x)=a+b\:e^{p\:x}+ c\:e^{q\:x}$$ The five parameters to compute are $p,q,a,b,c$.
A numerical example is given below:
In this method, the linearization of the regression is obtained thanks to an integral equation. This requires some numerical integrations, which correspond to the computation of $S_k$ and $SS_k$. That is a key point: if the numerical integration were exact, the final result would be perfect. But the numerical integration is not exact, which causes deviations, especially if there are few points or if the points are not well distributed. As a consequence, the values of $p,q,a,b,c$ obtained are approximations, generally not far from the correct values (provided the numerical integrations are accurate enough, which requires some conditions on the data, especially a sufficiently large number of points).
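A minimal NumPy sketch of this linearization, assuming trapezoid-rule integration for $S_k$ and $SS_k$ (the sheet itself is not reproduced here, and a single `lstsq` call stands in for writing out the 5×5 normal equations explicitly):

```python
import numpy as np

def fit_two_exp(x, y):
    """Non-iterative estimates of p, q, a, b, c for y = a + b e^{p x} + c e^{q x}."""
    # S_k = cumulative integral of y; SS_k = cumulative integral of S (trapezoid rule)
    S = np.concatenate([[0.0], np.cumsum(0.5 * (y[1:] + y[:-1]) * np.diff(x))])
    SS = np.concatenate([[0.0], np.cumsum(0.5 * (S[1:] + S[:-1]) * np.diff(x))])
    # linear regression of y on SS, S, x^2, x, 1 (the 5x5 system)
    M = np.column_stack([SS, S, x**2, x, np.ones_like(x)])
    A, B, *_ = np.linalg.lstsq(M, y, rcond=None)[0]
    # A = -p*q and B = p + q, so p, q are the roots of z^2 - B z - A = 0
    s = np.sqrt(B**2 + 4.0 * A)
    p, q = (B + s) / 2.0, (B - s) / 2.0
    # second (3x3) linear regression for a, b, c with p, q now fixed
    N = np.column_stack([np.ones_like(x), np.exp(p * x), np.exp(q * x)])
    a, b, c = np.linalg.lstsq(N, y, rcond=None)[0]
    return p, q, a, b, c
```

With many well-distributed points the trapezoid error is small and the recovered parameters are close to the true ones; with few or irregular points the deviation described above appears.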
If the overall result is not accurate enough, one cannot avoid using a common iterative process in order to improve it. But the main advantage is that no guessed values are necessary: as initial values, one can use the approximations already found by the above method. This makes the whole process robust.
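For example, a bare-bones Gauss-Newton loop (just a stand-in for whatever standard iterative fitter one prefers), seeded with such approximations rather than blind guesses; the particular starting values below are hypothetical:

```python
import numpy as np

def refine(x, y, p, q, a, b, c, iters=50):
    """Iteratively improve (p, q, a, b, c) for y = a + b e^{p x} + c e^{q x}."""
    theta = np.array([p, q, a, b, c], dtype=float)
    for _ in range(iters):
        p, q, a, b, c = theta
        ep, eq = np.exp(p * x), np.exp(q * x)
        r = y - (a + b * ep + c * eq)          # current residuals
        # Jacobian of the model with respect to (p, q, a, b, c)
        J = np.column_stack([b * x * ep, c * x * eq, np.ones_like(x), ep, eq])
        theta += np.linalg.lstsq(J, r, rcond=None)[0]
    return theta
```

Started near the solution, as the non-iterative estimates typically are, the loop converges quickly; a production fitter would add damping (Levenberg-Marquardt) and a stopping test.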
FOR INFORMATION:
In addition to the theory given in the referenced paper, the integral equation used here is: $$y(x)=-pq\int\!\!\int y\:dx\:dx+(p+q)\int y\:dx+C\:x^2+D\:x+E$$ The parameters $a,b,c$ and the limits of the definite integrals appear in a complicated manner in the coefficients $C,D,E$. It is much simpler to do a linear regression for $a,b,c$ than to try to derive them from $C,D,E$. That is what is finally done with the 3×3 matrix.