Help deriving linear regression algorithm for fitting triexponential function


I'm trying to derive a method for determining parameters for the following formula using the method of integral equations as discussed in the accepted answer to this question:

Fit sum of exponentials

$$ y=a+be^{px}+ce^{qx}+de^{rx} $$

I've tried to rewrite the above equation sans the constant without using exponentials by substituting in the first, second, and third integrals: $S$, $SS$, $SSS$.

$$ y'=A (SSS)+B(-SS)+C(S)+D x^2+E x + F $$

I've then tried to regress the data on this equation to determine the coefficients, which get plugged into the result of solving the following set of nonlinear equations:

$A=p q r$, $B=qr+pr+pq$, $C=p+q+r$.

The resulting analytical solutions were very large, but when simplifying with example data, I discovered they always included an imaginary number. Can you suggest any pointers towards what I might have done wrong? Upon reflecting on my words here, I thought maybe I should not have left out the $a$ when integrating to find $S$, $SS$, and $SSS$. While that does change the regression coefficients, it doesn't seem to change the $p, q, r$ finding equation, nor its problem of producing imaginary numbers.

This new regression equation (with $a$ constant) is:

$$ y = A(SSS)+B(-SS)+C(S)+D x^3+E x^2 + F x + G $$

On BEST ANSWER

$$ y=a+be^{px}+ce^{qx}+de^{rx}\tag 1 $$ You correctly found the associated integral equation: $$ y = A(SSS)+B(-SS)+C(S)+D x^3+E x^2 + F x + G $$ This leads to a linear regression for the seven unknowns $A,B,C,D,E,F,G$.
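As a brief sketch of where this integral equation comes from (assuming, since the question does not spell it out, that $S$, $SS$, $SSS$ are the successive integrals of $y$ taken from the first abscissa $x_1$): $y-a$ is annihilated by the operator with characteristic roots $p,q,r$, so

$$ \left(\frac{d}{dx}-p\right)\left(\frac{d}{dx}-q\right)\left(\frac{d}{dx}-r\right)(y-a)=0 \quad\Longrightarrow\quad \frac{d^3y}{dx^3}-C\,\frac{d^2y}{dx^2}+B\,\frac{dy}{dx}-A\,y=-A\,a $$

with $A=pqr$, $B=qr+pr+pq$, $C=p+q+r$. Integrating three times from $x_1$ to $x$ gives

$$ y-C\,S+B\,SS-A\,SSS=Dx^3+Ex^2+Fx+G $$

where the cubic polynomial collects the integration constants together with the term $-\frac{Aa}{6}(x-x_1)^3$. In particular $D=-\frac{Aa}{6}$, which is why the $x^3$ term disappears when $a=0$.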

Only $A,B,C$ are needed to compute $p,q,r$, which are the roots of the cubic equation $t^3-Ct^2+Bt-A=0$ that follows from the system: $$\begin{cases}A=p q r\\ B=qr+pr+pq \\C=p+q+r \end{cases}$$ Then one substitutes the approximate values of $p,q,r$ into Eq. $(1)$. Finally, approximate values of $a,b,c,d$ are computed by a four-parameter linear regression.
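The three steps just described can be sketched in Python with NumPy. This is only an illustrative sketch under stated assumptions, not the code behind this answer: the names `cumulative_trapezoid` and `fit_triexponential` are made up for the example, the integrals are approximated with the trapezoidal rule starting from the first data point, the column scaling before the regression is an added precaution, and the cubic is solved numerically with `np.roots` rather than with a closed-form formula.

```python
import numpy as np


def cumulative_trapezoid(y, x):
    """Running trapezoidal integral of y over x, equal to 0 at x[0]."""
    out = np.zeros(len(x))
    out[1:] = np.cumsum(0.5 * (y[1:] + y[:-1]) * np.diff(x))
    return out


def fit_triexponential(x, y):
    """Return (rates, amplitudes) for y ~ a + b e^{px} + c e^{qx} + d e^{rx}.

    rates holds the three roots of the cubic. amplitudes holds a followed by
    the coefficients in the same order as rates, or is None when the cubic
    has complex roots (hypothetical convention chosen for this sketch).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Successive numerical integrals S, SS, SSS (trapezoidal rule, an assumption).
    S = cumulative_trapezoid(y, x)
    SS = cumulative_trapezoid(S, x)
    SSS = cumulative_trapezoid(SS, x)

    # Step 1: seven-parameter linear regression for A, B, C, D, E, F, G.
    M = np.column_stack([SSS, -SS, S, x**3, x**2, x, np.ones_like(x)])
    scale = np.linalg.norm(M, axis=0)  # column scaling, added for conditioning
    coef = np.linalg.lstsq(M / scale, y, rcond=None)[0] / scale
    A, B, C = coef[:3]

    # Step 2: p, q, r are the roots of t^3 - C t^2 + B t - A = 0.
    roots = np.roots([1.0, -C, B, -A])
    if np.any(np.abs(roots.imag) > 1e-8 * np.abs(roots).max()):
        return roots, None  # complex roots: the real-exponential model failed
    rates = np.sort(roots.real)

    # Step 3: four-parameter linear regression for a, b, c, d.
    E = np.column_stack([np.ones_like(x)] + [np.exp(k * x) for k in rates])
    amplitudes = np.linalg.lstsq(E, y, rcond=None)[0]
    return rates, amplitudes


# Usage on exact simulated data (values chosen arbitrarily for the example).
x = np.linspace(0.0, 2.0, 2001)
y = 0.5 + 2.0 * np.exp(-1.0 * x) - 1.5 * np.exp(-4.0 * x) + 1.0 * np.exp(-12.0 * x)
rates, amplitudes = fit_triexponential(x, y)
print("rates:", rates)
print("amplitudes:", amplitudes)
```

Returning the raw roots when they are complex, instead of raising an error, makes it easy to see how far the regression has drifted from the real-exponential model.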

But all of the above is purely theoretical: it is exact only if there is no scatter in the data, no error in the successive numerical integrations, and no error in the large matrix computation.

Seven parameters is a lot for a regression and requires a large data set to be accurate. An aggravating factor is that the exponential functions produce a mix of very small and very large numerical values, which degrades the accuracy of the numerical matrix computation.

In practice, if the scatter in the initial data is too large and/or the number of experimental points is too low (even without scatter), the successive numerical integrations introduce errors that make the computation fail (complex roots instead of real ones, for example).

It is recommended to first test the computer code with exact simulated data (no scatter) and with a large number of closely spaced points. Then try it with true experimental data.
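A minimal sketch of such a preliminary test, again in Python with NumPy, might look like the following. Everything specific here is an assumption made for illustration: the helper names `running_integral`, `estimate_rates` and `simulate`, the trapezoidal integration, the column scaling, the "true" parameter values, and the noise levels. It only estimates the roots of the cubic, since that is the step where complex values show up, and it prints the results rather than asserting what they should be.

```python
import numpy as np


def running_integral(y, x):
    """Running trapezoidal integral of y over x, equal to 0 at x[0]."""
    out = np.zeros(len(x))
    out[1:] = np.cumsum(0.5 * (y[1:] + y[:-1]) * np.diff(x))
    return out


def estimate_rates(x, y):
    """Roots of t^3 - C t^2 + B t - A = 0 from the regression; may be complex."""
    S = running_integral(y, x)
    SS = running_integral(S, x)
    SSS = running_integral(SS, x)
    M = np.column_stack([SSS, -SS, S, x**3, x**2, x, np.ones_like(x)])
    scale = np.linalg.norm(M, axis=0)  # column scaling, added for conditioning
    A, B, C = (np.linalg.lstsq(M / scale, y, rcond=None)[0] / scale)[:3]
    return np.roots([1.0, -C, B, -A])


def simulate(n_points, noise_sd, seed=0):
    """Simulated data with known parameters (arbitrary illustrative values)."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 2.0, n_points)
    y = 0.5 + 2.0 * np.exp(-x) - 1.5 * np.exp(-4.0 * x) + np.exp(-12.0 * x)
    return x, y + noise_sd * rng.normal(size=n_points)


# Exact dense data first, then scattered data, then exact but sparse data.
for n_points, noise_sd in [(2001, 0.0), (2001, 0.01), (21, 0.0)]:
    x, y = simulate(n_points, noise_sd)
    roots = estimate_rates(x, y)
    print(f"n={n_points:5d} noise={noise_sd:<5} "
          f"complex={np.iscomplexobj(roots)} roots={np.sort_complex(roots)}")
```

Comparing the printed roots with the simulated rates $-1$, $-4$, $-12$ shows how much each kind of degradation (scatter, too few points) moves the estimates before real data are tried.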

NOTE :

When, on a limited range, the three exponentials are close to one another (difficult to distinguish from each other), this situation might in some cases be confused with a single "mean" exponential and a small superimposed sinusoid, that is, of the form: $$y\simeq a+be^{px}+c\sin(\alpha x)+d\cos(\beta x)$$