Fitting Data to Model Equation


I am attempting to fit an equation that models how fast a chemical reaction progresses. The equation is:

$$\ln(k)= \ln\biggl(\frac{d\,k_0\,M}{1+P_r}\biggr)+\ln\biggl(\frac{a\exp(-b/T)+\exp(-T/c)}{1+s\ln^2(P_r)}\biggr)+e\,\ln T$$

where $$k_0=f\,T^g \exp\biggl(\frac{-h}{R\,T}\biggr)$$ and $$P_r=M\,\frac{f}{i}\,T^{g-j}\exp\biggl(\frac{k-h}{R\,T}\biggr)$$

The variables I would like to fit are $a$ through $k$. $M$ is a function determined outside of this regression; $s$ is a non-fit constant. I already have the derivatives and can perform a standard curve fit/least-squares fit; however, I don't have a method for finding a good initial guess. Since this will run inside an optimization scheme, the values of $\ln(k)$ will vary and the prior guess will not always be sufficient. I've found similar posts with answers by @JJacquelin, but I'm uncertain how to implement this method with my equation. An English translation of the Scribd papers that JJacquelin refers to can be found here.

Sample data is given as

$$\begin{array}{|c|c|c|}
\hline
T & M & \ln(k) \\ \hline
1513.8 & 7.95\times10^{-9} & -1.0378 \\ \hline
1889.7 & 6.36\times10^{-9} & 5.3839 \\ \hline
2513.8 & 4.78\times10^{-9} & 9.4501 \\ \hline
1513.8 & 7.95\times10^{2} & -0.6490 \\ \hline
1889.7 & 6.36\times10^{2} & 6.7692 \\ \hline
2513.8 & 4.78\times10^{2} & 13.6235 \\ \hline
1513.8 & 5.30\times10^{-3} & -0.7416 \\ \hline
1763.8 & 4.55\times10^{-3} & 4.2598 \\ \hline
2013.8 & 3.98\times10^{-3} & 7.5729 \\ \hline
2263.8 & 3.54\times10^{-3} & 9.8066 \\ \hline
2513.8 & 3.19\times10^{-3} & 11.3176 \\ \hline
\end{array}$$

and $s=0.1886$

Edit: A few additional notes.

  1. Fitting $a$ through $k$ may have been too ambitious and unnecessary. What if I instead fit only $a$ through $e$?
  2. The number of data points is intentionally kept small. I use the minimum number of $\ln k$ values to define the constants; however, I think I can have as many as 11 values define 5 variables if I fit only $a$ through $e$.
  3. The data that I am working with is numerical. It has machine precision and does not have uncertainty.
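
One way to see what the reduced problem in note 1 looks like, assuming $f$ through $k$ are held at known values: $k_0$ and $P_r$ can then be evaluated at every $(T, M)$ pair, so every term that does not contain $a$ through $e$ can be moved to the left-hand side. Here $y$ is a hypothetical name for the adjusted data, and $d>0$ is assumed so that $\ln d$ is defined:

$$y \equiv \ln(k)-\ln(k_0\,M)+\ln(1+P_r)+\ln\bigl(1+s\ln^2(P_r)\bigr) = \ln d+e\,\ln T+\ln\bigl(a\exp(-b/T)+\exp(-T/c)\bigr)$$

The right-hand side depends on $T$ only, and $\ln d$ and $e$ enter it linearly. For any trial values of $a$, $b$, $c$, those two parameters therefore follow from an ordinary linear least-squares fit of $y-\ln\bigl(a\exp(-b/T)+\exp(-T/c)\bigr)$ against $\ln T$, which leaves only three parameters needing a nonlinear initial guess.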

There is 1 best solution below


For this sort of regression problem, one of the standard algorithms for nonlinear least squares is the Levenberg-Marquardt algorithm. As for the initial guess, you could use the regressed parameters of a similar reaction system or the last set of parameters.
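
A minimal sketch of such a Levenberg-Marquardt fit, using SciPy's `least_squares` with `method="lm"` on the reduced problem (only $a$ through $e$ fitted). Several things here are assumptions rather than anything from the question: the values in `FIXED` for $f$ through $k$ are made up, $R$ is taken in J/(mol K), the "observed" $\ln k$ values are synthetic (generated from `theta_true` on the question's $T$, $M$ grid), and $a$, $c$, $d$ are fitted as logarithms so that every logarithm in the model stays defined during the iterations. The Jacobian is left to finite differences, although analytic derivatives could be passed through the `jac` argument.

```python
import numpy as np
from scipy.optimize import least_squares

R = 8.314    # J/(mol K); assumed units for h and k
S = 0.1886   # the non-fit constant s from the question

# T and M grid copied from the question's sample table
T = np.array([1513.8, 1889.7, 2513.8, 1513.8, 1889.7, 2513.8,
              1513.8, 1763.8, 2013.8, 2263.8, 2513.8])
M = np.array([7.95e-9, 6.36e-9, 4.78e-9, 7.95e2, 6.36e2, 4.78e2,
              5.30e-3, 4.55e-3, 3.98e-3, 3.54e-3, 3.19e-3])

# Hypothetical fixed values for f..k (NOT from the question)
FIXED = dict(f=1e10, g=0.5, h=2e5, i=1e3, j=0.0, k=1e5)


def ln_k_model(theta, T, M, fixed=FIXED, s=S):
    """ln(k) from the question's equation, evaluated in log space.

    theta = (ln a, b, ln c, ln d, e); the log parametrisation is an
    assumption made here to keep a, c, d positive.
    """
    ln_a, b, ln_c, ln_d, e = theta
    f, g, h, i, j, k = (fixed[name] for name in "fghijk")
    ln_k0 = np.log(f) + g * np.log(T) - h / (R * T)
    ln_Pr = (np.log(M) + np.log(f / i) + (g - j) * np.log(T)
             + (k - h) / (R * T))
    # ln(d k0 M / (1 + Pr)); logaddexp(0, x) = ln(1 + exp(x)) without overflow
    term1 = ln_d + ln_k0 + np.log(M) - np.logaddexp(0.0, ln_Pr)
    # ln(a exp(-b/T) + exp(-T/c)) - ln(1 + s ln^2(Pr))
    numer = np.logaddexp(ln_a - b / T, -T / np.exp(ln_c))
    term2 = numer - np.log1p(s * ln_Pr**2)
    return term1 + term2 + e * np.log(T)


# Synthetic target data from made-up "true" parameters
theta_true = np.array([np.log(2.0), 500.0, np.log(3000.0), np.log(1.5), 0.3])
ln_k_obs = ln_k_model(theta_true, T, M)


def residuals(theta):
    return ln_k_model(theta, T, M) - ln_k_obs


# Start from a perturbed guess, e.g. the previous iterate of an outer optimiser
theta0 = theta_true + np.array([0.2, 50.0, 0.1, 0.2, 0.05])
cost0 = 0.5 * np.sum(residuals(theta0) ** 2)

fit = least_squares(residuals, theta0, method="lm")
print("initial cost:", cost0, "final cost:", fit.cost)
print("fitted (ln a, b, ln c, ln d, e):", fit.x)
```

Because the fitted parameters enter only through a smooth function of $T$, the problem can be poorly conditioned, so a small final cost does not by itself mean the individual parameters are well determined.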

In your case, with models that are somewhat parsimonious, you can add regularization on the parameters to tame some of the problems; I think nonlinear ridge regression would help here. However, with such a nonlinear model and the number of data points equal to the number of regressed parameters, many of the fitted models will simply be nonsense.
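
One common way to implement a ridge-type penalty in a least-squares solver is to append the scaled parameter deviations to the residual vector, so that the sum of squares becomes the data misfit plus $\lambda\,\lVert(\theta-\theta_\text{prior})/\text{scale}\rVert^2$. The sketch below illustrates this on a deliberately simplified stand-in model, $\ln k = \ln A + n\ln T - E/T$, not on the question's full equation. It uses the first three rows of the question's table, giving three points for three parameters. The names `theta_prior`, `scale`, and `lam`, and all of their values, are hypothetical; in practice the prior would be the last accepted parameter set.

```python
import numpy as np
from scipy.optimize import least_squares

# First three rows of the question's table (the M ~ 1e-9 points)
T = np.array([1513.8, 1889.7, 2513.8])
ln_k_obs = np.array([-1.0378, 5.3839, 9.4501])


def toy_residuals(theta):
    # Toy stand-in model (an assumption, not the question's equation):
    # ln k = ln A + n ln T - Ea_over_R / T
    ln_A, n, Ea_over_R = theta
    return ln_A + n * np.log(T) - Ea_over_R / T - ln_k_obs


def ridge_residuals(theta, residual_fn, theta_prior, lam, scale):
    """Data residuals followed by sqrt(lam) * scaled deviation from the prior.

    Squaring and summing this vector gives
    sum(r_data**2) + lam * sum(((theta - theta_prior) / scale)**2).
    """
    penalty = np.sqrt(lam) * (theta - theta_prior) / scale
    return np.concatenate([residual_fn(theta), penalty])


theta_prior = np.array([30.0, 0.0, 4.0e4])  # hypothetical previous parameters
scale = np.array([10.0, 1.0, 1.0e4])        # hypothetical typical magnitudes
lam = 1e-2                                  # hypothetical penalty weight

fit = least_squares(ridge_residuals, theta_prior, method="lm",
                    args=(toy_residuals, theta_prior, lam, scale))
print("regularised parameters:", fit.x)
print("data residuals:", toy_residuals(fit.x))
```

The penalty weight trades data misfit against distance from the prior; with noise-free data it mainly serves to pick one solution out of a nearly flat valley rather than to suppress noise.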

As a side note from someone who has regressed reaction parameters from data: you can't really trust them to better than $\approx \pm 10\%$. Unless you are running this as a simulation, an appropriate error bar on this data set would be in the neighborhood of $1\%$ relative error for every reading (for reasonably well-maintained gauges and sensors).