Approximate a function using a least-squares fit with exponential and constant parameters


Given arrays $X$ and $Y$ of values describing a function $f$,

such that $\forall i.\; Y[i] = f(X[i])$,

I would like to approximate $f$ as $a + b \cdot x^n$, where $a$, $b$ and $n$ are parameters.

For the special case where we assume $a=0$, I know you can use the log trick $\log(y) = \log(b) + n \cdot \log(x)$, which is linear in $\log(x)$ and $\log(y)$ and can be solved using linear least squares.

However, for the case where we don't assume $a$ is zero, I don't know whether such a trick exists.

In case this isn't possible, any method that can estimate any of the parameters is welcome.


There are 2 answers below.

Best answer:

On purpose, I shall not use $n$ for naming the third parameter, since $n$ will denote the number of data points.

You have $n$ data points $(x_i,y_i)$ and you want to fit the model $$y=a+b\,x^c$$ which is nonlinear.

But suppose that you give $c$ an arbitrary value. Define $z_i=x_i^c$; the model then becomes linear, so you obtain $a(c)$ and $b(c)$ by explicitly solving the two normal equations $$\sum_{i=1}^n y_i=n \,a + b \sum_{i=1}^n x_i^c$$ $$\sum_{i=1}^n y_i\,x_i^c=a \sum_{i=1}^n x_i^c+ b \sum_{i=1}^n x_i^{2c}$$
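As a sketch of this step (the helper name `fit_ab` is mine, not from the answer), the two normal equations form a $2\times 2$ linear system in $a$ and $b$ once $c$ is fixed:

```python
import numpy as np

def fit_ab(x, y, c):
    """Solve the 2x2 normal equations for a and b with the exponent c fixed."""
    z = x ** c                                   # z_i = x_i^c makes the model y = a + b*z
    n = len(x)
    A = np.array([[n,        z.sum()],
                  [z.sum(), (z ** 2).sum()]])    # coefficient matrix of the normal equations
    rhs = np.array([y.sum(), (y * z).sum()])
    a, b = np.linalg.solve(A, rhs)
    return a, b
```

For the example data set used later in this answer, `fit_ab` at $c=3$ reproduces the tabulated values $a \approx 98.83$, $b \approx 6.13$.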

Now, define the sum of squares $$\text{SSQ}(c)=\sum_{i=1}^n \Bigg[\Big[a(c)+b(c)\, x_i^c\Big]-y_i \Bigg]^2$$ Run a few values of $c$ until you see more or less a minimum of $\text{SSQ}(c)$. At this point, you have reasonable and consistent estimates of the three parameters and you can launch the nonlinear regression.

If you do not have the appropriate tools, continue with the first step, zooming more and more around the minimum of the curve.
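A minimal NumPy sketch of this scan-and-zoom procedure (the grid bounds, step count and number of zooms are arbitrary illustrative choices, not part of the answer):

```python
import numpy as np

def ssq(x, y, c):
    """SSQ(c): fit a and b linearly for the fixed exponent c, return the sum of squares."""
    z = x ** c
    A = np.column_stack([np.ones_like(z), z])        # design matrix of the model y = a + b*z
    (a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
    return np.sum((a + b * z - y) ** 2)

def scan_min(x, y, lo, hi, steps=11, zooms=5):
    """Coarse grid scan on c, then repeatedly zoom in around the best grid point."""
    for _ in range(zooms):
        grid = np.linspace(lo, hi, steps)
        best = min(grid, key=lambda c: ssq(x, y, c))
        step = grid[1] - grid[0]
        lo, hi = best - step, best + step            # shrink the bracket around the minimum
    return best
```

Because $a$ and $b$ are fitted exactly for each $c$, the minimum of $\text{SSQ}(c)$ found this way coincides with the full nonlinear least-squares optimum.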

Let me make an example with the following data set $$\left( \begin{array}{cc} x & y \\ 1 & 100 \\ 2 & 150 \\ 3 & 300 \\ 4 & 500 \\ 5 & 850 \\ 6 & 1400 \\ 7 & 2200 \\ 8 & 3200 \\ 9 & 4600 \end{array} \right)$$ A first run gives $$\left( \begin{array}{cccc} c & \text{SSQ} & a & b \\ 1.0 & 2.73351\times 10^6 & -1176.39 & 530.833 \\ 1.5 & 1.35942\times 10^6 & -583.938 & 167.092 \\ 2.0 & 543086. & -264.608 & 55.0227 \\ 2.5 & 134106. & -54.6183 & 18.3401 \\ 3.0 & 4579.42 & 98.8254 & 6.12868 \\ 3.5 & 61444. & 218.079 & 2.04706 \\ 4.0 & 240341. & 314.471 & 0.682825 \end{array} \right)$$ So, the minimum is around the triplet $(a,b,c)=(98.82,\,6.13,\,3.0)$. Using the nonlinear regression leads to $R^2=0.999922$, $\text{SSQ}=3050.69$ and $$\begin{array}{cccc} & \text{Estimate} & \text{Standard Error} & \text{Confidence Interval} \\ a & 116.115 & 15.5316 & \{76.189,\,156.04\} \\ b & 5.30731 & 0.48405 & \{4.0630,\,6.5516\} \\ c & 3.06564 & 0.04152 & \{2.9589,\,3.1724\} \end{array}$$
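The final nonlinear refinement can be reproduced with SciPy's `curve_fit` (one possible tool; the answer does not say which software was used), starting from the rough triplet found by the scan:

```python
import numpy as np
from scipy.optimize import curve_fit

x = np.arange(1.0, 10.0)
y = np.array([100.0, 150, 300, 500, 850, 1400, 2200, 3200, 4600])

def model(x, a, b, c):
    return a + b * x ** c

# Start from the rough triplet (a, b, c) ~ (98.8, 6.13, 3.0) found by the grid scan
popt, pcov = curve_fit(model, x, y, p0=(98.8, 6.13, 3.0))
a, b, c = popt   # converges near (116.1, 5.31, 3.066), matching the table above
```

The diagonal of `pcov` gives the parameter variances, from which the quoted standard errors follow.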

and the predicted values are $$\{121,161,270,488,853,1406,2185,3231,4585\}$$

Notice that a parabolic interpolation through the values at $c=2.5,\,3.0,\,3.5$ gives a minimum of $\text{SSQ}$ at $c=3.097$.
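This parabolic interpolation is easy to check; the formula below is the standard vertex of a parabola through three equally spaced points:

```python
def parabolic_vertex(c1, s1, c2, s2, c3, s3):
    """Abscissa of the vertex of the parabola through (c1,s1), (c2,s2), (c3,s3),
    assuming equal spacing c2 - c1 == c3 - c2."""
    h = c2 - c1
    return c2 - 0.5 * h * (s3 - s1) / (s1 - 2.0 * s2 + s3)

# The three SSQ values tabulated above
c_star = parabolic_vertex(2.5, 134106.0, 3.0, 4579.42, 3.5, 61444.0)
# c_star is approximately 3.097, the value quoted in the answer
```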

Second answer:

I take the opportunity to cordially greet Claude Leibovici.

I would like to add a comment to his judicious answer (this comment is too long for the comments section).

Thanks to the numerical example provided by Claude Leibovici, one can compare with another (unconventional) method, whose advantage is that it is not iterative and does not require initial guesses. For the theory, see https://fr.scribd.com/doc/14674814/Regressions-et-equations-integrales . Since that paper is written in French, a translation of the algorithm is given below.

[Image not recovered: the translated algorithm from the cited paper]

The calculus is direct and very simple :

[Image not recovered: the numerical results of the calculation]

On the graph, the curve $y(x)$ drawn in blue from the above calculation and the one drawn in black from Claude Leibovici's calculation are practically indistinguishable.

Least mean square error: from the above calculation, 21.6; from Leibovici's calculation, 18.4.

Least mean square relative error: from the above calculation, 0.0047; from Leibovici's calculation, 0.0083.

This is not surprising, since the fitting criterion in Leibovici's calculation is the least mean square error.