Is there a general way to determine the best combination of parameters to fit points?


If the number of given points is greater than or equal to the number of parameters in the model, is it always possible to determine those parameters?

See my previous problem; Claude Leibovici answered it nicely, and it worked!

But say $y=ax+bx^2+\frac{c}{x}+\frac{\sin(dx)}{x^2}$ and the number of given points is greater than or equal to $4$ (the number of parameters $a,b,c,d$); say we have $9$ points. How can those parameters be determined for the best least-squares fit?

Not necessarily $y=ax+bx^2+\frac{c}{x}+\frac{\sin(dx)}{x^2}$, but say we have:

$y=f(a_1,a_2,a_3,\dots,a_n,x)$ (meaning that $y$ is expressed in terms of the parameters $a_1,a_2,a_3,\dots,a_n$ and of $x$), and we have $n$ or more known points. How can we find those $n$ parameters for the best least-squares fit?

This plot is an example only:

[Plot of the $9$ example data points]

There are $9$ points and $4$ parameters, so I think it can be done (at least numerically).


Is there a general technique/method for this?

Your help would be really appreciated. THANKS!


There are 2 best solutions below


General model fitting by least squares requires a nonlinear minimization algorithm, which finds the parameters that minimize the sum-of-squared-differences (SSD) fitting error

$$\epsilon(a,b,c,\cdots)=\sum_{k=1}^n(y_k-f(x_k;a,b,c,\cdots))^2$$ where $f$ is the parametric model.

The standard algorithm for this problem is Levenberg–Marquardt. It requires the Jacobian matrix of the model with respect to the parameters. https://en.wikipedia.org/wiki/Levenberg-marquardt_algorithm
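As a concrete sketch (assuming SciPy is available): `scipy.optimize.curve_fit` runs Levenberg–Marquardt when no bounds are given. The data below are synthetic, generated from the question's model with made-up true parameters, just to illustrate the call; the initial guess `p0` is an assumption, and a reasonable one matters because the problem is nonlinear in $d$.

```python
import numpy as np
from scipy.optimize import curve_fit

# Parametric model from the question: y = a*x + b*x^2 + c/x + sin(d*x)/x^2
def model(x, a, b, c, d):
    return a * x + b * x**2 + c / x + np.sin(d * x) / x**2

# Illustrative synthetic data (not the OP's points): sample the model
# with a=3, b=1, c=-6, d=4 and add a little noise
rng = np.random.default_rng(0)
x = np.linspace(-4.0, -0.5, 30)          # avoid x = 0, where the model is singular
y = model(x, 3.0, 1.0, -6.0, 4.0) + rng.normal(0.0, 0.02, x.size)

# With no bounds, curve_fit uses the Levenberg-Marquardt method ('lm');
# p0 must put d in the right basin since sin(d*x) makes the error multimodal
popt, pcov = curve_fit(model, x, y, p0=[1.0, 1.0, -1.0, 3.5])
print(np.round(popt, 2))   # should be close to (3, 1, -6, 4)
```

If convergence fails, a coarse scan over $d$ (as in the other answer) gives a better starting point.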


When the model is linear in some of the parameters (in your case it is linear in $a,b,c$), you can instead consider the auxiliary error function obtained by fixing the "nonlinear" parameters, fitting the resulting linear model, and taking the fitting residual.

In your case,

$$\epsilon(d)=\sum_{k=1}^n\left(z_k-\hat a(d)x_k-\hat b(d) x_k^2-\frac{\hat c(d)}{x_k}\right)^2$$ where $\hat a,\hat b,\hat c$ are obtained by linear least-squares fitting of $z_k:=y_k-\dfrac{\sin(dx_k)}{x_k^2}$.

Now the problem is reduced to a 1D minimization of $\epsilon$.
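A minimal sketch of this reduction (assuming NumPy/SciPy; the data are illustrative, sampled noiselessly from the model with assumed true parameters $a=3$, $b=1$, $c=-6$, $d=4$): for each trial $d$, fit $a,b,c$ by linear least squares, then minimize $\epsilon(d)$ over $d$ alone.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def model(x, a, b, c, d):
    return a * x + b * x**2 + c / x + np.sin(d * x) / x**2

# Illustrative noiseless data with true parameters a=3, b=1, c=-6, d=4
x = np.linspace(-4.0, -0.5, 30)
y = model(x, 3.0, 1.0, -6.0, 4.0)

A = np.column_stack([x, x**2, 1.0 / x])   # design matrix for the linear part

def eps(d):
    # subtract the d-dependent term, fit a, b, c linearly, return the SSQ
    z = y - np.sin(d * x) / x**2
    coef, *_ = np.linalg.lstsq(A, z, rcond=None)
    r = z - A @ coef
    return r @ r

# 1D minimization over d only; the bracket is assumed to come from a coarse scan
opt = minimize_scalar(eps, bounds=(3.5, 4.5), method='bounded')
print(opt.x)
```

The bracket `(3.5, 4.5)` is an assumption: in practice you would first scan $\epsilon(d)$ on a grid to locate the right basin, since $\epsilon$ can have several local minima in $d$.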


The model being $$y=ax+bx^2+\frac{c}{x}+\frac{\sin(dx)}{x^2}$$ it is nonlinear because of $d$ and you need at least reasonable estimates for the four parameters.

So, fix $d$ at a given value and define $t_i=\frac{\sin(dx_i)}{x_i^2}$. Fitting $y_i-t_i$ against $x_i$, $x_i^2$ and $\frac{1}{x_i}$ is then a multilinear regression without intercept, which is easy to solve.

So, for a given value of $d$, you obtain $a(d)$, $b(d)$, $c(d)$ and the objective $SSQ(d)$ that you want to minimize. Run it for different values of $d$ until you see a minimum; at that point, you are ready for the full nonlinear regression or optimization.

If you do not have access to a program for that, just zoom in more and more around the minimum.

Doing it with your data $$\left( \begin{array}{cc} x & y \\ -0.90 & 5 \\ -1.40 & 1.1 \\ -1.64 & 0.775 \\ -4.00 & 4 \\ -3.28 & 0.9 \\ -2.00 & 0.2 \\ -2.50 & -0.2 \\ -0.75 & 6.5 \\ -1.25 & 3.7 \end{array} \right)$$ the results of the preliminary step are $$\left( \begin{array}{cc} d & SSQ(d) \\ 0 & 2.35088 \\ 1 & 2.35565 \\ 2 & 2.00474 \\ 3 & 2.01157 \\ 4 & 1.77421 \\ 5 & 3.61193 \\ 6 & 6.27886 \end{array} \right)$$

For $d=4$, the parameters are $a=3.83$, $b=1.09$ and $c=-6.67$.
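The scan above can be reproduced with a short script (a sketch, assuming NumPy; last digits may differ slightly from the table depending on rounding):

```python
import numpy as np

# Data points from the question
x = np.array([-0.90, -1.40, -1.64, -4.00, -3.28, -2.00, -2.50, -0.75, -1.25])
y = np.array([5.0, 1.1, 0.775, 4.0, 0.9, 0.2, -0.2, 6.5, 3.7])

A = np.column_stack([x, x**2, 1.0 / x])   # regressors for a, b, c (no intercept)

def ssq(d):
    """Linear fit of y - t against the regressors for a fixed d; returns (coefficients, SSQ)."""
    t = np.sin(d * x) / x**2
    coef, *_ = np.linalg.lstsq(A, y - t, rcond=None)
    r = (y - t) - A @ coef
    return coef, r @ r

# Coarse scan over d, as in the table above
for d in range(7):
    coef, s = ssq(d)
    print(d, round(s, 5))

# At the best grid value d = 4, per the answer the coefficients
# should come out near a = 3.83, b = 1.09, c = -6.67
coef, s = ssq(4)
print(np.round(coef, 2))
```

From here, `d = 4` and these coefficients make a good starting point for the full four-parameter nonlinear fit.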