Best regression model for points that follow a sigmoidal pattern


I have the following list of points :

enter image description here

I'm trying to find the best regression model to fit these points.

A logistic regression does not fit the points closely enough:

enter image description here

I guess I need something closer to a spline, but I don't know how to compute a regression model based on a spline; all I can find are interpolation models.

Also, I would like to be able to compute the derivative of the regression. With a spline interpolation, I don't know how to compute the derivative with respect to x so that it is expressed as a function of one variable.

For context, this is in order to build a tool for acid-base titration for chemistry.

Thanks !


There are 2 best solutions below

3
On BEST ANSWER

Fit of a model that combines a linear term and a logistic term (blue curve):

enter image description here

Since the data were not provided in numerical form, the points (in red) were obtained by scanning the pixels of the figure attached to the question. This is not accurate, so the above results are only rough approximations.
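For readers who want to try this kind of fit themselves, here is a minimal sketch. The exact equation used in the figure above is not recoverable from the text, so the model form `a + b*x + c / (1 + exp(-(x - x0)/s))`, the parameter names, and the synthetic data are all assumptions made for illustration. Since the model is a closed-form expression, its derivative is available analytically as a function of x alone.

```python
import numpy as np
from scipy.optimize import curve_fit


def linear_plus_logistic(x, a, b, c, x0, s):
    """Assumed model: linear baseline a + b*x plus a logistic step of height c at x0."""
    return a + b * x + c / (1.0 + np.exp(-(x - x0) / s))


def linear_plus_logistic_derivative(x, a, b, c, x0, s):
    """Analytic dy/dx of the assumed model."""
    sig = 1.0 / (1.0 + np.exp(-(x - x0) / s))
    return b + (c / s) * sig * (1.0 - sig)


# Synthetic titration-like data (made up for illustration, NOT the OP's points).
rng = np.random.default_rng(0)
x_data = np.linspace(0.0, 20.0, 41)
true_params = (3.0, 0.05, 7.0, 10.0, 0.4)
y_data = linear_plus_logistic(x_data, *true_params) + rng.normal(0.0, 0.05, x_data.size)

# Rough starting values read off the data:
# baseline, slope, jump height, midpoint (steepest finite-difference slope), width.
p0 = (
    y_data.min(),
    0.0,
    y_data.max() - y_data.min(),
    x_data[np.argmax(np.gradient(y_data, x_data))],
    1.0,
)
params, _ = curve_fit(linear_plus_logistic, x_data, y_data, p0=p0)

# Location of the steepest slope of the fitted curve; for a titration curve
# this is commonly taken as the equivalence point.
x_fine = np.linspace(x_data.min(), x_data.max(), 2001)
slope = linear_plus_logistic_derivative(x_fine, *params)
x_equivalence = x_fine[np.argmax(slope)]

print("fitted parameters:", params)
print("steepest slope at x =", x_equivalence)
```

With real data, replace `x_data` and `y_data` with the measured values; the starting guess `p0` matters, because a poor guess can send the optimizer to a bad local minimum.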

IN ADDITION

A variant in which $y$ tends to a constant for large $x$:

enter image description here

0
On

Generally, I think you might have to try different sigmoid curves and find the best fit among them. It looks like the curve is symmetric about its inflection point, so that rules out the Gompertz curve. That leaves open the logistic curve or the hyperbolic tangent, among others.
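As a rough sketch of what "try different curves and compare" could look like, the snippet below fits three four-parameter sigmoids to the same data and ranks them by the sum of squared residuals. The parameterisations, the function names and the synthetic data are assumptions for illustration. The hyperbolic tangent is not listed separately because $\tanh(z) = 2/(1+e^{-2z}) - 1$ is just a rescaled logistic, so with free offset and scale both give the same fit; an arctangent sigmoid stands in for the "others".

```python
import numpy as np
from scipy.optimize import curve_fit


def logistic(x, lo, hi, x0, s):
    """Symmetric sigmoid rising from lo to hi, centred at x0."""
    return lo + (hi - lo) / (1.0 + np.exp(-(x - x0) / s))


def gompertz(x, lo, hi, x0, s):
    """Asymmetric sigmoid rising from lo to hi, with its inflection point at x0."""
    return lo + (hi - lo) * np.exp(-np.exp(-(x - x0) / s))


def arctan_sigmoid(x, lo, hi, x0, s):
    """Symmetric sigmoid with heavier tails than the logistic."""
    return lo + (hi - lo) * (0.5 + np.arctan((x - x0) / s) / np.pi)


# Synthetic symmetric data (illustrative only, NOT the OP's points).
rng = np.random.default_rng(1)
x = np.linspace(0.0, 20.0, 41)
y = logistic(x, 3.0, 10.0, 10.0, 0.8) + rng.normal(0.0, 0.05, x.size)

# Shared starting guess: lower plateau, upper plateau, steepest point, unit width.
p0 = (y.min(), y.max(), x[np.argmax(np.gradient(y, x))], 1.0)

candidates = {"logistic": logistic, "gompertz": gompertz, "arctan": arctan_sigmoid}
sse = {}
for name, model in candidates.items():
    popt, _ = curve_fit(model, x, y, p0=p0, maxfev=10000)
    sse[name] = float(np.sum((y - model(x, *popt)) ** 2))

# All candidates have four parameters, so the raw SSE values are comparable.
for name, value in sorted(sse.items(), key=lambda kv: kv[1]):
    print(f"{name:10s} SSE = {value:.4f}")
```

If the candidates had different numbers of parameters, a criterion that penalises extra parameters would be a fairer comparison than the raw sum of squares.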

To build a spline, all you need as input are the x and y coordinates.

A cubic spline matches the interpolated function value at each knot and keeps the first and second derivatives continuous across the boundary between adjacent regions:

$A_ix_{i+1}^3+B_ix_{i+1}^2+C_ix_{i+1}+D_i =A_{i+1}x_{i+1}^3+B_{i+1}x_{i+1}^2+C_{i+1}x_{i+1}+D_{i+1}=y_{i+1} $

$3A_ix_{i+1}^2+2B_ix_{i+1}+C_i=3A_{i+1}x_{i+1}^2+2B_{i+1}x_{i+1}+C_{i+1}$

$6A_ix_{i+1}+2B_i=6A_{i+1}x_{i+1}+2B_{i+1}$

This leads to a system of linear equations for the coefficients. The x,y values alone are insufficient to fully determine the spline: with $n$ intervals there are $4n$ coefficients, but the conditions above supply only $4n-2$ equations. To close the system, you need to specify the function and/or derivative values that the spline takes at the outer boundaries. These might be selected to minimize the error or to fit reasonable assumptions. For example, is the first derivative expected to be zero at large enough values?
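In practice you rarely solve this system by hand. As a minimal sketch, assuming SciPy is available, `scipy.interpolate.CubicSpline` takes the boundary conditions through its `bc_type` argument and returns an object whose derivative is again a piecewise polynomial in x alone. The sample data here are synthetic and only for illustration.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Synthetic sigmoidal samples (illustrative only, NOT the OP's data).
x = np.linspace(0.0, 20.0, 21)
y = 3.0 + 7.0 / (1.0 + np.exp(-(x - 10.0) / 0.8))

# "natural": second derivative is zero at both ends.
natural = CubicSpline(x, y, bc_type="natural")

# Clamped: first derivative fixed to zero at both ends,
# i.e. the curve is assumed flat far from the jump.
clamped = CubicSpline(x, y, bc_type=((1, 0.0), (1, 0.0)))

# derivative() returns another piecewise polynomial, a function of x only.
d_natural = natural.derivative()
d_clamped = clamped.derivative()

x_fine = np.linspace(x[0], x[-1], 2001)
steepest = x_fine[np.argmax(d_clamped(x_fine))]

print("clamped end slopes:", d_clamped(x[0]), d_clamped(x[-1]))
print("natural end curvatures:", natural(x[0], 2), natural(x[-1], 2))
print("steepest slope near x =", steepest)
```

Note that this is interpolation: the curve passes through every point, noise included. For noisy measurements, a smoothing spline such as `scipy.interpolate.UnivariateSpline` with a positive smoothing factor `s` behaves more like a regression, and it also provides a `derivative()` method.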