Approximating a sigmoid curve from noisy data


Given a set of points

$\{(x_0,y_0),(x_1,y_1),...,(x_n,y_n)\}$ where $0\leq x_i < 1$ and where the $y_i$ are noisy

what method can be used to find a smooth, monotonic, sigmoid-like function to model them?

More particularly, I seek to find $f:[0,1)\rightarrow \mathbb{R}$ that minimizes $\sum_{i=0}^n (f(x_i)-y_i)^2$ subject to:

  1. Monotonicity: $x_a\leq x_b \implies f(x_a)\leq f(x_b)$
  2. Horizontal end-points: $f'(0)=f'(1)=0$
  3. "Smoothness constraints"
  4. Sigmoid: $f''$ is unimodal

One possible approach I have in mind is to identify a set of basis functions that themselves comply with constraints 1, 2, and 3 above. Any nonnegative linear combination of those basis functions then also complies with 1, 2, and 3 (nonnegative coefficients are needed to preserve monotonicity). The regression then becomes a matter of finding the nonnegative combination of those basis functions that minimizes the squared error, a nonnegative linear-least-squares problem. This approach unfortunately does not in general enforce condition 4.
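As an illustrative sketch of that idea (the smoothstep ramps, their centers, and widths are my own choices, not anything canonical): build a design matrix whose columns are smooth increasing ramps, then solve with nonnegative least squares so the fitted combination stays monotonic. As noted, nothing here enforces condition 4.

```python
import numpy as np
from scipy.optimize import nnls

def smoothstep(t):
    # Cubic smoothstep: 0 for t <= 0, 1 for t >= 1, smooth and increasing between.
    t = np.clip(t, 0.0, 1.0)
    return t * t * (3.0 - 2.0 * t)

def basis_matrix(x, centers, width):
    # Each column is a smooth, nondecreasing ramp of the given width centered
    # at c; any nonnegative combination of the columns is also nondecreasing.
    return np.column_stack([smoothstep((x - c) / width + 0.5) for c in centers])

# Synthetic noisy sigmoid data (illustrative only).
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0.0, 1.0, 300))
y = 1.0 / (1.0 + np.exp(-(12.0 * x - 6.0))) + rng.normal(0.0, 0.05, x.size)

centers = np.linspace(0.1, 0.9, 9)   # ramp centers (an arbitrary grid)
A = basis_matrix(x, centers, width=0.2)
coef, _ = nnls(A, y)                 # nonnegative LS preserves monotonicity
y_hat = A @ coef
```

Because `x` is sorted and every column of `A` is nondecreasing with nonnegative weights, `y_hat` is guaranteed nondecreasing, whatever the noise does.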

Can you suggest a method or a suitable family of basis functions?


If you want a sigmoid-like function, then a good choice might be a sigmoid function! Let $$\sigma(x) = \frac{1}{1+e^{-x}}.$$ We can scale and shift it by introducing parameters to obtain $f(x;\alpha,\beta) = \sigma(e^\alpha x+\beta)$, where the factor $e^\alpha$ enforces that the function is increasing. This function is smooth, increasing, and sigmoid-shaped (the endpoint derivatives are small rather than exactly zero when the slope is steep), and it has only 2 degrees of freedom to optimize over. You should then focus on minimizing $$\mathrm{loss}(\alpha,\beta) = \sum_{i=0}^n\bigl(f(x_i;\alpha,\beta)-y_i\bigr)^2.$$ This can be solved efficiently using Gauss-Newton methods.
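A minimal sketch of that fit, using SciPy's `least_squares` with `method="lm"` (Levenberg-Marquardt, a damped Gauss-Newton iteration) as a stand-in for a hand-rolled Gauss-Newton solver; the synthetic data and starting point are my own assumptions:

```python
import numpy as np
from scipy.optimize import least_squares

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def f(x, alpha, beta):
    # f(x; alpha, beta) = sigma(e^alpha * x + beta); e^alpha > 0 keeps f increasing.
    return sigmoid(np.exp(alpha) * x + beta)

def residuals(params, x, y):
    alpha, beta = params
    return f(x, alpha, beta) - y

# Synthetic noisy data from a known sigmoid (true slope 10, offset -5).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 200)
y = sigmoid(10.0 * x - 5.0) + rng.normal(0.0, 0.05, x.size)

# Levenberg-Marquardt on the residual vector, i.e. damped Gauss-Newton.
fit = least_squares(residuals, x0=[1.0, -2.0], args=(x, y), method="lm")
alpha_hat, beta_hat = fit.x
```

Here `np.exp(alpha_hat)` recovers the slope and `beta_hat` the offset of the underlying sigmoid, up to noise.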