Given a set of points
$\{(x_0,y_0),(x_1,y_1),...,(x_n,y_n)\}$ where $0\leq x_i < 1$ and where the $y_i$ are noisy
what method can be used to find a smooth, monotonic, sigmoid-like function to model them?
More particularly, I seek to find $f:[0,1)\rightarrow \mathbb{R}$ that minimizes $\sum_{i=0}^n (f(x_i)-y_i)^2$ subject to:
1. Monotonicity: $x_a\leq x_b \implies f(x_a)\leq f(x_b)$
2. Horizontal end-points: $f'(0)=f'(1)=0$ (with $f'(1)$ understood as the one-sided limit, since the domain is $[0,1)$)
3. Smoothness: $f$ is at least twice continuously differentiable
4. Sigmoid shape: $f'$ is unimodal, i.e. $f$ is convex then concave
One possible approach I have in mind is to identify a set of basis functions that themselves comply with constraints 1, 2, and 3 above. Then any nonnegative linear combination of those basis functions must also comply with 1, 2, and 3 (an arbitrary linear combination would not: negative coefficients can break monotonicity). The regression then becomes a matter of finding the nonnegative combination of those basis functions that minimizes the squared error, a nonnegative-least-squares problem. This approach unfortunately does not in general enforce condition 4.
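As a sketch of this basis-function idea, here is one illustrative (not canonical) choice of family: clamped cubic "smoothstep" bumps, each of which is monotone, smooth, and has zero slope at $x=0$ and $x=1$, fitted with SciPy's nonnegative least-squares solver so the combination stays monotone. The knot placement and the synthetic data are assumptions for the demo.

```python
import numpy as np
from scipy.optimize import nnls

def smoothstep(t):
    # Clamped cubic: 0 for t <= 0, 1 for t >= 1, 3t^2 - 2t^3 in between.
    # Monotone non-decreasing with zero derivative at both ends.
    t = np.clip(t, 0.0, 1.0)
    return t * t * (3.0 - 2.0 * t)

def basis_matrix(x, knots):
    # One clamped smoothstep per consecutive knot pair (a, b) in [0, 1].
    # Each column is monotone on [0, 1] with zero slope at x = 0 and x = 1.
    cols = [smoothstep((x - a) / (b - a)) for a, b in zip(knots[:-1], knots[1:])]
    return np.column_stack(cols)

# Synthetic noisy sigmoid-ish data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 1.0, 200))
y = 1.0 / (1.0 + np.exp(-12.0 * (x - 0.5))) + rng.normal(0.0, 0.05, x.size)

knots = np.linspace(0.0, 1.0, 9)
# Two constant columns (+1 and -1) give a free-sign intercept even though
# all NNLS coefficients are constrained to be nonnegative.
A = np.column_stack([np.ones_like(x), -np.ones_like(x), basis_matrix(x, knots)])
coef, residual_norm = nnls(A, y)   # nonnegative coefficients preserve monotonicity
f_hat = A @ coef
```

Note that this sketch enforces constraints 1–3 by construction but, as observed above, nothing here forces the second derivative to change sign only once.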
Can you suggest a method or a suitable family of basis functions?
If you want a sigmoid-like function, then a good choice might be a sigmoid function! Let $$\sigma(x) = \frac{1}{1+e^{-x}}.$$ We can scale and shift it by introducing parameters to obtain $f(x;\alpha,\beta) = \sigma(e^\alpha x+\beta)$. The factor $e^\alpha$ is there to enforce that the function is increasing. This function satisfies your constraints, with one caveat: the end-point derivatives are small but not exactly zero, since $\sigma'>0$ everywhere. It has only 2 degrees of freedom to optimize over. (If your $y_i$ are not confined to $[0,1]$, add a vertical scale and offset, $\gamma\, f + \delta$ with $\gamma > 0$.) You should then focus on minimizing $$\operatorname{loss}(\alpha,\beta) = \sum_{i=0}^n\big(f(x_i;\alpha,\beta)-y_i\big)^2.$$ This can be solved efficiently using Gauss-Newton methods.
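A minimal sketch of this fit, using `scipy.optimize.least_squares` with `method="lm"` (Levenberg-Marquardt, a damped Gauss-Newton variant). The data below are synthetic and the starting point `[0.0, 0.0]` is an assumption; a poor starting point could land in a local minimum.

```python
import numpy as np
from scipy.optimize import least_squares

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def residuals(params, x, y):
    # f(x; alpha, beta) = sigmoid(e^alpha * x + beta); e^alpha > 0 keeps f increasing.
    alpha, beta = params
    return sigmoid(np.exp(alpha) * x + beta) - y

# Synthetic noisy data: true slope e^alpha = 10, true shift beta = -5.
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0.0, 1.0, 100))
y = sigmoid(10.0 * x - 5.0) + rng.normal(0.0, 0.05, x.size)

fit = least_squares(residuals, x0=[0.0, 0.0], args=(x, y), method="lm")
alpha_hat, beta_hat = fit.x
```

Because the monotonicity is built into the parametrization, no constraints are needed in the solver itself.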