Polynomial fitting where polynomial must be monotonically increasing


Given a set of monotonically increasing data points (in 2D), I want to fit a polynomial to the data which is monotonically increasing over the domain of the data. If the highest x value is 100, I don't care what the slope of the polynomial is at x=101.

I'd also like to keep the degree of the polynomial below 7. My data usually has about 20 (x,y) pairs. Least squares fitting is probably best, but I'm open to other techniques.

I was thinking that maybe I could create a b-spline, sample that b-spline (generating, say, another 100 data points), then perform a polynomial fit. Does that approach sound sound?
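As a quick sanity check of that densify-then-fit idea, here is a minimal pure-Python sketch (all names are illustrative, not from any library): piecewise-linear interpolation stands in for the b-spline, the resampled points are fit by ordinary least squares via the normal equations. Note that nothing in this pipeline *guarantees* the fitted polynomial is monotone; it merely tends to be when the densified data are smooth and increasing.

```python
# Sketch of the "densify, then fit" idea from the question.  A piecewise-
# linear interpolant stands in for the b-spline.  Pure Python, no deps.

def interp(xs, ys, x):
    """Piecewise-linear interpolation (stand-in for the b-spline)."""
    for a, b, ya, yb in zip(xs, xs[1:], ys, ys[1:]):
        if x <= b:
            return ya + (yb - ya) * (x - a) / (b - a)
    return ys[-1]

def solve(A, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(A)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [u - f * v for u, v in zip(M[r], M[c])]
    out = [0.0] * n
    for r in range(n - 1, -1, -1):
        out[r] = (M[r][n] - sum(M[r][k] * out[k]
                                for k in range(r + 1, n))) / M[r][r]
    return out

def polyfit(xs, ys, d):
    """Least-squares fit of degree d via the normal equations."""
    V = [[x**k for k in range(d + 1)] for x in xs]
    VtV = [[sum(row[i] * row[j] for row in V) for j in range(d + 1)]
           for i in range(d + 1)]
    Vty = [sum(V[r][i] * ys[r] for r in range(len(xs)))
           for i in range(d + 1)]
    return solve(VtV, Vty)

def polyval(p, x):
    return sum(c * x**k for k, c in enumerate(p))

# 20 monotone data points, resampled to 100 points, then a degree-5 fit.
xs = [i / 19 for i in range(20)]
ys = [x**3 + x for x in xs]
dense_x = [i / 99 for i in range(100)]
dense_y = [interp(xs, ys, x) for x in dense_x]
p = polyfit(dense_x, dense_y, 5)
max_err = max(abs(polyval(p, x) - y) for x, y in zip(xs, ys))
```

On smooth increasing data like this the fit stays close to the original points; for noisy data the monotonicity of the result would still need to be checked (or enforced, as the answers below discuss).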

There are 5 answers below.

BEST ANSWER (score 4)

I think the problem is not precisely stated: it's unclear what "minimize the degree of the polynomial, while fitting fairly tightly to the data" means (what is "fairly tightly"?).

Anyway, for fixed degree, this can be formulated as a semidefinite program, which can be solved approximately in polynomial time; there is a lot of software out there that solves semidefinite programs very efficiently, e.g., SeDuMi.

Indeed, suppose your data points are $(x_1,y_1), (x_2,y_2), \ldots, (x_n,y_n)$. Let $y$ be the vector that stacks up the $y_i$ and, for a fit of degree $m$, let $V$ be the Vandermonde matrix $$V = \begin{pmatrix} 1 & x_1 & x_1^2 & \cdots& x_1^m \\ 1 & x_2 & x_2^2 & \cdots & x_2^m \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^m \end{pmatrix}.$$ Then you'd like to minimize $||y-Vp||_2^2$ subject to the constraint that the entries $p_0, p_1, \ldots, p_m$ of the vector $p$ are the coefficients of a polynomial that is monotonic on your domain $[a,b]$, or, in other words, $p'(x) \geq 0$ on $[a,b]$: $$ p_1 + 2p_2 x + 3 p_3 x^2 + \cdots + m p_m x^{m-1} \geq 0 ~~~ \mbox{ for all } x \in [a,b].$$ This last constraint can be written as a semidefinite program, as outlined in these lecture notes. I will briefly outline the idea.
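Concretely, setting up $V$ and the unconstrained least-squares part is routine; here is a minimal pure-Python sketch (helper names are illustrative). The monotonicity constraint is deliberately left out, since enforcing it is exactly what the SDP machinery described next is for.

```python
# Build the Vandermonde matrix V for a degree-d fit and solve the
# UNCONSTRAINED least-squares problem min ||y - V p||_2^2 via the normal
# equations (V^T V) p = V^T y.  Pure Python, no dependencies.

def vandermonde(xs, d):
    """Rows [1, x, x^2, ..., x^d] for each sample x."""
    return [[x**k for k in range(d + 1)] for x in xs]

def solve(A, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(A)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [u - f * v for u, v in zip(M[r], M[c])]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k]
                              for k in range(r + 1, n))) / M[r][r]
    return x

def ls_polyfit(xs, ys, d):
    V = vandermonde(xs, d)
    # Normal equations: (V^T V) p = V^T y
    VtV = [[sum(row[i] * row[j] for row in V) for j in range(d + 1)]
           for i in range(d + 1)]
    Vty = [sum(V[r][i] * ys[r] for r in range(len(xs)))
           for i in range(d + 1)]
    return solve(VtV, Vty)

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 0.9, 2.1, 2.9, 4.1]
p = ls_polyfit(xs, ys, 1)        # degree-1 fit: close to y = x
```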

A univariate polynomial which is nonnegative on an interval has a representation in terms of sums of squares of polynomials. In particular, if $p'(x) \geq 0$ on $[a,b]$, and its degree $d$ is, say, even, then it can be written as $$p'(x) = s(x) + (x-a)(b-x) t(x),~~~~~~~~(1)$$ where $s(x), t(x)$ are sums of squares of polynomials (this is Theorem 6 in the above-linked lecture notes; a similar formula is available for odd degree). The condition that a polynomial $s(x)$ is a sum of squares is equivalent to saying there is a positive semidefinite matrix $Q$ such that $$ s(x) = \begin{pmatrix} 1 & x & \cdots & x^{d/2} \end{pmatrix} Q \begin{pmatrix} 1 \\ x \\ x^2 \\ \vdots \\ x^{d/2} \end{pmatrix}.$$ This is Lemma 3 in the same lecture notes.

Putting it all together, write $z(x)$ for a column vector of monomials $(1, x, \ldots)^T$ of the appropriate length. What we optimize over are the entries of the matrices $Q_1, Q_2$ that make $s(x) = z(x)^T Q_1 z(x)$ and $t(x) = z(x)^T Q_2 z(x)$ sums of squares, which means imposing the constraints $Q_1 \succeq 0, Q_2 \succeq 0$. Then Eq. (1) is a linear constraint on the entries of $Q_1, Q_2$. This gives you a semidefinite program you can feed to your SDP solver.
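The Gram-matrix fact in Lemma 3 is easy to verify numerically. A small sketch, assuming nothing beyond the lemma itself: build $Q$ as a sum of outer products $vv^T$ (hence positive semidefinite by construction) and check that the quadratic form in the monomial vector reproduces an explicit sum of squares, and is therefore nonnegative everywhere.

```python
# Lemma 3 in miniature: if Q is PSD, then z(x)^T Q z(x) with
# z(x) = (1, x, ..., x^{d/2}) is a sum of squares, hence nonnegative.
# Here Q = sum of v v^T over coefficient vectors v of the squared terms.

def gram_eval(Q, x):
    """Evaluate z(x)^T Q z(x) for the monomial vector z(x)."""
    m = [x**k for k in range(len(Q))]
    return sum(Q[i][j] * m[i] * m[j]
               for i in range(len(Q)) for j in range(len(Q)))

# s(x) = (1 + x)^2 + x^2: squares with coefficient vectors (1,1) and (0,1).
vs = [(1.0, 1.0), (0.0, 1.0)]
Q = [[sum(v[i] * v[j] for v in vs) for j in range(2)] for i in range(2)]
# Q is PSD by construction; the Gram form matches the explicit SOS:
for x in [-2.0, -0.5, 0.0, 1.0, 3.0]:
    sos = sum((v[0] + v[1] * x)**2 for v in vs)
    assert abs(gram_eval(Q, x) - sos) < 1e-9
    assert gram_eval(Q, x) >= 0.0
```

In the actual SDP one searches over $Q$ subject to $Q \succeq 0$ plus the linear constraints from Eq. (1), rather than constructing it by hand as here.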

ANSWER (score 1)

Well, it is definitely not the best answer from a practical point of view; however, it does two things. First, it proves the existence of such a polynomial. Second, by working out some constants, one can try to get an upper bound on the degree.

Suppose we are given $x_0<x_1<\cdots<x_n$ and $y_0<y_1<\cdots<y_n$, and we are looking for an increasing polynomial such that $P(x_i)=y_i,$ $0\le i\le n.$ It is easy to construct a $C^1[x_0,x_n]$ function $f$ such that $f(x_i)=y_i$ and $f'(x)\ge\delta>0$ for some real $\delta$ (all norms below are sup norms on $[x_0,x_n]$). For each $\varepsilon > 0$ there exists a polynomial $P_{\varepsilon}$ such that $$\|f'-P_{\varepsilon}\|\le \varepsilon.$$ Consider $$Q_{\varepsilon}(x)=y_0+\int_{x_0}^{x}P_{\varepsilon}(t)\,dt.$$ Then $\|f-Q_{\varepsilon}\|\le \varepsilon (x_n-x_0).$

Now correct the interpolation error. For $z=(z_0,\ldots,z_n)$, let $R(z)$ denote the Lagrange polynomial of degree $\le n$ with $R(z)(x_i)=z_i.$ The map $z\mapsto R(z)'$ is a linear operator between finite-dimensional spaces, so there is a constant $C$ such that $\|R(z)'\|\le C\max_k |z_k|.$ Let $R_{\varepsilon}=R(y_0-Q_{\varepsilon}(x_0),\ldots,y_n-Q_{\varepsilon}(x_n));$ then $\|R_{\varepsilon}'\|\le C\varepsilon(x_n-x_0).$

Choose $\varepsilon > 0$ such that $C(x_n-x_0)\varepsilon+\varepsilon \le \delta.$ Since $Q_{\varepsilon}'=P_{\varepsilon}\ge f'-\varepsilon\ge\delta-\varepsilon,$ the polynomial $R_{\varepsilon}+Q_{\varepsilon}$ interpolates the data and its derivative is at least $\delta-\varepsilon-C\varepsilon(x_n-x_0)\ge 0,$ so it satisfies all the requirements.

ANSWER (score 2)

Ignoring the question of minimizing the degree, it is certainly possible.

One can find a positive function $f$ such that $F(x) = y_1 + \int_{x_1}^x f(t)\,dt$ passes through (or near) the data points. Choose some positivity-preserving approximation scheme, such as Bernstein polynomials, to get a positive polynomial $p'$ within any desired distance of $f$ on an interval containing all the $x_i$. Integrate to produce $p$ (with the integration constant chosen so that $p(x_1)$ is close to $y_1$); since $p'$ is positive, $p$ is monotonically increasing and approximates the data points to the given accuracy.
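A small pure-Python sketch of this construction on $[0,1]$ (the function $f$ and all helper names here are illustrative): the Bernstein polynomial of a positive function is itself positive, because the Bernstein basis functions are nonnegative and the coefficients are the sampled values $f(k/n)$; integrating it then gives an increasing polynomial.

```python
# Positivity-preserving approximation via Bernstein polynomials, followed
# by integration to get a monotonically increasing polynomial p.
from math import comb

def bernstein_coeffs(f, n):
    """Values f(k/n) weighting the Bernstein basis on [0, 1]."""
    return [f(k / n) for k in range(n + 1)]

def bernstein_eval(c, x):
    n = len(c) - 1
    return sum(c[k] * comb(n, k) * x**k * (1 - x)**(n - k)
               for k in range(n + 1))

def integral_eval(c, x, steps=1000):
    # p(x) = integral_0^x B_n f(t) dt, via a fine midpoint rule for brevity;
    # in practice one would expand B_n f in monomials and integrate exactly.
    h = x / steps
    return h * sum(bernstein_eval(c, (i + 0.5) * h) for i in range(steps))

f = lambda t: 1.0 + t * t          # a positive stand-in for f on [0, 1]
c = bernstein_coeffs(f, 8)
assert all(v > 0 for v in c)       # positive samples => positive B_8 f
# p' = B_8 f stays positive, so p is increasing:
xs = [0.1 * i for i in range(11)]
ps = [integral_eval(c, x) for x in xs]
assert all(b > a for a, b in zip(ps, ps[1:]))
```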

ANSWER (score 7)

I'm presuming that since you said "fit", the polynomial doesn't have to pass through your given points.

I would say that least squares is indeed good enough for your purposes. You can proceed stepwise (quite easy if you fit using orthogonal polynomials) and either monitor your polynomials graphically (plot each fit against the original points) or monitor quantities like the (adjusted) $R^2$ as you increase the degree of your fitting polynomial.
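A minimal sketch of that stepwise loop (pure Python, with plain normal equations standing in for the orthogonal-polynomial update; all names illustrative): refit with increasing degree, track $R^2$, and check monotonicity of each fit on a grid over the data range.

```python
# Stepwise fitting: increase the degree, monitor R^2, and check whether
# the fitted polynomial is monotone on a grid spanning the data.

def solve(A, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(A)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [u - f * v for u, v in zip(M[r], M[c])]
    out = [0.0] * n
    for r in range(n - 1, -1, -1):
        out[r] = (M[r][n] - sum(M[r][k] * out[k]
                                for k in range(r + 1, n))) / M[r][r]
    return out

def fit_and_score(xs, ys, d):
    """Degree-d least-squares fit; returns (R^2, monotone_on_grid)."""
    V = [[x**k for k in range(d + 1)] for x in xs]
    VtV = [[sum(row[i] * row[j] for row in V) for j in range(d + 1)]
           for i in range(d + 1)]
    Vty = [sum(V[r][i] * ys[r] for r in range(len(xs)))
           for i in range(d + 1)]
    p = solve(VtV, Vty)
    pv = lambda x: sum(c * x**k for k, c in enumerate(p))
    mean = sum(ys) / len(ys)
    r2 = 1 - (sum((y - pv(x))**2 for x, y in zip(xs, ys))
              / sum((y - mean)**2 for y in ys))
    grid = [xs[0] + (xs[-1] - xs[0]) * i / 200 for i in range(201)]
    mono = all(pv(b) >= pv(a) for a, b in zip(grid, grid[1:]))
    return r2, mono

xs = [i / 4 for i in range(21)]            # 21 points on [0, 5]
ys = [x + 0.2 * x * x for x in xs]         # increasing, quadratic trend
results = [fit_and_score(xs, ys, d) for d in (1, 2, 3)]
```

On this data $R^2$ jumps to (numerically) 1 at degree 2 and stays there, which is the signal to stop increasing the degree.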

ANSWER (score 0)

If there is no actual constraint that the function must be a polynomial, you can adopt a trend filtering approach. See the python library trendfilter (which I wrote). Rather than controlling smoothness through the degree of a polynomial (a parametric approach), you do so directly with regularization coefficients. Effectively, you are saying: I want a smooth function unless there is enough evidence in the data telling me that it must be less smooth. And you can set monotonic to True and it will force the first derivative to be non-negative everywhere.

def trend_filter(x, y, y_err=None, alpha_2=0.0,
                 alpha_1=0.0, alpha_0=0.0, l_norm=2,
                 constrain_zero=False, monotonic=False,
                 return_function=False):
    ...