Programmatic Cubic Regression


All,

Thanks in advance for your help. There are a lot of "low-hanging fruit" problems at work that I need to tackle as a tech-level employee. One of them is fitting 6 data points to a cubic curve, and the existing regression questions here don't quite address it.

The specific application is calibration of pressure sensors. First, the system calibration curve is set to $f(x) = x$. Next, data points are collected from $0$ to $x$ on the $x$-axis, where $x$ is the applied pressure; the 6 points are spaced $x/6$ units of pressure apart (the $y$-axis is the current the sensor provides at a given pressure $x$).

The question is, from a programming perspective, how do I get the coefficients $a, b, c, d$ of the curve $dx^3+cx^2+bx+a$ to which the data points are fitted? In other words, what is a simple algorithm or pseudo-code for the operation?

Again, thanks for your help!


There are 2 answers below.


If you tabulate your data in Excel and plot it, you can ask it to Add Trendline, select a degree 3 polynomial, and show trendline equation. That is probably the least work to get to a solution. I don't know how to read the coefficients off in machine-readable form.

Any numerical analysis text will have a whole section on this. I like chapter 15 of Numerical Recipes; obsolete editions are free online. You have a linear least-squares problem, since your equation depends linearly on the parameters (the coefficients of your cubic), even if nonlinearly on $x$.
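Because the problem is linear least squares, a library routine can do all the work. As a sketch, NumPy's `polyfit` fits a degree-3 polynomial directly (the data values below are made up for illustration):

```python
import numpy as np

# Six (pressure, current) calibration points; values are illustrative only.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.1, 1.3, 4.2, 9.1, 16.5, 25.8])

# Fit y ≈ d*x^3 + c*x^2 + b*x + a.
# polyfit returns coefficients highest degree first: [d, c, b, a].
d, c, b, a = np.polyfit(x, y, deg=3)
print(a, b, c, d)
```

This is the one-line answer if a numerical library is available; the next answer shows what such a routine does under the hood.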


Our goal is to find a function of the form $a+bx+cx^2+dx^3$ that gives us the least-squares approximation to a set of data points $\{(x_1,y_1),(x_2,y_2),\dots,(x_n,y_n)\}$.

Let $A$ be the matrix
$$ A=\begin{bmatrix} 1 & x_1 & {x_1}^2 & {x_1}^3\\ 1 & x_2 & {x_2}^2 & {x_2}^3\\ \vdots & \vdots & \vdots & \vdots \\ 1 & x_n & {x_n}^2 & {x_n}^3 \end{bmatrix} $$
let $z$ be the column vector
$$ z = \begin{bmatrix} a\\ b\\ c\\ d \end{bmatrix} $$
and let $y$ be the column vector
$$ y = \begin{bmatrix} y_1\\ y_2\\ \vdots\\ y_n \end{bmatrix} $$
Then (assuming at least four of the $x_i$ are distinct), the best-fit curve corresponds to the unique solution for $z$ of the normal equations
$$ A^T A\, z = A^T y $$
where $A^T$ is the transpose of $A$. That is,
$$ \begin{bmatrix} a\\ b\\ c\\ d \end{bmatrix} = (A^T A)^{-1} A^T y $$
We could make this run a little faster by using the Cholesky decomposition of the matrix $A^T A$ (see link in the comment below), since for your particular problem $A^T A$ is symmetric positive definite whenever at least four of the $x_i$ are distinct. Besides that, you can never go terribly wrong solving a 4-by-4 system with Gaussian elimination.
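A direct translation of the normal-equations recipe above might look like the following sketch (NumPy is assumed; the function name is my own):

```python
import numpy as np

def cubic_fit(x, y):
    """Least-squares cubic fit via the normal equations A^T A z = A^T y.

    Returns the coefficient vector z = [a, b, c, d] of a + b*x + c*x^2 + d*x^3.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix with columns 1, x, x^2, x^3 (the matrix A above).
    A = np.column_stack([np.ones_like(x), x, x**2, x**3])
    # Solve the 4-by-4 normal equations by Gaussian elimination (LU),
    # which is what np.linalg.solve does internally.
    return np.linalg.solve(A.T @ A, A.T @ y)
```

In production code one would usually prefer `np.linalg.lstsq(A, y)`, which solves the same least-squares problem without explicitly forming $A^T A$ (forming it squares the condition number), but for a 4-coefficient fit of 6 points the normal equations are perfectly adequate.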