Curve fitting on dataset


For my master's thesis I'm working on a subject that requires curve fitting. In the first part I handled everything with 12th-degree polynomial fits. But when I differentiate the position measurements to get the speeds, I get a curve that is hard to fit.

The curve looks like a sine wave, but it is much more pointed at the minima. Does anyone have an idea of a good polynomial or other function for this kind of curve?

[Image: my curve]

I have already tried a 12th-degree polynomial and various kinds of sine waves. Maybe it would be a good idea to combine a sine wave with a triangle wave?

EDIT: since people are advising me to use sine waves: the reason I don't is that I need to fit many datasets that are completely different from this one. I made another screenshot to show that sine waves are not a realistic option for me.

[Image: a dataset that would be much harder to fit with a sine wave]

BEST ANSWER

Here's a quick example of a change-point-type method. The model:

\begin{equation} y(x) = \begin{cases} \alpha_0 + \alpha_1x + \alpha_2x^2 & x \leq c \\ \beta_0 + \beta_1x + \beta_2x^2 & x > c \end{cases} \end{equation}

We have 6 regression parameters and a break-point parameter $c$, so 7 in total. However, we want this model to be continuous at $c$, so we should impose the constraint: \begin{equation} \alpha_0 + \alpha_1c + \alpha_2c^2 = \beta_0 + \beta_1c + \beta_2c^2 \end{equation}

Equivalently, $\alpha_0 = \beta_0 + (\beta_1-\alpha_1)c + (\beta_2-\alpha_2)c^2$, so the constraint determines $\alpha_0$ and we are left with 6 free parameters.
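As a quick numerical sanity check (using arbitrary illustrative values, not fitted parameters), substituting the constrained $\alpha_0$ makes the two quadratic pieces meet at $c$:

```r
# Arbitrary illustrative values (not fitted parameters)
a1 <- 7;   a2 <- -1               # alpha_1, alpha_2
b0 <- -395; b1 <- 70; b2 <- -2.3  # beta_0, beta_1, beta_2
cpt <- 10                         # change point c

# Intercept of the left piece, fixed by the continuity constraint
a0 <- b0 + (b1 - a1)*cpt + (b2 - a2)*cpt^2

left  <- a0 + a1*cpt + a2*cpt^2   # left piece evaluated at c
right <- b0 + b1*cpt + b2*cpt^2   # right piece evaluated at c
all.equal(left, right)            # TRUE: the pieces join continuously
```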

Here's a simulated data set which I will try this out on. [Image: simulated data]

I'm using R. If you try the $\texttt{nls}$ function, it may complain about identifiability, but we can do the fit with $\texttt{optim}$.

First, I write a simple model function that takes a vector of parameters and the $x$ data. Here, par = $[\alpha_1, \alpha_2, \beta_0, \beta_1, \beta_2, c]$.

 model <- function(par, x){
    # par = c(alpha1, alpha2, beta0, beta1, beta2, c)
    # Continuity constraint fixes the left-hand intercept:
    A0 <- par[3] + (par[4]-par[1])*par[6] + (par[5]-par[2])*par[6]^2
    ifelse(x <= par[6],
           A0     + par[1]*x + par[2]*x^2,   # left piece,  x <= c
           par[3] + par[4]*x + par[5]*x^2)   # right piece, x >  c
 }

Then we write a simple function to return the sum of squares. This is the function we want to minimize.

 sum_squares <- function(par, x, y){
    ss <- sum((y-model(par,x))^2)
    return(ss)
 }

Finally, we minimize the sum of squares with $\texttt{optim}$. Note that you will need to find good initial guesses to converge to the correct solution.

 #I found these initial values with a few minutes of guess and check.
 par0 <- c(7,-1,-395,70,-2.3,10)
 sol <- optim(par=par0, fn=sum_squares, x=x, y=y)$par
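Putting the pieces together, here's a self-contained sketch of the whole procedure. Since the original simulated data set isn't shown, the "true" parameters and noise level below are made up for illustration:

```r
set.seed(1)

model <- function(par, x){
   # par = c(alpha1, alpha2, beta0, beta1, beta2, c)
   A0 <- par[3] + (par[4]-par[1])*par[6] + (par[5]-par[2])*par[6]^2
   ifelse(x <= par[6],
          A0     + par[1]*x + par[2]*x^2,
          par[3] + par[4]*x + par[5]*x^2)
}

sum_squares <- function(par, x, y) sum((y - model(par, x))^2)

# Made-up "true" parameters for illustration
true_par <- c(7, -1, -395, 70, -2.3, 10)
x <- seq(0, 20, length.out = 200)
y <- model(true_par, x) + rnorm(length(x), sd = 2)

par0 <- c(5, -0.5, -300, 60, -2, 9)   # rough initial guess
fit  <- optim(par = par0, fn = sum_squares, x = x, y = y,
              control = list(maxit = 5000))
fit$par   # should land close to true_par
```

A usage note: the default Nelder-Mead method in $\texttt{optim}$ is derivative-free, which is convenient here because the sum of squares is not smooth in $c$ at the break point.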

When I plot my solution, this is what I get: [Image: fitted change-point model]

You may need to include third-order terms for your data, but this certainly avoids having to consider 12th-order polynomials! Plus the interpretability of the change point is a nice feature.

Best of luck!