Algorithm to approximate the closest nonlinear formula (function) for an arbitrary set of points?


I have a table of XY points, where X is the velocity and Y is the real-world speed. The points are not linear. With two points it is easy to estimate the formula by trial and error, but with more than two points I get stuck. I can approximate a few points, but then another point ends up far from the approximation (around 25 km/h off; 1 to 15 is okay, but 25 is too much). When I adjust the formula by trial and error to fix that point, the same thing happens with another point, so I keep adjusting the formula in an endless loop.

What is the best way to achieve what I need? Are there any programs available for this? Or algorithms which I can apply to my set of points?

An example of my data points is (however the ratios [vel_2/vel_1] can vary from this example):

Vehicle Name|Real Max Speed |Max Velocity  
------------+---------------+-------------
Bike        |35             |0.154487  
Honda NSX   |180            |0.548791  
Airplane    |500            |1.789841

The points are always sorted from low to high in both columns: a lower real speed always means a lower velocity, so each velocity is guaranteed to be lower than the next one.

Best answer

What you are looking for is simple nonlinear regression. The choice of a particular method depends on a few aspects of your $f : \mathbb{R} \rightarrow \mathbb{R}$. The first is whether:

  • you want to build a simple function which can be easily analyzed and written down or
  • you want to build a function which you can just use (but can be hard to understand)

In the first case you should select a particular class of functions that seems to fit your data. When you plot your points you should be able to see whether they form a linear, logarithmic, polynomial, or exponential curve. Once you have selected the class, you can perform the actual regression to find the best parameters, using one of the many existing methods and algorithms (Ordinary Least Squares, Steepest Descent, etc.).
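As a rough illustration of this first approach, here is a minimal Python sketch that assumes a power-law class $y = a \cdot x^b$ (my assumption, not something the data proves) and fits it by Ordinary Least Squares on the logarithms of the three example points from the question. The function name `fit_power_law` is made up for this example. It only needs the standard library.

```python
import math

def fit_power_law(xs, ys):
    """Fit y = a * x**b by ordinary least squares on (log x, log y).

    Taking logs turns the power law into a straight line:
    log y = log a + b * log x, so the usual closed-form OLS applies.
    All x and y values must be strictly positive.
    """
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    b = sxy / sxx                        # slope of the log-log line
    a = math.exp(mean_y - b * mean_x)    # intercept, mapped back from log space
    return a, b

# Example points from the question: velocity (X) and real speed (Y).
velocity = [0.154487, 0.548791, 1.789841]
speed = [35.0, 180.0, 500.0]

a, b = fit_power_law(velocity, speed)
print(f"speed ~= {a:.2f} * velocity ** {b:.4f}")

# Print the error at each point to judge whether this class is a good choice.
for v, s in zip(velocity, speed):
    predicted = a * v ** b
    print(f"velocity={v:<9} real={s:<6} predicted={predicted:8.2f} "
          f"error={predicted - s:+.2f}")
```

If the printed errors are too large for your purposes, that is a sign the chosen class does not match the data well, and another class (polynomial, logarithmic, exponential) may be worth trying in the same way. Note that fitting in log space minimizes relative rather than absolute error.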

In the second case, when your only concern is to have some $f$, no matter how complicated it gets, you can use any machine learning regression model, such as Support Vector Regression or a feed-forward neural network, and train it on your data. The only problem with this approach is that it requires tuning some model-specific parameters to achieve good results.
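To give a feel for this second approach without pulling in a library, the sketch below uses Nadaraya-Watson kernel regression. This is a simpler nonparametric model that I am substituting for Support Vector Regression or a neural network, purely because it fits in a few lines of standard-library Python. The function name `kernel_regression` and the `bandwidth` values are illustrative assumptions. The bandwidth plays the role of the model parameter that has to be tuned.

```python
import math

def kernel_regression(x_query, xs, ys, bandwidth):
    """Nadaraya-Watson estimate: a Gaussian-weighted average of the known ys.

    Points whose x is close to x_query get large weights; the bandwidth
    controls how quickly the weight decays with distance.
    """
    weights = [math.exp(-0.5 * ((x_query - x) / bandwidth) ** 2) for x in xs]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed (query far from every point):
        # fall back to the y of the nearest known x.
        nearest = min(range(len(xs)), key=lambda i: abs(xs[i] - x_query))
        return ys[nearest]
    return sum(w * y for w, y in zip(weights, ys)) / total

# Example points from the question: velocity (X) and real speed (Y).
velocity = [0.154487, 0.548791, 1.789841]
speed = [35.0, 180.0, 500.0]

# Try a few bandwidths to see how strongly the tuning parameter matters.
for bandwidth in (0.05, 0.3, 1.0):
    estimate = kernel_regression(1.0, velocity, speed, bandwidth)
    print(f"bandwidth={bandwidth:<5} estimate at velocity 1.0: {estimate:.2f}")
```

A small bandwidth reproduces the known points almost exactly but behaves like a step function between them, while a large bandwidth smooths everything toward the average. With only three points any such model is badly underdetermined, so this is useful mainly when the real table has many more rows.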

Each of the above methods is implemented in a large number of languages. Some introductory material on nonlinear regression in R can be found here: http://ww2.coastal.edu/kingw/statistics/R-tutorials/simplenonlinear.html