Least-squares fit for signal data (360 points)


I would like to analyze data to get the maximum value out of 360 points. I used least-squares fitting because the data are signal-strength measurements. I want to remove any outliers from my data, which are quite likely since the signal strength can be very unreliable at times. The data set is large (360 points), which will require a very high-degree polynomial. I would like help with this, since I tried using a very high-degree polynomial in Matlab and it didn't work. I also tried dividing the data into chunks, processing them separately, and then taking the highest value among them, but that will require a lot of processing when I convert the algorithm from Matlab to C++.


There are 2 best solutions below

11
On

It is a bad idea to use polynomials of degree greater than 7 or so (just a rule of thumb), because you will get enormous oscillations that do not represent your data at all. You should first make some assumption about what those data points should look like; otherwise you cannot make any statement about what outliers look like, whether there is a maximum, etc.

Can you tell us a bit more about where you got your data points from, and perhaps post a plot of them so we get an impression of what they look like?

0
On

An example of the problems with polynomial fitting is discussed in "Polynomial best fit line for very large values".

The optimal degree of fit can't be quantified without understanding the data and what you want from the fit (interpolation, integration, etc.). In general, you can expect the total error to drop initially as the order of fit $d$ increases. It may then hit a minimum, after which the error increases as $d$ grows, or it may plateau.
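To make the behavior of the error versus $d$ concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the synthetic bump-shaped signal, the noise level, the rescaling of the sample positions to $[-1, 1]$, and the helper names `solve_linear`, `polyfit`, `polyval` and `rms_error`. It fits on every second sample and checks the error on the samples in between, which is one way to see where extra degrees stop helping.

```python
import math
import random

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def polyfit(xs, ys, degree):
    """Least-squares coefficients c[0] + c[1]*x + ... via the normal equations."""
    n = degree + 1
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    return solve_linear(a, b)

def polyval(coeffs, x):
    """Evaluate the polynomial with Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def rms_error(coeffs, xs, ys):
    """Root-mean-square residual of the fit on the given points."""
    total = sum((y - polyval(coeffs, x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(total / len(xs))

# Hypothetical data: 360 samples of a smooth bump plus Gaussian noise.
random.seed(0)
N = 360
xs = [-1.0 + 2.0 * i / (N - 1) for i in range(N)]  # positions rescaled to [-1, 1]
ys = [math.exp(-((x - 0.2) / 0.4) ** 2) + random.gauss(0.0, 0.05) for x in xs]

fit_x, fit_y = xs[0::2], ys[0::2]      # samples used for fitting
check_x, check_y = xs[1::2], ys[1::2]  # samples held back for checking

train_rms, check_rms = [], []
for d in range(9):
    coeffs = polyfit(fit_x, fit_y, d)
    train_rms.append(rms_error(coeffs, fit_x, fit_y))
    check_rms.append(rms_error(coeffs, check_x, check_y))
    print(f"d={d}  fit RMS={train_rms[-1]:.4f}  held-out RMS={check_rms[-1]:.4f}")
```

Rescaling the sample positions to $[-1, 1]$ keeps the normal equations reasonably conditioned for these low degrees; a library routine such as `numpy.polyfit` would normally replace the hand-written solver.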

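The values of the amplitudes (the fit coefficients) will oscillate, and you may wish to look at the amplitudes for an orthogonal polynomial set instead. Remember, though, that polynomials that are orthogonal over a continuous interval are in general no longer exactly orthogonal when sampled at discrete points.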
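The loss of orthogonality under sampling can be checked numerically. The sketch below assumes Legendre polynomials and 360 equally spaced points on $[-1, 1]$ (both are illustrative choices, not something stated in the question). Over the continuous interval the integral of $P_0 P_2$ is zero, but the corresponding discrete average is small and nonzero.

```python
def legendre_p0(x):
    return 1.0

def legendre_p2(x):
    return 0.5 * (3.0 * x * x - 1.0)

N = 360
xs = [-1.0 + 2.0 * i / (N - 1) for i in range(N)]  # equally spaced, endpoints included

# Discrete analogue of (1/2) * integral of P0(x) * P2(x) over [-1, 1], which is 0.
discrete_mean = sum(legendre_p0(x) * legendre_p2(x) for x in xs) / N

# For this particular grid the average works out analytically to 1 / (N - 1).
print(discrete_mean, 1.0 / (N - 1))
```

The residual coupling shrinks as the number of points grows, so with 360 samples the effect is modest, but the fitted amplitudes are not strictly independent of each other.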

Finally, look at the error in the fit amplitudes. You typically see that the higher-order amplitudes drop below their own uncertainty, at which point you are spending a lot of computation time calculating very expensive zeros.

These errors will give you the ability to quantitatively remove data points using a standard deviation threshold. For example, exclude all points that are five standard deviations away from the prediction.
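One way to apply such a threshold is to fit, drop the points whose residual exceeds the threshold, and refit until nothing more is dropped. The following is a minimal sketch under stated assumptions: the synthetic signal, the three injected outliers, the degree of 6, and the helper names (`solve_linear`, `polyfit`, `polyval`, `sigma_clip_fit`) are all hypothetical, and the root-mean-square residual is used as the estimate of the standard deviation.

```python
import math
import random

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients via the normal equations."""
    n = degree + 1
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    return solve_linear(a, b)

def polyval(coeffs, x):
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def sigma_clip_fit(xs, ys, degree=6, n_sigma=5.0, max_rounds=10):
    """Fit, reject points beyond n_sigma residual deviations, and refit."""
    keep = list(range(len(xs)))
    for _ in range(max_rounds):
        coeffs = polyfit([xs[i] for i in keep], [ys[i] for i in keep], degree)
        residuals = [ys[i] - polyval(coeffs, xs[i]) for i in keep]
        sigma = math.sqrt(sum(r * r for r in residuals) / len(residuals))
        new_keep = [i for i, r in zip(keep, residuals) if abs(r) <= n_sigma * sigma]
        if len(new_keep) == len(keep):
            return coeffs, keep  # nothing rejected, so the fit is final
        keep = new_keep
    # Round limit reached: refit once on the points that survived.
    return polyfit([xs[i] for i in keep], [ys[i] for i in keep], degree), keep

# Hypothetical data: a smooth bump peaking at x = 0.2, noise, and three outliers.
random.seed(1)
N = 360
xs = [-1.0 + 2.0 * i / (N - 1) for i in range(N)]
ys = [math.exp(-((x - 0.2) / 0.6) ** 2) + random.gauss(0.0, 0.05) for x in xs]
outlier_indices = [50, 180, 300]
for i in outlier_indices:
    ys[i] += 3.0

coeffs, kept = sigma_clip_fit(xs, ys)
fitted = [polyval(coeffs, x) for x in xs]
peak_index = max(range(N), key=lambda i: fitted[i])
print("raw maximum:", max(ys))
print("fitted maximum:", fitted[peak_index], "at x =", xs[peak_index])
print("points rejected:", N - len(kept))
```

Taking the maximum of the fitted curve rather than of the raw samples keeps a single spike from being reported as the peak. Each round costs one small linear solve, so a port to C++ should not be expensive for 360 points.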

If we can get to your data we can provide more insight.