Calculate $1^{st}$ and $99^{th}$ percentiles without sorting data


Is there a method of calculating percentiles (even approximately) without sorting data?

I have a lot of data points and I need to find rough min and max values while ignoring some spikes in the data, so I settled on finding the $1^{st}$ and $99^{th}$ percentiles of the data.

A sample of the data is below. It comes unsorted and there are millions of data points. In this scenario, I am trying to get min and max while ignoring the outliers, like $-0.9361269409$ and $0.004595814656$.

Is there a way I can do it using standard deviation or weighted average? Or would those processes be computationally as expensive as just sorting the data?

Alternatively, is there a better algorithm for finding approximate min max while ignoring outliers?
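To make the question concrete, here is a minimal sketch of one sort-free approach: keep only the few most extreme values at each end with a bounded heap, then read off the first value past them. The function name `trimmed_min_max` and the `fraction` parameter are hypothetical, chosen for illustration, and the sketch assumes the data fits in memory as a list.

```python
import heapq


def trimmed_min_max(values, fraction=0.01):
    """Return (low, high) after skipping the most extreme `fraction`
    of points at each end, without sorting the whole data set."""
    data = list(values)
    n = len(data)
    if n == 0:
        raise ValueError("values must not be empty")
    # Number of points to ignore at each end (hypothetical rule: floor(n * fraction)).
    skip = min(int(n * fraction), n - 1)
    # nsmallest/nlargest keep a heap of only skip + 1 items: O(n log(skip + 1)).
    low = heapq.nsmallest(skip + 1, data)[-1]
    high = heapq.nlargest(skip + 1, data)[-1]
    return low, high
```

With `fraction=0.01` and a few million points, each heap holds tens of thousands of values, so the cost is still well below a full sort. Spikes that sit next to each other do not matter here, because only the values are compared, not their positions.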

Note that there may be hundreds of these outlier spikes. Some of them may also be adjacent to each other, so it's not as simple as tracking the min/max of 3 or 5 adjacent data points while looping through the data.

I am not asking for the code, just for the process. Thanks.

I saw there is a formula here to get the quantile from a table, but I can't seem to get the $99^{th}$ percentile with it. If I add instead of subtracting, I get a value larger than the max: https://www.statology.org/calculate-percentile-from-mean-standard-deviation/
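For reference, a minimal sketch of the mean-and-standard-deviation approach as I understand it: estimate a percentile as mean + z * sd, with z ≈ -2.326 for the $1^{st}$ percentile and z ≈ +2.326 for the $99^{th}$. The function name `normal_percentile_bounds` is hypothetical. The formula assumes the data is roughly normally distributed, which spiky or skewed data is not, so the estimate can land outside the actual min/max.

```python
from statistics import NormalDist


def normal_percentile_bounds(values, p=0.01):
    """Estimate the p and (1 - p) quantiles as mean -/+ z * sd,
    assuming (possibly wrongly) that the data is normally distributed."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the mean
    # Welford's single-pass update: no sorting, no second pass over the data.
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    if n < 2:
        raise ValueError("need at least two values")
    sd = (m2 / (n - 1)) ** 0.5  # sample standard deviation
    z = NormalDist().inv_cdf(1.0 - p)  # about 2.326 for p = 0.01
    return mean - z * sd, mean + z * sd
```

This is a single O(n) pass, but the mean and standard deviation are themselves pulled around by the spikes, so the resulting bounds are only as good as the normality assumption.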

-0.1623092785
-0.1623014363
-0.3849972863
-0.1623115257
-0.1622996619
-0.162292793
-0.02106845502
-0.01874575794
-0.02084289483
-0.02071095457
0.004595814656
-0.9361269409
-0.3769501916
-0.02146115653
-0.008397232259
-0.1622916339
-0.0217571923