Upper values of a data set


I have a data set of 840 samples, each in the range [0, 100]. Plotted as a scatter chart, it looks like this:

[image: scatter plot of the data set]

As the chart shows, some points are concentrated in the top part (around 80 on average in this example). I need to filter the data set to select only the samples that belong to this trend:

[image: the upper trend to be selected]

Can you recommend an approach or algorithm for this?

Best answer

Spectral clustering would probably identify those two groups (or it might even identify a third and fourth, at the lower left).
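For illustration, here is a minimal sketch of a two-way spectral split written with NumPy only. It assumes that only the sample values matter (the $x$ position is ignored), and it runs on synthetic stand-in data, since the real data set is not available. The function name `spectral_bipartition`, the kernel width `sigma`, and the test data are all assumptions made for this example. A library implementation such as scikit-learn's `SpectralClustering` would be the more usual choice in practice, especially if more than two groups are expected.

```python
import numpy as np

def spectral_bipartition(values, sigma=5.0):
    """Split 1-D samples into two groups via the normalized-Laplacian Fiedler vector.

    Returns a boolean mask that is True for the group with the higher mean.
    `sigma` (the Gaussian kernel width) is an assumed tuning parameter.
    """
    v = np.asarray(values, dtype=float)
    # Gaussian (RBF) affinity between every pair of samples.
    diff = v[:, None] - v[None, :]
    W = np.exp(-diff**2 / (2.0 * sigma**2))
    np.fill_diagonal(W, 0.0)
    degree = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}.
    L = np.eye(len(v)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; column 1 is the
    # eigenvector of the second-smallest eigenvalue.
    _, vecs = np.linalg.eigh(L)
    fiedler = vecs[:, 1] * d_inv_sqrt
    mask = fiedler > 0
    # The sign of an eigenvector is arbitrary, so orient the mask
    # so that True marks the group with the higher mean value.
    if v[mask].mean() < v[~mask].mean():
        mask = ~mask
    return mask

# Synthetic stand-in data (an assumption, not the asker's samples):
# 600 values spread over [0, 55] and 240 values clustered in [75, 85].
values = np.concatenate([np.linspace(0.0, 55.0, 600),
                         np.linspace(75.0, 85.0, 240)])
upper_mask = spectral_bipartition(values)
upper_samples = values[upper_mask]
print(len(upper_samples), upper_samples.min(), upper_samples.max())
```

The split relies on a visible gap between the upper band and the rest. If the two groups overlap, `sigma` would need tuning and the sign-based cut may land somewhere less useful.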

I should mention that from a data science point of view, it's probably not a good approach -- at least not unless the $x$-axis means something other than "these are the numbers I used in gathering the data".