How to effectively detect outliers in a feature space?


Suppose I have $N$-dimensional feature vectors. Some of these vectors are outliers, meaning that their Euclidean distance from the true center of the feature vectors is larger than a certain threshold $K$. The outliers make up only a small proportion of the total number of feature vectors. How can I effectively detect them?

At first, I thought of taking the average as a pseudo-center and then removing the vectors whose distance from it is larger than a certain threshold $K'$. However, I am not sure whether this method works. Are there cases where taking the average will fail? If so, how can I solve this problem effectively without incurring too much computational cost?
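
To make the idea concrete, here is a minimal sketch of the averaging approach described above, written with NumPy. The function name `flag_outliers`, the threshold value, and the synthetic data are all hypothetical and chosen only for illustration. The `"median"` option (coordinate-wise median as the pseudo-center) is not part of the method above; it is included purely as a point of comparison, as one cheap alternative that is less sensitive to extreme values.

```python
import numpy as np

def flag_outliers(X, threshold, center="mean"):
    """Return a boolean mask: True where a row of X lies farther than
    `threshold` (Euclidean distance) from the estimated center."""
    X = np.asarray(X, dtype=float)
    if center == "mean":
        c = X.mean(axis=0)          # the average used as pseudo-center
    elif center == "median":
        c = np.median(X, axis=0)    # coordinate-wise median (comparison only)
    else:
        raise ValueError("center must be 'mean' or 'median'")
    dist = np.linalg.norm(X - c, axis=1)  # distance of each row to the center
    return dist > threshold

# Hypothetical data: 95 inliers around the origin, 5 outliers clustered far away.
rng = np.random.default_rng(0)
inliers = rng.normal(0.0, 1.0, size=(95, 5))
outliers = np.full((5, 5), 100.0)
X = np.vstack([inliers, outliers])

mask_mean = flag_outliers(X, threshold=6.0, center="mean")
mask_median = flag_outliers(X, threshold=6.0, center="median")
print("flagged with mean center:  ", int(mask_mean.sum()))
print("flagged with median center:", int(mask_median.sum()))
```

In a setup like this, the outliers all sit on the same side, so they pull the average toward themselves (to roughly 5 in every coordinate here). The inliers can then end up farther than the threshold from the pseudo-center and be flagged as well. Both variants cost one pass over the data to estimate the center and one pass to compute distances.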