Estimating the probability that a point was sampled from a distribution

Assume we have a d-dimensional vector space. We don't have a parametric distribution function; we only have a set of samples in this space, which we assume were drawn from a single unknown distribution. What is the best way to compute the probability that a given new point in this space was sampled from the same unknown distribution as the data?

Best answer

The question itself is incorrectly posed:

"...probability that a given new point ... is sampled from the unknown distribution..."

To answer that, you need a set of alternative distributions the point might have come from instead, as well as a prior probability that your (initial) distribution is the true one.
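To make this concrete, here is a sketch of the Bayesian calculation this would require. The notation is my own assumption, not part of the question: $f_0$ is the density of the unknown distribution, $f_1, \dots, f_K$ are the densities of the alternatives, and $\pi_0, \dots, \pi_K$ are their prior probabilities (summing to 1). Writing $H_0$ for "the point $x$ came from $f_0$",

$$
P(H_0 \mid x) = \frac{\pi_0 \, f_0(x)}{\sum_{k=0}^{K} \pi_k \, f_k(x)}.
$$

Without the alternatives $f_k$ and the priors $\pi_k$, the denominator is undefined, so the samples alone cannot give you this probability.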

For example, suppose you have a distribution that allows large outliers (e.g. Cauchy). Then whenever you get an outlier, it will appear completely incompatible with the points you've had so far, even though it was drawn from the same distribution.
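A small simulation can illustrate this. It is only a sketch: the sample size, the seed, and the comparison against a standard normal are my own choices for illustration. It draws standard Cauchy values by inverse-CDF sampling and compares the typical magnitude with the largest one.

```python
import math
import random

random.seed(0)  # fixed seed so the run is repeatable (arbitrary choice)
n = 10_000      # illustrative sample size

# Inverse-CDF sampling: if U ~ Uniform(0, 1), tan(pi * (U - 0.5)) is standard Cauchy.
cauchy = [math.tan(math.pi * (random.random() - 0.5)) for _ in range(n)]
normal = [random.gauss(0.0, 1.0) for _ in range(n)]


def summarize(xs):
    """Return (median of |x|, maximum of |x|) for a list of numbers."""
    abs_sorted = sorted(abs(x) for x in xs)
    return abs_sorted[len(abs_sorted) // 2], abs_sorted[-1]


for name, xs in [("normal", normal), ("cauchy", cauchy)]:
    med, mx = summarize(xs)
    print(f"{name}: median |x| = {med:.2f}, max |x| = {mx:.1f}")
```

Both samples have a median magnitude of order 1, but the largest Cauchy draw is typically orders of magnitude bigger. That draw is a perfectly legitimate sample, yet it looks nothing like the bulk of the data.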

However, if you'd like to know "how similar is the new point to the ones already observed", you might try a kernel density estimator (KDE). Based on the previous sample $x_1, \dots, x_n$, compute a KDE $f(x)$ and then evaluate it at the new observation $x$. A low value of $f$ there suggests the point is unlike the sample, so you may be able to detect outliers this way.
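Here is a minimal sketch of that idea in NumPy, under several assumptions of mine: a Gaussian product kernel, a per-dimension bandwidth from Scott's rule of thumb, and an ad hoc cutoff at the 1st percentile of the training points' own scores. The function name `gaussian_kde_logpdf` is hypothetical. In practice a library estimator such as `scipy.stats.gaussian_kde` would do the same job.

```python
import numpy as np


def gaussian_kde_logpdf(samples, x, bandwidth=None):
    """Log-density of a Gaussian product-kernel KDE built on `samples`.

    samples: array of shape (n, d); x: array of shape (d,) or (m, d).
    Returns an array of m log-density values.
    """
    samples = np.asarray(samples, dtype=float)
    x = np.atleast_2d(np.asarray(x, dtype=float))
    n, d = samples.shape
    if bandwidth is None:
        # Scott's rule of thumb, applied per dimension (an assumption).
        bandwidth = samples.std(axis=0, ddof=1) * n ** (-1.0 / (d + 4))
    h = np.broadcast_to(np.asarray(bandwidth, dtype=float), (d,))

    # Standardized differences between query and sample points: (m, n, d).
    z = (x[:, None, :] - samples[None, :, :]) / h
    # Log of each kernel: sum over dimensions of a 1-D normal log-density.
    log_k = (-0.5 * np.sum(z ** 2, axis=2)
             - np.sum(np.log(h)) - 0.5 * d * np.log(2.0 * np.pi))
    # Average the n kernels in log space (log-sum-exp) to avoid underflow.
    top = log_k.max(axis=1, keepdims=True)
    return top[:, 0] + np.log(np.exp(log_k - top).sum(axis=1)) - np.log(n)


rng = np.random.default_rng(0)
train = rng.normal(size=(500, 2))   # stand-in for the observed sample
typical = np.array([0.1, -0.2])     # a point near the bulk of the data
far = np.array([6.0, 6.0])          # a point far from all samples

# Ad hoc cutoff: 1st percentile of the in-sample scores (an assumption).
threshold = np.percentile(gaussian_kde_logpdf(train, train), 1)

for name, point in [("typical", typical), ("far", far)]:
    score = gaussian_kde_logpdf(train, point)[0]
    print(f"{name}: log-density = {score:.2f}, flagged = {score < threshold}")
```

Note that the score is a density, not a probability that the point "belongs" to the distribution. It only ranks points by how well they agree with the observed sample, and in high dimensions a KDE needs a lot of data to be reliable.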