Probabilistic regression on outliers


I have a data set $D = \{ (x_i, y_i) \}_{i=1}^n$ for a regression problem. When I plot the data, it looks like there is an underlying parabola (a second-order polynomial, so still a model that is linear in its parameters) plus some outliers.

I want to design an approach using a probabilistic model with a latent binary variable taking values in $\{ 0,1 \}$ for each data point, indicating whether that point is an outlier or not.

Currently I have no idea how to proceed. What would the parameters be in this case, and how would they be optimized? Would Expectation Maximization be a suitable approach?

There is 1 best solution below


My recommendation is to use robust regression. It is simpler than a latent-variable model, and it downweights the outliers automatically instead of requiring you to label them explicitly.
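
To make the suggestion concrete, here is a minimal sketch of one possible robust fit for the parabola. The answer does not say which robust estimator is meant, so the Huber loss, the iteratively reweighted least squares (IRLS) scheme, the threshold `delta=1.345`, the function name `huber_quadratic_fit`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def huber_quadratic_fit(x, y, delta=1.345, n_iter=50, tol=1e-8):
    """Fit y ~ b0 + b1*x + b2*x^2 with a Huber loss via IRLS.

    Hypothetical helper for illustration. Returns the coefficients and the
    weights from the last reweighting step (small weight = likely outlier).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix for a second-order polynomial.
    X = np.column_stack([np.ones_like(x), x, x ** 2])
    # Start from the ordinary least-squares solution.
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    weights = np.ones_like(y)
    for _ in range(n_iter):
        resid = y - X @ beta
        # Robust residual scale: median absolute deviation (MAD).
        scale = 1.4826 * np.median(np.abs(resid - np.median(resid)))
        scale = max(scale, 1e-12)
        u = np.abs(resid) / scale
        # Huber weights: 1 for small residuals, delta/u for large ones.
        weights = np.where(u <= delta, 1.0, delta / np.maximum(u, 1e-12))
        # Weighted least squares via row scaling by sqrt(weight).
        sw = np.sqrt(weights)
        beta_new = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta, weights


# Synthetic example (assumed data): a parabola with three gross outliers.
rng = np.random.default_rng(0)
x = np.linspace(-3.0, 3.0, 60)
y = 1.0 + 2.0 * x - 0.5 * x ** 2 + rng.normal(0.0, 0.1, size=x.size)
outlier_idx = np.array([5, 20, 40])
y[outlier_idx] += 15.0

beta, weights = huber_quadratic_fit(x, y)
print("coefficients:", beta)
print("weights at outliers:", weights[outlier_idx])
```

The returned weights play a role similar to the latent outlier indicator in the question, except that they are continuous rather than binary: points with a weight well below 1 are the ones the fit treats as suspicious.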