Test for Validity of Artificial Samples

I have a model that was fitted to some observed samples. I then use the model to generate artificial data.

My question is: which test should I use to check whether the artificial data is as valid as the original observed data, i.e. consistent with it? (My data is based on an extreme value distribution, so a test specific to that case would be preferable, if one exists.) Thank you very much!

There is 1 best solution below

I'm not sure to what extent your question makes sense, but I'm pretty sure the answer I'm about to give is not the one you want.

You can use Kolmogorov-Smirnov (or Anderson-Darling I suppose) to check if your data (i.e. the empirical distribution) is significantly different from your modeled distribution. However, my attitude in general is that if the K-S test is the answer, you probably asked the wrong question.
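As a rough illustration of such a check, here is a minimal sketch that compares observed and artificial samples with a two-sample K-S statistic. Everything specific in it is an assumption made for the example: the Gumbel distribution, the method-of-moments fit standing in for "the model", the sample sizes, and the helper names (gumbel_sample, fit_gumbel_moments, ks_two_sample, kolmogorov_sf). The p-value uses the plain asymptotic Kolmogorov approximation, so treat it as indicative only.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329


def gumbel_sample(rng, loc, scale, n):
    """Draw n Gumbel (max) variates by inverting the CDF exp(-exp(-(x - loc) / scale))."""
    # max(..., 1e-300) guards against the (very unlikely) draw u == 0.
    return [loc - scale * math.log(-math.log(max(rng.random(), 1e-300)))
            for _ in range(n)]


def fit_gumbel_moments(data):
    """Method-of-moments Gumbel fit (a stand-in for the real model)."""
    scale = statistics.stdev(data) * math.sqrt(6.0) / math.pi
    loc = statistics.fmean(data) - EULER_GAMMA * scale
    return loc, scale


def kolmogorov_sf(lam):
    """Asymptotic tail probability of the Kolmogorov distribution."""
    if lam < 0.2:  # the series converges slowly here and the value is ~1
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, 101))
    return min(1.0, max(0.0, 2.0 * total))


def ks_two_sample(a, b):
    """Return (D, approximate p-value) for the two-sample K-S test."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        # Step both empirical CDFs past x (this also handles ties).
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    lam = math.sqrt(n * m / (n + m)) * d
    return d, kolmogorov_sf(lam)


rng = random.Random(0)
observed = gumbel_sample(rng, loc=10.0, scale=2.0, n=200)    # pretend these are real data
loc_hat, scale_hat = fit_gumbel_moments(observed)            # "learn" the model
artificial = gumbel_sample(rng, loc_hat, scale_hat, n=200)   # generate artificial data

d_stat, p_value = ks_two_sample(observed, artificial)
print(f"D = {d_stat:.3f}, approximate p-value = {p_value:.3f}")
```

In practice you would more likely call a library routine such as scipy.stats.ks_2samp than hand-roll the statistic; the version above just keeps the example dependency-free.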

One problem with this approach is that if you have a lot of data, K-S will probably reject the model even if it's a fairly good model; and if you have little data, K-S will probably fail to reject the model (which is not the same as confirming it) unless it is a particularly bad fit.
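The sample-size effect can be seen in a small simulation. This is only a sketch under assumptions chosen for the example: the data come from a Gumbel distribution with scale 2.0, the "model" is a Gumbel with scale 2.2 (slightly wrong, but arguably not a bad model), and the helper names are hypothetical. The same one-sample K-S test is run on a small and on a large sample.

```python
import math
import random


def gumbel_sample(rng, loc, scale, n):
    """Draw n Gumbel (max) variates by inverse-CDF sampling."""
    return [loc - scale * math.log(-math.log(max(rng.random(), 1e-300)))
            for _ in range(n)]


def gumbel_cdf(x, loc, scale):
    return math.exp(-math.exp(-(x - loc) / scale))


def kolmogorov_sf(lam):
    """Asymptotic tail probability of the Kolmogorov distribution."""
    if lam < 0.2:
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, 101))
    return min(1.0, max(0.0, 2.0 * total))


def ks_one_sample(data, cdf):
    """Return (D, approximate p-value) for a one-sample K-S test against cdf."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d, kolmogorov_sf(math.sqrt(n) * d)


rng = random.Random(1)
results = {}
for n in (50, 50_000):
    data = gumbel_sample(rng, loc=0.0, scale=2.0, n=n)                # true scale: 2.0
    d, p = ks_one_sample(data, lambda x: gumbel_cdf(x, 0.0, 2.2))     # model scale: 2.2
    results[n] = (d, p)
    print(f"n = {n:>6}: D = {d:.4f}, approximate p-value = {p:.3g}")
```

The mismatch between the two distributions is the same in both runs; only the amount of data changes. With the large sample the test has enough power to flag even this small discrepancy.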

However, the truly important question, which this approach does not address, is usually the following (whatever question you were originally interested in):

Is there an alternative model which fits the data equally well (or roughly so), but which would otherwise lead you to a different conclusion?
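To make that concrete, here is a hedged sketch that fits two candidate models to the same small sample and compares both their K-S fit and the conclusion each would give for a tail quantile. The specifics are assumptions for illustration only: Gumbel data, a Gumbel and a normal candidate (both fitted by matching mean and standard deviation), and the 0.999 quantile as the quantity of interest. Because the parameters are estimated from the same data, the p-values are optimistic and should be read as indicative at best.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329


def kolmogorov_sf(lam):
    """Asymptotic tail probability of the Kolmogorov distribution."""
    if lam < 0.2:
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, 101))
    return min(1.0, max(0.0, 2.0 * total))


def ks_one_sample(data, cdf):
    """Return (D, approximate p-value) for a one-sample K-S test against cdf."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d, kolmogorov_sf(math.sqrt(n) * d)


rng = random.Random(2)
# Hypothetical observed data: 60 Gumbel(loc=10, scale=2) draws.
data = [10.0 - 2.0 * math.log(-math.log(max(rng.random(), 1e-300)))
        for _ in range(60)]

mean, sd = statistics.fmean(data), statistics.stdev(data)

# Candidate 1: Gumbel fitted by the method of moments.
scale = sd * math.sqrt(6.0) / math.pi
loc = mean - EULER_GAMMA * scale

# Candidate 2: normal with the same mean and standard deviation.
normal = statistics.NormalDist(mean, sd)

candidates = {
    "gumbel": (lambda x: math.exp(-math.exp(-(x - loc) / scale)),
               lambda q: loc - scale * math.log(-math.log(q))),
    "normal": (normal.cdf, normal.inv_cdf),
}

summary = {}
for name, (cdf, quantile) in candidates.items():
    d, p = ks_one_sample(data, cdf)
    summary[name] = (d, p, quantile(0.999))
    print(f"{name}: D = {d:.3f}, p ~ {p:.3f}, 0.999 quantile = {summary[name][2]:.2f}")
```

With both candidates matched on mean and standard deviation, the Gumbel's 0.999 quantile sits at roughly mean + 4.9 sd and the normal's at roughly mean + 3.1 sd. If a goodness-of-fit test cannot separate two such models on your data, it also cannot tell you which tail estimate to believe.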

Without knowing the details of what the original question is, it's hard to be more concrete.