Testing goodness of fit using Kolmogorov-Smirnov test


I want to check whether two probability distributions (one experimental, one theoretical) are the same. The distributions are not normal, so I decided to use the KS test. I used the MATLAB function kstest2 and got a p-value of 1. This means I cannot reject the null hypothesis that the two distributions are the same. I have two main concerns:

  1. Does this mean I can accept the null hypothesis? I'm confused by the statement "fail to reject the null hypothesis".
  2. What is the p-value for the hypothesis that the distributions are the same? Can I calculate it as $1-p$? I'm interested in testing whether my theory is correct and want to report a p-value for that.

https://se.mathworks.com/help/stats/kstest2.html
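For readers outside MATLAB, the analogous two-sample call in SciPy looks like this (a sketch only; the array names and the exponential distribution are illustrative stand-ins for the asker's data):

```python
# Python analogue of MATLAB's kstest2, using scipy.stats.ks_2samp.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample_a = rng.exponential(scale=2.0, size=200)  # stand-in "experimental" data
sample_b = rng.exponential(scale=2.0, size=200)  # second sample

# Null hypothesis: both samples come from the same distribution.
result = stats.ks_2samp(sample_a, sample_b)
print(result.statistic, result.pvalue)

# A large p-value means we fail to reject the null; it does not prove the null.
```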

---

Let's review some basics on which you may be confused.

In a p-value test we have a hypothesis, called the null hypothesis, under which probabilities are computable; we then use a p-value to quantify how well that hypothesis "fits" some data. The p-value is the probability, conditional on the null hypothesis, that the data would be at least as surprising, relative to the expectations of that hypothesis, as it in fact was. (This glosses over the difference between one- and two-tailed tests; in a one-tailed test, the p-value is the probability that the data would be at least this surprising in the direction in which it is surprising.)

In this example, the null hypothesis is that the distributions are the same, so $p$ is already the p-value for that hypothesis. The only event that we know has probability $1-p$ is that the data would be less "surprising", again conditional on the null hypothesis. We certainly can't do another test in which the role of null hypothesis switches to the opposite of what it was before; "the distributions differ" doesn't allow us to calculate p-values.

I think that answers your second question. As for the first, the reason we talk about "failing to reject" the null hypothesis is that you can't prove it; you can only disprove it, or be impressed that it survived the attempt. As for what you can do in this example, I suggest you double-check a p-value of 1. Such a p-value means the data is as consistent with the distributions being the same as it could possibly be. With data drawn from a continuous distribution, this is suspicious.
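As a sanity check on a p-value of exactly 1, here is a SciPy sketch (the question used MATLAB, so this is just the analogous call; the normal samples are illustrative). Testing a sample against a copy of itself gives $D = 0$ and $p = 1$; two independent draws from the same continuous distribution essentially never do:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(size=100)

# Identical samples: the two empirical CDFs coincide, so D = 0 and p = 1.
same = stats.ks_2samp(x, x)

# Independent draws from the same continuous distribution: D > 0,
# and under the null the p-value is (approximately) uniform on (0, 1),
# so p = 1 exactly would be extraordinary.
y = rng.normal(size=100)
indep = stats.ks_2samp(x, y)
print(same.pvalue, indep.pvalue)
```

If a two-sample routine returns exactly 1, accidentally comparing an array with itself is a common culprit.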

---

If I'm understanding your question correctly, you have one distribution given by a formula (theoretical) and another given by data (experimental). Since only one of your distributions comes from a sample, you should be using the $\textit{one-sample}$ K-S test. That test is designed for exactly what you have in mind (i.e., determining whether the underlying distribution of your experimental data is the theoretical one you have).

The two sample test is for determining whether two experimental distributions have the same underlying theoretical distribution.
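The distinction looks like this in SciPy (a sketch; MATLAB's one-sample equivalent is kstest, and the exponential distribution and parameters here are illustrative). The key difference is that the one-sample test takes the theoretical CDF itself, not a second sample:

```python
# One-sample KS test: compare experimental data against a theoretical CDF.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
data = rng.exponential(scale=3.0, size=500)  # stand-in for experimental data

# Null hypothesis: the data were drawn from Expon(scale=3).
# Pass the theoretical CDF directly rather than a second sample.
result = stats.kstest(data, stats.expon(scale=3.0).cdf)
print(result.statistic, result.pvalue)
```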

Now on to $p$-values. I don't like the whole "null-hypothesis" language, as I think it's overly confusing. The thing to get used to in statistics is that there is no absolute notion of true vs. false when it comes to experimental data. It's all about degrees of confidence.

So take, for example, flipping a fair coin. The theoretical distribution is a discrete distribution with heads and tails each having probability $\frac{1}{2}$. If I were to flip a coin and get 100 heads in a row, what would that mean? Does it mean my coin isn't fair?

No, it only means that the coin is very unlikely to be fair. I suggest you work out the one-sample K-S test for this example; it is very illuminating.
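Since the K-S test is geared toward continuous distributions, one hedged way to put an actual number on the 100-heads example is an exact binomial test instead (a SciPy sketch, not part of the original answer; the counts come from the example above):

```python
# How unlikely is 100 heads in 100 flips under a fair-coin null?
from scipy import stats

result = stats.binomtest(k=100, n=100, p=0.5)
print(result.pvalue)  # roughly 2 * 0.5**100, i.e. astronomically small
```

A p-value this small is why nearly anyone would conclude the coin is not fair, even though "not fair" is never proven with certainty.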

Finally, if I asked you to decide whether the coin was fair in this case, you would probably say no. This is what the $p$-value quantifies: how surprising the observed data would be if the null hypothesis were true. How small the $p$-value must be before you draw conclusions from the data is a threshold you choose in advance; there is no set-in-stone preferred value, it just depends on the application.

---

You haven't told us much, in particular what sample size you used. If you make a quick, superficial search for something and don't find it, that doesn't prove it isn't there. Likewise, if a very small sample is used and your test fails to reject the null hypothesis, that doesn't mean the null hypothesis is true; it just means you haven't looked very hard for whatever evidence against it may be there.
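The sample-size point can be made concrete with a SciPy sketch (the distributions and sizes are illustrative): two genuinely different populations that a large sample distinguishes easily will often slip past a tiny sample.

```python
# Same true difference, different sample sizes: small samples lack power.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
pop_a = stats.norm(loc=0.0, scale=1.0)
pop_b = stats.norm(loc=0.5, scale=1.0)  # genuinely different mean

# Tiny samples: the test frequently fails to reject despite the real difference.
small = stats.ks_2samp(pop_a.rvs(10, random_state=rng),
                       pop_b.rvs(10, random_state=rng))

# Large samples: the same difference is detected decisively.
large = stats.ks_2samp(pop_a.rvs(2000, random_state=rng),
                       pop_b.rvs(2000, random_state=rng))
print(small.pvalue, large.pvalue)
```

So a non-rejection at n = 10 carries far less weight than one at n = 2000; a p-value of 1 from a small sample says very little about whether the theory is correct.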