How to test a hypothesis on statistical data?


I have a statistical problem.

In a city there are some hostels which differ by the number of rooms. The input data are the following.

The table below gives the number of hostels in each room-count range.

\begin{array}{|c|c|c|c|c|c|c|} \hline \text{Room count} & 0-30 & 30-60 & 60-90 & 90-120 & 120-150 & 150-180 \\ \hline \text{Number of objects} & 66 & 28 & 15 & 9 & 1 & 1 \\ \hline \end{array}

In this problem I have to calculate some statistical parameters such as the mean, the standard deviation and so on. That part is not the issue.

What I can't do is the following: how do I test the hypothesis that the average object has more than 30 rooms at a confidence level of 0.95?

I think I somehow have to use the Laplace integral function, but I'm not sure that's correct. Also, I don't think the data follow a normal distribution.

How should I test this hypothesis?


There are 1 best solutions below


You can formulate this as a binomial test of a proportion and use either the Wilson score interval or the Clopper-Pearson interval.

Specifically: let our sample space be all hostels under study and assume random sampling. Let X be a random variable that takes the value 1 if a selected hostel has at most 30 rooms and 0 otherwise. For a sample of N hostels, the sum of the N X's will have a binomial(N, p) distribution, where p is the probability that a hostel has at most 30 rooms. What we want to test is:

H0: p = 0.5 vs Ha: p < 0.5

If p < 0.5, then more than half of the hostels have more than 30 rooms, so the average (typical) hostel has more than 30 rooms.
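To make the test concrete, here is a minimal sketch in plain Python using the counts from the question's table. It assumes the 0-30 bin can be read as "at most 30 rooms" (the bin edges in the table overlap, so this is a judgment call), and the helper names `binom_cdf` and `wilson_upper` are just illustrative. It computes the exact one-sided binomial p-value and the upper end of a one-sided 95% Wilson score interval for p.

```python
from math import comb, sqrt
from statistics import NormalDist

# Frequencies from the question's table; bins are 0-30, 30-60, ..., 150-180.
counts = [66, 28, 15, 9, 1, 1]
n = sum(counts)   # total number of hostels in the sample
k = counts[0]     # hostels in the 0-30 bin, read as "at most 30 rooms" (assumption)


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def wilson_upper(k: int, n: int, conf: float = 0.95) -> float:
    """Upper end of the one-sided Wilson score interval for a proportion."""
    z = NormalDist().inv_cdf(conf)
    p_hat = k / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre + half


# One-sided test of H0: p = 0.5 against Ha: p < 0.5.
# Small values of k support Ha, so the p-value is the lower tail P(X <= k).
p_value = binom_cdf(k, n, 0.5)
upper = wilson_upper(k, n)

print(f"n = {n}, k = {k}, p_hat = {k / n:.3f}")
print(f"exact one-sided p-value = {p_value:.3f}")
print(f"one-sided 95% Wilson upper bound for p = {upper:.3f}")
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```

With these data 66 of the 120 hostels fall in the first bin, which is more than half, so the lower-tail p-value is above 0.5 and H0 is not rejected. Equivalently, the Wilson upper bound for p stays above 0.5.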

Now you may think I have ignored the actual magnitude of the rooms per hostel, but that information is relevant to a different hypothesis:

H: what is the average number of rooms per hostel?

Contrast this with your hypothesis: does a hostel, on average, have more than 30 rooms? This is a categorical hypothesis.

To answer the second hypothesis you need to either assume something about the distribution of rooms per hostel or test it under a very pessimistic scenario, where each hostel has either 0 or 31 rooms. In the latter case you simulate the null distribution of the sample average and compare it to your actual sample value. If you reject at the 95% level then you are good to go.
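As a rough illustration of the "assume something about the distribution" route (not the simulation described above), here is a sketch of a one-sided large-sample z-test on the mean. It leans on two assumptions that are not in the data: each hostel is replaced by the midpoint of its bin, and the sample mean of 120 hostels is treated as approximately normal. The normal CDF used here is the Laplace integral function mentioned in the question, up to the usual shift of 0.5.

```python
from math import sqrt
from statistics import NormalDist

# Bin midpoints stand in for the unknown per-hostel room counts (assumption).
midpoints = [15, 45, 75, 105, 135, 165]
counts = [66, 28, 15, 9, 1, 1]

n = sum(counts)
# Grouped-data sample mean and (n - 1)-denominator sample variance.
mean = sum(m * f for m, f in zip(midpoints, counts)) / n
var = sum(f * (m - mean) ** 2 for m, f in zip(midpoints, counts)) / (n - 1)
se = sqrt(var / n)  # standard error of the sample mean

# H0: mu <= 30 against Ha: mu > 30; large z supports Ha.
mu0 = 30
z = (mean - mu0) / se
p_value = 1 - NormalDist().cdf(z)  # upper-tail probability under the normal approximation

print(f"grouped mean = {mean:.1f}, standard error = {se:.2f}")
print(f"z = {z:.2f}, one-sided p-value = {p_value:.4f}")
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```

With midpoints the grouped mean is 38.5 and z comes out near 2.9, so under these assumptions H0 is rejected at the 0.05 level. Note that this answers the question about the mean number of rooms, which is a different question from the proportion test above, so the two need not agree.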