How do I determine the probability that a sample is a stratified random distribution?

I might not be familiar enough with math terminology to correctly word this question. I think I can better describe it through example.

I have a list of random samples. Each sample has 3 "1" elements that are "randomly" distributed throughout the list.

1: [0, 0, 1, 1, 0, 0, 0, 1, 0]

2: [1, 0, 0, 0, 0, 1, 0, 1, 0]

3: [1, 0, 0, 1, 0, 0, 1, 0, 0]

I suspect that the randomness of each list is "stratified": that each list is divided into thirds and one "1" element is randomly placed within each third.
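To make the suspected pattern concrete, here is a minimal sketch that checks whether a list has exactly one "1" in each third. The helper name `is_stratified` and its `strata` parameter are made up for illustration.

```python
def is_stratified(sample, strata=3):
    """Return True if every equal-sized stratum of `sample` holds exactly one 1.

    `is_stratified` is a hypothetical helper; it assumes the list length
    is a multiple of `strata`.
    """
    if len(sample) % strata != 0:
        raise ValueError("sample length must be divisible by the number of strata")
    size = len(sample) // strata
    return all(
        sum(sample[i * size:(i + 1) * size]) == 1
        for i in range(strata)
    )


# The three example lists from the question.
samples = [
    [0, 0, 1, 1, 0, 0, 0, 1, 0],
    [1, 0, 0, 0, 0, 1, 0, 1, 0],
    [1, 0, 0, 1, 0, 0, 1, 0, 0],
]
results = [is_stratified(s) for s in samples]
print(results)  # [True, True, True]
```

All three example lists fit the pattern, which is what prompts the question.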

I imagine there is a way to determine with X degree of confidence that the distribution is stratified and that confidence would increase the larger the sample size.

What is the way to solve for that?

On BEST ANSWER

Edit, see below:

There are two things going on that suggest a manipulated sample. There is exactly a one third to two thirds ratio and each subgroup has this exact ratio also.

One way of looking at this is as a hypothesis test at, say, the 95% confidence level. The null hypothesis would be that this sample is no different from a random draw of $1$s and $0$s from a population of one third $1$s and two thirds $0$s.

For a sample size of $27$, we have $3$ groups of $9$, each with three $1$s and six $0$s. For a single group of $9$, the probability of getting exactly three $1$s is $p = \binom{9}{3}\cdot \left(\frac{1}{3}\right)^3\cdot \left(\frac{2}{3}\right)^6 = .273129$. Repeated $3$ times for the three groups of $9$, the probability is $p^3 = .273129^{3} = .020375$.

As $.020375 < .05$, we reject the null hypothesis and conclude that this sample is statistically significantly different from a random sample.

For this kind of test, the more groups of nine there are, the smaller the p-value and the greater the confidence that the sample was manipulated.
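As a rough check of the arithmetic in this binomial argument, the following sketch recomputes the single-group probability and its cube. The function name `prob_exact_ones` is an illustrative assumption; the parameters mirror the numbers used above ($9$ elements, three $1$s, population proportion $\frac{1}{3}$).

```python
from math import comb


def prob_exact_ones(n=9, k=3, p_one=1 / 3):
    """Binomial probability of exactly k ones among n independent draws,
    each being a 1 with probability p_one (hypothetical helper)."""
    return comb(n, k) * p_one**k * (1 - p_one) ** (n - k)


p_single = prob_exact_ones()       # one group of 9 with exactly three 1s
p_three_groups = p_single ** 3     # three independent groups of 9

print(round(p_single, 6), round(p_three_groups, 6))
print(p_three_groups < 0.05)       # compare against the 5% threshold
```

Raising `p_single` to a higher power models more groups of nine, so the product keeps shrinking as groups are added.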

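Edit: Assuming the three ones are placed randomly among the $9$ elements, the probability of getting exactly one in each third for a single group of $9$ is $$p = \frac{3^3}{\binom{9}{3}} = \frac{27}{84} = \frac{9}{28}$$ At the $95\%$ confidence level, one would need $x$ groups of $9$ with this distribution, where $\left(\frac{9}{28}\right)^x < .05$. Since $\left(\frac{9}{28}\right)^2 \approx .1033$ and $\left(\frac{9}{28}\right)^3 \approx .0332$, the smallest such value is $x = 3$. (At the $99\%$ level, where the threshold is $.01$, it would be $x = 5$.)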
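The counting argument can be checked by brute force. The sketch below enumerates every way to place three ones among nine positions, counts the placements with one in each third, and then searches for the smallest $x$ satisfying the inequality. The names `stratified_fraction` and `min_groups` are invented for this illustration.

```python
from fractions import Fraction
from itertools import combinations


def stratified_fraction(n=9, strata=3):
    """Fraction of placements of `strata` ones among n positions that put
    exactly one 1 in each equal-sized stratum (hypothetical helper)."""
    size = n // strata
    total = 0
    hits = 0
    for positions in combinations(range(n), strata):
        total += 1
        # A placement is stratified when the ones fall in distinct thirds.
        if {pos // size for pos in positions} == set(range(strata)):
            hits += 1
    return Fraction(hits, total)


def min_groups(p, alpha):
    """Smallest positive integer x with p**x < alpha (hypothetical helper)."""
    x = 1
    while p**x >= alpha:
        x += 1
    return x


p_strat = stratified_fraction()
x_needed = min_groups(p_strat, Fraction(5, 100))
print(p_strat, x_needed)
```

Using `Fraction` keeps the probability exact, so the comparison against the threshold does not depend on floating-point rounding.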