- A box has 4000 red, 5000 blue and 1000 orange balls. A selection of 70 balls is made, with 25 reds, 35 blues, and 10 oranges being observed. Can one essentially prove that the selection was NOT a simple random sample with replacement from the box?
I have already found the expected values and compared them to the observed values. I tried to find the standard error so that I could formulate a z-test to analyze the probability of each of these outcomes actually happening. However, I do not know whether that is the correct course of action. I am really confused and do not know how to proceed past finding the expected values and comparing them to the observed values.
- A 6-sided die is rolled 90 times and 20 two’s come up.
a. State the null and alternative hypotheses when you’re trying to see if you can essentially prove that the die has more than a 1/6 chance of coming up two.
b. Construct your test statistic and state its distribution when the null hypothesis is true.
c. Conduct your test to see if you have essentially proven that the die has more than a 1/6 chance of coming up two.
For this, I have attempted to construct a z-test as well. However, I do not know how to find the standard error for the sample without being given the standard deviation of the sample. The goal is to determine whether the die is loaded or not and I am not sure what the threshold should be to determine that for sure. Thank you all for your help and I look forward to hearing your suggestions.
You appear to have made some progress towards both of these elementary problems. I will give some extensive hints, but not formulate the answers exactly as prescribed, leaving that part to you. Even though both of these problems have a goodness-of-fit 'flavor', they really are separate problems. In future postings on this site, you will probably get more enthusiastic help if you put one problem per post.
(1) In the language of the chi-squared goodness-of-fit (GOF) test, you have observed 25, 35 and 10 balls in categories with respective expected counts 28, 35, and 7 (which you say you have already found). Then the chi-squared GOF statistic is $(25-28)^2/28 + (35-35)^2/35 + (10-7)^2/7 \approx 1.607$, which does not exceed 5.991, the 95th percentile of the chi-squared distribution with 2 degrees of freedom. So at the 5% level one cannot reject the null hypothesis that the data fit the model of simple random sampling with replacement.
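If you want to check the arithmetic in (1), here is a minimal sketch in plain Python. The variable names are my own, and the closed-form tail $e^{-x/2}$ is used only because it happens to be exact for 2 degrees of freedom; for other degrees of freedom you would need statistical software.

```python
import math

observed = [25, 35, 10]        # red, blue, orange counts in the selection
box = [4000, 5000, 1000]       # composition of the box
n = sum(observed)              # 70 balls selected

# Expected counts under sampling with replacement: n times the box proportion
expected = [n * b / sum(box) for b in box]

# Pearson GOF statistic: sum over categories of (O - E)^2 / E
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Special case df = 2: the upper tail of Chisq(2) is exp(-x/2),
# so the 95th percentile is -2 * ln(0.05)
p_value = math.exp(-chi_sq / 2)
critical_5pct = -2 * math.log(0.05)

print("expected counts:", expected)
print("GOF statistic:", round(chi_sq, 3))
print("P-value:", round(p_value, 3))
print("5% critical value:", round(critical_5pct, 3))
print("reject at 5%?", chi_sq > critical_5pct)
```

The P-value is far above 5%, which is the same conclusion as comparing the statistic with the critical value.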
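(2) One can do this directly using the binomial distribution. Under the null hypothesis, let $X \sim Binom(90, 1/6)$, so $E(X) = 90/6 = 15$ and $V(X) = 90(1/6)(5/6) = 15(5/6) = 12.5,$ so $SD(X) = \sqrt{12.5} \approx 3.536$. (Here you see that you do in fact have a standard deviation: it comes from the null hypothesis, not from the sample.)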
What is the probability of seeing 20 or more twos? Direct computation with software gives 0.1043. The normal approximation with continuity correction would give $1 - \Phi((19.5 - 15)/3.536) \approx 0.1016$ from printed normal tables. In either case the probability is around 10%, so seeing 20 or more twos would not be surprising for a fair die.
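Both tail probabilities can be reproduced without special software. This is only a sketch using Python's standard library; `std_normal_cdf` is a helper name I made up, built on `math.erf`.

```python
import math

n, p, k = 90, 1 / 6, 20        # rolls, null probability of a two, observed twos

# Exact upper tail P(X >= 20) for X ~ Binom(90, 1/6)
exact_tail = sum(
    math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1)
)

mu = n * p                          # mean under the null, 15
sd = math.sqrt(n * p * (1 - p))     # SD under the null, sqrt(12.5)

def std_normal_cdf(z):
    """Standard normal CDF via the error function (hypothetical helper)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Normal approximation with continuity correction: P(X >= 20) ~ P(Z > (19.5 - mu)/sd)
approx_tail = 1 - std_normal_cdf((k - 0.5 - mu) / sd)

print("exact binomial tail:", round(exact_tail, 4))
print("normal approximation:", round(approx_tail, 4))
```

Either number is the one-sided P-value for the test in part (c), to be compared with whatever significance level you choose.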
Notes: (a) By the Empirical Rule, about 95% of the probability in $Binom(90, 1/6)$ is concentrated in the interval $\mu \pm 2\sigma$ or about $15 \pm 7$, which includes 20. (b) Question (2) could also have been answered using a GOF test for the categories Twos and Non-Twos, using the distribution $Chisq(df=1),$ but without a continuity correction.
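Both notes can be checked numerically. The sketch below is my own illustration, not a prescribed method: it sums the exact binomial probabilities over the integers inside $\mu \pm 2\sigma$, and it computes the two-category GOF statistic, using the fact that the upper tail of $Chisq(df=1)$ at $x$ equals `erfc(sqrt(x/2))`.

```python
import math

n, p, twos = 90, 1 / 6, 20
mu = n * p
sd = math.sqrt(n * p * (1 - p))

# (a) Exact probability that X ~ Binom(90, 1/6) lands within mu +/- 2 sd
lo, hi = math.ceil(mu - 2 * sd), math.floor(mu + 2 * sd)
prob_within = sum(
    math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(lo, hi + 1)
)

# (b) GOF test on two categories: Twos and Non-Twos
observed = [twos, n - twos]
expected = [n * p, n * (1 - p)]
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Upper tail of Chisq(1); note that this P-value is two-sided
p_two_sided = math.erfc(math.sqrt(chi_sq / 2))

print("integers within 2 SD:", lo, "to", hi)
print("probability within 2 SD:", round(prob_within, 3))
print("GOF statistic (df = 1):", round(chi_sq, 3))
print("two-sided P-value:", round(p_two_sided, 4))
```

The GOF statistic here is just the square of the uncorrected z-statistic $(20 - 15)/3.536$, so the test does not distinguish "too many twos" from "too few"; halving the P-value recovers the one-sided version without continuity correction.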
The figure below shows (1) the density of $Chisq(df=2)$ with the observed value 1.61 of the GOF statistic (red) and the critical value 5.99 (dashed) for a test at level 5%, and (2) the PDF (bars) of $Binom(90, 1/6)$ with the observed number 20 of twos (red), and the critical value (dashed) for a test at level 5%. The approximating normal PDF (blue curve) is also shown.