Hello everyone, I am Julieta. This time I am stuck on the following exercise. It is a statistical analysis of polls; the statement is quite long, so I will try to keep it short and include some links.
Note: As with my other questions here, because this problem is so long I will give at least 50 points for an explanation of the problem. Also, I am sorry for my broken grammar; English is my second language and I am still improving it. :-)
The problem comes from a real statistical case that had social implications, and it is explained in detail in the following New York Times article:
http://www.nytimes.com/1994/04/11/us/probability-experts-may-decide-pennsylvania-vote.html
Data:
Year District DifferenceAbsentee DifferenceMachine
82 2 346 26427
82 4 282 15904
82 8 223 42448
84 1 593 19444
84 3 572 71797
84 5 -229 -1017
84 7 671 63406
86 2 293 15671
86 4 360 36276
86 8 306 36710
88 1 401 21848
88 3 378 65862
88 5 -829 -13194
88 7 394 56100
90 2 151 700
90 4 -349 11529
90 8 160 26047
92 1 1329 44425
92 3 368 45512
92 5 -434 -5700
92 7 391 51206
93 2 1025 -564
- There are several questions, and I will add my progress as I manage to reach each answer.
I ran SPSS on the given data and obtained the following output, which I believe can be used to answer all of the following questions:
(a) Find the p-value for the test H0 : β0 = 0 in your output. Explain (as you would to the judge in this case) what this number tells us. Interpret the results of the test in the context of the problem. SOLVED
My answer: P-value = 0.984. With this p-value we do not have enough evidence to reject H0 in favor of the alternative, so β0 is not significantly different from 0. In the context of the problem, this means that when there is no difference in machine votes (DifferenceMachine = 0), we cannot say that DifferenceAbsentee is different from 0.
(b) Dr. Ashenfelter found that “the difference between the Democratic and Republican tallies in the machine-based vote has been a good indicator of the difference between the two parties’ absentee vote.” Explain how he could draw this conclusion based on your regression output. SOLVED
My answer: I believe the relevant regression output is the table titled Model Summary. From my point of view, Dr. Ashenfelter relied on the p-value being significant, which is why the machine difference is a good indicator, as he said. Is my answer correct?
(b) The NY Times article states: “Assuming this relationship in the 21 previous elections had held in the most recent, Professor Ashenfelter estimates that the Republican’s 564-vote edge on the machines should have led to a 133-vote advantage in absentee ballots.” Explain (as Dr. Ashenfelter would explain to the judge), how one can come to this conclusion. SOLVED
My answer: I used the model y = b0 + b1*x, where b0 and b1 come from the model fitted with 21 data points rather than 22, and the answer is 133, as the statement says.
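For anyone who wants to check the 133 figure without SPSS, here is a minimal sketch of the least-squares fit on the 21 earlier elections (every row of the table except the 93 row). The function name fit_line and the variable names are my own, and I am assuming the differences are Democratic minus Republican, which is what the -564 machine margin in the 93 row suggests.

```python
# (DifferenceMachine, DifferenceAbsentee) for the 21 elections before 1993,
# copied from the data table in the question.
data = [
    (26427, 346), (15904, 282), (42448, 223), (19444, 593), (71797, 572),
    (-1017, -229), (63406, 671), (15671, 293), (36276, 360), (36710, 306),
    (21848, 401), (65862, 378), (-13194, -829), (56100, 394), (700, 151),
    (11529, -349), (26047, 160), (44425, 1329), (45512, 368), (-5700, -434),
    (51206, 391),
]


def fit_line(points):
    """Ordinary least squares for y = b0 + b1 * x (hypothetical helper)."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    b1 = sxy / sxx            # slope
    b0 = y_bar - b1 * x_bar   # intercept
    return b0, b1


b0, b1 = fit_line(data)

# Machine margin in the contested 1993 election (Republican edge of 564).
predicted = b0 + b1 * (-564)

print(f"intercept b0 = {b0:.2f}")
print(f"slope     b1 = {b1:.5f}")
print(f"predicted absentee margin at -564: {predicted:.1f}")
```

Under that sign convention the prediction should come out close to -133, that is, a Republican absentee advantage of about 133 votes, which is how the article words it.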
(c) In the contested election the voting machine margin was -564. The absentee ballot margin, however, was 1025. Use your regression model to make a statistical argument for why this observation is unusual. Assume for now that the contested election was fair. Derive a probability for observing an absentee ballot margin as large or larger than the one observed if the election was fair.
My answer: I think I need to compute P(Z > (1025 - 133)/sqrt(MSE)) = P(Z > 2.74), but that p-value is not even close to 0.06. Why?
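One possible way to set up this calculation is sketched below. It is only a sketch under my own assumptions: the differences are taken as Democratic minus Republican (so a Republican advantage is negative), the standard error is the usual one for predicting a single new observation, and the tail is taken from a normal approximation as in the attempt above (a t distribution with n - 2 degrees of freedom would be the more exact choice). I am not claiming it reproduces the 0.06 from the article, which may rest on a different comparison.

```python
from math import sqrt
from statistics import NormalDist

# (DifferenceMachine, DifferenceAbsentee) for the 21 elections before 1993.
data = [
    (26427, 346), (15904, 282), (42448, 223), (19444, 593), (71797, 572),
    (-1017, -229), (63406, 671), (15671, 293), (36276, 360), (36710, 306),
    (21848, 401), (65862, 378), (-13194, -829), (56100, 394), (700, 151),
    (11529, -349), (26047, 160), (44425, 1329), (45512, 368), (-5700, -434),
    (51206, 391),
]

n = len(data)
x_bar = sum(x for x, _ in data) / n
y_bar = sum(y for _, y in data) / n
sxx = sum((x - x_bar) ** 2 for x, _ in data)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in data)
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Residual standard error, sqrt(MSE), with n - 2 degrees of freedom.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in data)
s = sqrt(sse / (n - 2))

# Contested election: machine margin and observed absentee margin.
x0, y0 = -564, 1025
pred = b0 + b1 * x0

# Standard error for predicting one new observation at x0.
se_pred = s * sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / sxx)

# Standardized distance between the observed and predicted margins.
z = (y0 - pred) / se_pred

# Upper-tail probability under the normal approximation (assumption).
p_upper = 1 - NormalDist().cdf(z)

print(f"sqrt(MSE) = {s:.1f}, predicted = {pred:.1f}, se_pred = {se_pred:.1f}")
print(f"z = {z:.2f}, upper-tail probability = {p_upper:.4f}")
```

Note that the numerator here is 1025 minus the signed prediction, so the sign convention of the prediction changes the result.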
(d) Dr. Ashenfelter made an argument similar to the one you just made and came up with a p-value of 0.06 for the test that decides
H0 : the election was fair vs. Ha : there was fraud by the Democrats
The NY Times reporter interpreted this result as follows: “Putting it another way, if past elections are a reliable guide to current voting behavior, there is a 94 percent chance that irregularities in the absentee ballots, not chance alone, swung the election to the Democrat, Professor Ashenfelter concludes.” Critique the reporter’s interpretation of the p-value. If the interpretation is correct, explain why. If the interpretation is incorrect, provide a correct interpretation instead.
QUESTION: This is the last question left to answer, if someone knows a good explanation. I believe the journalist is correct, but I don't know how to verify this.
Conclusion: Please let me know if I need to improve or change something. I will keep my work updated and, as I said before, I will give points later. (I feel bad that you have to read such a long problem.)
THANKS AGAIN.


Point
b) The regression of absentee margin on machine margin is significant (we can see that by looking at the p-value for the slope in the regression model, p = 0.000). This means that the machine margin is a significant linear predictor for absentee margin. Further, the R² of the simple linear regression is 0.489. This means that 48.9% of the variation in absentee vote difference can be explained by regression on machine vote difference. The correlation between machine vote difference and absentee vote difference is 0.699.
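As a quick consistency check on those two numbers: in a simple linear regression with one predictor, R² is the square of the correlation r, so the reported values should agree.

```latex
R^2 = r^2 = (0.699)^2 \approx 0.489
```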
d) The interpretation is incorrect. The reporter says that 1 - p is the probability that there was fraud, implying that p is the probability that the null hypothesis is true. That's not the case. The p-value is the probability of observing an outcome at least as extreme (even fewer Republican absentee votes) if the election were, in fact, fair. That means that 1 - p is the probability of observing more Republican absentee votes than were observed in the contested election if the election were fair.
Let me know if you still want the other points.