Help: SPSS and data interpretation of votes, Republicans vs. Democrats (1993 election) (almost finished)


Hello everyone, I am Julieta, and this time I am stuck on the following exercise. It is a statistical analysis of polls; the statement is quite long, so I will try to keep it short and add some links.

Note: As with my other questions here, given the length of this problem I will award at least 50 points for an explanation. Also, I am sorry for my broken grammar; English is my second language and I am still improving it. :-)


The problem is a real statistical problem that had social implications, and it is explained in detail in the following NY Times article:

http://www.nytimes.com/1994/04/11/us/probability-experts-may-decide-pennsylvania-vote.html

Here is the summary: [image not available]

Data:

 Year   District    DifferenceAbsentee    DifferenceMachine
  82          2              346                    26427
  82          4              282                    15904
  82          8              223                    42448
  84          1              593                    19444
  84          3              572                    71797
  84          5             -229                    -1017
  84          7              671                    63406 
  86          2              293                    15671 
  86          4              360                    36276 
  86          8              306                    36710 
  88          1              401                    21848
  88          3              378                    65862
  88          5             -829                   -13194 
  88          7              394                    56100
  90          2              151                      700
  90          4             -349                    11529
  90          8              160                    26047
  92          1             1329                    44425
  92          3              368                    45512
  92          5             -434                    -5700 
  92          7              391                    51206
  93          2             1025                     -564

  • There are several questions, and I will add my progress as I manage to reach each answer.

I ran SPSS on the given data and obtained the following output, which I believe is enough to answer all of the following questions:

[image not available: SPSS regression output]

(a) Find the p-value for the test H0 : β0 = 0 in your output. Explain (as you would to the judge in this case) what this number tells us. Interpret the results of the test in the context of the problem. SOLVED

My answer: p-value = 0.984. With this p-value we do not have enough evidence to reject H0 in favor of the alternative, so β0 is not significantly different from 0. In the context of the problem, this means that if there is no difference in machine votes (DifferenceMachine = 0), then we cannot say that DifferenceAbsentee is different from 0.

(b) Dr. Ashenfelter found that “the difference between the Democratic and Republican tallies in the machine-based vote has been a good indicator of the difference between the two parties’ absentee vote.” Explain how he could draw this conclusion based on your regression output. SOLVED

My answer: I believe the regression output in question is the table titled Model Summary. From my point of view, Dr. Ashenfelter relied on the p-value being significant, which is why he called the machine vote a good indicator. Is my answer correct?

(b) The NY Times article states: “Assuming this relationship in the 21 previous elections had held in the most recent, Professor Ashenfelter estimates that the Republican’s 564-vote edge on the machines should have led to a 133-vote advantage in absentee ballots.” Explain (as Dr. Ashenfelter would explain to the judge), how one can come to this conclusion. SOLVED

My answer: I used the model y = b0 + b1*x, where b0 and b1 come from the model fitted with 21 data points rather than 22, and the answer is 133, as the statement said.

(c) In the contested election the voting machine margin was -564. The absentee ballot margin, however, was 1025. Use your regression model to make a statistical argument for why this observation is unusual. Assume for now that the contested election was fair. Derive a probability for observing an absentee ballot margin as large or larger than the one observed if the election was fair.

My answer: I think I need to compute P(Z > (1025 - 133)/sqrt(MSE)) = P(Z > 2.74), but the resulting p-value is not even close to 0.06. Why?
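One common textbook setup for part (c), sketched here under stated assumptions: normal errors, a line fitted to the 21 elections before 1993, and a one-sided test. It uses the standard error of a new observation rather than sqrt(MSE) alone, and the fitted value at x = -564, which comes out negative under the sign convention of the data table. The helper `t_upper_tail` is a hypothetical name introduced only for this illustration, and nothing here is claimed to be how Dr. Ashenfelter obtained 0.06.

```python
import math

# (DifferenceMachine, DifferenceAbsentee) for the 21 elections before 1993,
# copied from the data table above.
elections = [
    (26427, 346), (15904, 282), (42448, 223), (19444, 593), (71797, 572),
    (-1017, -229), (63406, 671), (15671, 293), (36276, 360), (36710, 306),
    (21848, 401), (65862, 378), (-13194, -829), (56100, 394), (700, 151),
    (11529, -349), (26047, 160), (44425, 1329), (45512, 368), (-5700, -434),
    (51206, 391),
]

n = len(elections)
mean_x = sum(x for x, _ in elections) / n
mean_y = sum(y for _, y in elections) / n
sxx = sum((x - mean_x) ** 2 for x, _ in elections)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in elections)

# Least-squares line y = b0 + b1 * x
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in elections)
resid_sd = math.sqrt(sse / (n - 2))  # this is sqrt(MSE)

# Contested 1993 election
x_new, y_new = -564, 1025
predicted = b0 + b1 * x_new

# Standard error for predicting a single new observation at x_new
se_pred = resid_sd * math.sqrt(1 + 1 / n + (x_new - mean_x) ** 2 / sxx)
t_stat = (y_new - predicted) / se_pred


def t_upper_tail(t, df, steps=20000):
    """P(T > t) for Student's t with t > 0, via Simpson's rule on the density."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def density(u):
        return c * (1 + u * u / df) ** (-(df + 1) / 2)

    h = t / steps  # steps must be even for Simpson's rule
    total = density(0.0) + density(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    return 0.5 - total * h / 3


p_upper = t_upper_tail(t_stat, n - 2)
print(f"predicted = {predicted:.1f}, se_pred = {se_pred:.1f}")
print(f"t = {t_stat:.2f}, one-sided p = {p_upper:.4f}")
```

The statistic is compared with a t distribution on n - 2 = 19 degrees of freedom rather than a standard normal, because the error variance is itself estimated from the data.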

(d) Dr. Ashenfelter made an argument similar to the one you just made and came up with a p-value of 0.06 for the test that decides

 H0 : the election was fair vs. Ha : there was fraud by the democrats

The NY Times reporter interpreted this result as follows: “Putting it another way, if past elections are a reliable guide to current voting behavior, there is a 94 percent chance that irregularities in the absentee ballots, not chance alone, swung the election to the Democrat, Professor Ashenfelter concludes.” Critique the reporter’s interpretation of the p-value. If the interpretation is correct, explain why. If the interpretation is incorrect, provide a correct interpretation instead.

QUESTION: This is the last question left to answer, if someone knows a good explanation. I believe the journalist is correct, but I don't know how to verify this.

Conclusion: Please let me know if I need to improve or change something. I will keep my work updated and, as I said before, I will award points in the future. (I feel bad that you had to read such a long problem.)

THANKS AGAIN.



There are 2 best solutions below

0
On BEST ANSWER

Point

b) The regression of absentee margin on machine margin is significant (we can see that by looking at the p-value for the slope in the regression model, p = 0.000). This means that the machine margin is a significant linear predictor of the absentee margin. Further, the R^2 of the simple linear regression is 0.489. This means that 48.9% of the variation in the absentee vote difference can be explained by regression on the machine vote difference. The correlation between the machine vote difference and the absentee vote difference is 0.699.

d) The interpretation is incorrect. The reporter says that 1 - p is the probability that there was fraud, implying that p is the probability that the null hypothesis is true. That's not the case. The p-value is the probability of observing a more extreme outcome (even fewer Republican absentee votes) if the election were in fact fair. That means that 1 - p is the probability of observing more Republican absentee votes than were observed in the contested election if the election were fair.

Let me know if you still want the other points.
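For anyone who wants to check the R^2 and correlation quoted in b) without SPSS, here is a minimal sketch in plain Python. It assumes the fit uses the 21 elections before 1993 from the question's data table (the contested 1993 row is left out); the variable names are illustrative only.

```python
import math

# (DifferenceMachine, DifferenceAbsentee) for the 21 elections before 1993,
# copied from the data table in the question.
elections = [
    (26427, 346), (15904, 282), (42448, 223), (19444, 593), (71797, 572),
    (-1017, -229), (63406, 671), (15671, 293), (36276, 360), (36710, 306),
    (21848, 401), (65862, 378), (-13194, -829), (56100, 394), (700, 151),
    (11529, -349), (26047, 160), (44425, 1329), (45512, 368), (-5700, -434),
    (51206, 391),
]

n = len(elections)
mean_x = sum(x for x, _ in elections) / n
mean_y = sum(y for _, y in elections) / n
sxx = sum((x - mean_x) ** 2 for x, _ in elections)
syy = sum((y - mean_y) ** 2 for _, y in elections)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in elections)

# Least-squares slope and intercept
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Pearson correlation and coefficient of determination
r = sxy / math.sqrt(sxx * syy)
r2 = r ** 2

# t statistic for H0: slope = 0, on n - 2 degrees of freedom
sse = syy - slope * sxy
resid_sd = math.sqrt(sse / (n - 2))
t_slope = slope / (resid_sd / math.sqrt(sxx))

print(f"slope = {slope:.5f}, intercept = {intercept:.1f}")
print(f"r = {r:.3f}, R^2 = {r2:.3f}, t for slope = {t_slope:.2f}")
```

In simple linear regression R^2 is just the square of the correlation, so the two numbers in b) carry the same information.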

0
On

I do not believe the reporter's interpretation of the p-value is correct. The p-value of 0.06 in this case means that if the election were fair, there is a 6% chance that the results would be as extreme as they were. This is not the same thing as saying there is a 6% chance that the election was fair. You also have to take into consideration the likelihood in general of having fraudulent ballots.

Let me give an example. Imagine there is a rare disease that affects one in a billion people. I have a device that can test whether or not a person has this disease. However, the device is not perfect. Even among people who don't actually have the disease, the device will mistakenly test positive 10% of the time. Now imagine I test a random person with this device and he tests positive. Would I say that since the device only errs 10% of the time, therefore there is a 90% chance that this person has the disease? Of course not. While it's true that a person without the disease is unlikely to test positive, it's a lot less likely that he actually has the disease.

The same argument can be made here. Just because we got a result that is only 6% likely given that the election was fair, that does not mean there is only a 6% chance that the election was fair. (Even if all elections were fair, you would expect to get this result about 1 out of 16 times!) It depends on how likely it is in general to have a fraudulent election.

To calculate the probability that the election actually contained fraud given the results we received, it would probably be more accurate to use Bayes' Theorem. For this, we would need some way of estimating the general likelihood of having a fraudulent election (based on prior knowledge, studies, etc.), as well as the likelihood of getting the results we got given that there was fraud in the election.
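As a rough illustration of that Bayes' Theorem calculation, here is a small Python sketch applied to the disease example above. The prevalence (one in a billion) and the 10% false-positive rate come from the example; the assumption that the device always detects a true case (sensitivity of 1.0) is mine, added only to make the arithmetic concrete. The function name `posterior` is hypothetical.

```python
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """P(hypothesis | evidence) by Bayes' theorem."""
    numerator = p_evidence_if_true * prior
    return numerator / (numerator + p_evidence_if_false * (1 - prior))


prevalence = 1e-9           # one in a billion people has the disease
false_positive_rate = 0.10  # healthy people test positive 10% of the time
sensitivity = 1.0           # ASSUMPTION: a sick person always tests positive

p_disease = posterior(prevalence, sensitivity, false_positive_rate)
print(f"P(disease | positive test) = {p_disease:.2e}")
```

The same function would apply to the election question, but only after supplying a prior probability of fraud and the probability of the observed result under fraud, neither of which is given in the problem.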

In conclusion, the p-value of 0.06 does not mean that there is a 94% chance that there was election fraud.