Proportion hypothesis test Coronavirus

I've been asked to do a hypothesis test with some Coronavirus data.

I'm selecting Perú as the country. The data I'm using comes from this website:

https://coronavirus.jhu.edu/map.html

COVID-19 Perú data, April 6

a. From the confirmed cases with the virus, find the proportion of deaths. Identify the success, $n$, $X$, $\hat{p}$.

success = a confirmed case that ended in death

$X=83$

$n=2{,}281$

$\hat{p}=\frac{83}{2{,}281}\approx 0.04$

b. From the confirmed cases with the virus, find the proportion of recovered. Identify the success, $n$, $X$, $\hat{p}$.

success = a confirmed case that recovered

$X=989$

$n=2{,}281$

$\hat{p}=\frac{989}{2{,}281}\approx 0.43$

c. Find a 95% confidence interval for the proportion of deaths among the confirmed cases with the virus. Interpret.

$\hat{p}\pm1.96\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}=0.0364\pm1.96\sqrt{\frac{0.0364(1-0.0364)}{2{,}281}}$

$(0.03,0.04)$

With 95% confidence, the proportion of deaths among the confirmed cases in Perú is between 0.03 and 0.04.
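As a quick check of the arithmetic in part c, here is a minimal Python sketch of the normal-approximation (Wald) interval, using the unrounded $\hat{p}=83/2281$ in both the center and the standard error. The function name `wald_interval` is just an illustrative choice, and $z=1.96$ is the usual approximation for a 95% level.

```python
import math

def wald_interval(x, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion.

    x: number of successes, n: sample size,
    z: critical value (1.96 is the usual approximation for 95%).
    """
    p_hat = x / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # estimated standard error of p_hat
    return p_hat, p_hat - z * se, p_hat + z * se

# Counts quoted in the question (Perú, April 6): 83 deaths out of 2281 confirmed.
p_hat, lower, upper = wald_interval(83, 2281)
print(f"p_hat = {p_hat:.4f}, 95% CI = ({lower:.4f}, {upper:.4f})")
```

Rounded to two decimals, the bounds agree with the interval $(0.03, 0.04)$.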

d. Is there evidence that the majority of confirmed cases with the virus recover? Complete all steps of the hypothesis test.

My question is whether my null and alternative hypotheses should be:

$H_0: p=0.43$ and $H_1: p>0.43$

But I think that 0.43 is my $\hat{p}$ here. My $p$ should be related to the claim, which I don't see here. The only instructions I have are to visit the website to get the data and answer these questions.

Please help.

There are 2 best solutions below

There's a significant flaw in the problem statement, mainly because the data lag behind the real outcomes. You therefore cannot use such a procedure for a dynamically developing system. However, some techniques, such as using concavity, did seem to provide reasonable results. You might also consider fitting to the daily changes; a log function was very useful for that.

Also, you need to separate the data according to each health system's response, i.e. fit an independent model for each country. A couple of months ago I derived a calculation from first principles for a certain region, and it turned out to match perfectly. But when I tried to fit other countries, the estimated parameters changed.

Another thing to note is that the procedure for obtaining the data is not consistent: the amount of testing and, in fact, the confirmation procedure itself were not consistent even within the same region.

0
On

I take it there are currently $2281$ confirmed cases with $83$ deaths and $989$ recovered, these $83$ and $989$ being included in the $2281$. The remaining $2281 - 83 - 989 = 1209$ cases are presumably active: eventually these people will either recover or die, and we don't really know how many will do either of those. But it's certainly misleading to take the probability of death as $83/2281$. You could say that the proportion of deaths among the cases that have been resolved is $83/(83+989)$. Of course that is somewhat misleading too, because of selection bias: severe cases (and in particular deaths) are much more likely to be tested than mild cases.
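To make the comparison concrete, here is a small Python sketch that computes the number of active cases and the two death proportions discussed above, using only the counts from the question. The variable names are illustrative, and neither ratio should be read as an estimate of the true fatality rate, for the reasons already given.

```python
# Counts quoted in the question (Perú, April 6).
confirmed = 2281
deaths = 83
recovered = 989

# Cases that are neither dead nor recovered yet (outcome unknown).
active = confirmed - deaths - recovered

# Proportion of deaths among all confirmed cases (treats active cases as survivors).
naive_proportion = deaths / confirmed

# Proportion of deaths among resolved cases only (deaths + recoveries).
resolved_proportion = deaths / (deaths + recovered)

print(f"active cases: {active}")
print(f"deaths / confirmed: {naive_proportion:.4f}")
print(f"deaths / resolved:  {resolved_proportion:.4f}")
```

The second ratio is roughly twice the first, which shows how much the answer depends on how the unresolved cases are handled.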