Simple COVID Probability Question


This is a relatively simple question that I have been struggling to answer. As a thought experiment, I was wondering what the approximate probability of catching COVID would be if one were to host a 15-person house gathering. As a reference point, suppose approximately 3% of the U.S. population either currently has the virus or has come into contact with it, and 15 people (approximately 4 independent families) share a confined space. How would one calculate the probability of one person catching the virus at a 15-person get-together?

Also, more broadly, what is the general structure of a statistical problem like this one? Is there a method or formulaic approach for answering questions of this kind? (I'm new to statistics and learning more each day!)


There are 2 best solutions below

0
On

You need to know the probability $P_v$ of catching the virus when you are in contact with someone who has it. The probability that none of the 15 people has the virus (assuming independence, which is questionable since they come in family groups) is $P_0 = 0.97^{15} \approx 0.63$, so your infection probability is $P_v(1-P_0)$.

An additional complication is that $P_v$ depends somewhat on how many of the people present are infected. In this scenario, however, the effect is small.
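As a rough illustration of the formula $P_v(1-P_0)$ above, here is a minimal Python sketch. The function names and the value chosen for `p_transmit` (standing in for $P_v$) are assumptions made for the example only; the 3% prevalence and the group size of 15 come from the question, and independence between attendees is assumed throughout.

```python
def prob_no_one_infected(prevalence: float, n_people: int) -> float:
    """P_0: probability that none of n_people is infected,
    assuming each person is infected independently with the same prevalence."""
    return (1.0 - prevalence) ** n_people


def gathering_infection_probability(
    prevalence: float, n_people: int, p_transmit: float
) -> float:
    """P_v * (1 - P_0): chance of catching the virus at the gathering.

    p_transmit plays the role of P_v, the probability of catching the virus
    given contact with an infected person. Its value is an assumption here.
    """
    p_zero = prob_no_one_infected(prevalence, n_people)
    return p_transmit * (1.0 - p_zero)


if __name__ == "__main__":
    prevalence = 0.03   # from the question
    n_people = 15       # from the question
    p_transmit = 0.2    # hypothetical value of P_v, for illustration only

    print("P_0 =", prob_no_one_infected(prevalence, n_people))
    print("risk =", gathering_infection_probability(prevalence, n_people, p_transmit))
```

Because $P_v$ enters as a simple multiplier, the result scales linearly with whatever transmission probability you believe is realistic.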

0
On

There are a few things to consider:

  • The epidemic reproduction rates: the basic reproduction rate, $R_0$, and the effective reproduction rate, $R = R_0 x$, where $x$ is the fraction of the population that is susceptible
  • The infection probability distribution as a function of age. These distributions are available from WHO, US CDC and other national health bodies.
  • The infection probability distribution due to co-morbidity factors. These distributions are available from various clinical research studies. There are many co-morbidity factors and the research is evolving; examples include diabetes, hypertension, and respiratory diseases.
  • Grouping of family units and the distribution of member ages within family groups. If you set a maximum number of families (unspecified in your case) and a maximum attendee count (15 in your case), you can calculate the combinations of different family unit sizes. This is essentially a partitioning-with-constraints subproblem, which can be solved using textbook counting methods.
  • Proportion of attendees who have recovered from the disease
  • The probability distribution of encountering individuals who have already recovered from the disease. There are two theories you could operate under: (a) recovered individuals do not get infected again, or (b) recovered individuals lose immunity after a period $t$. This research is also evolving and there is no conclusive evidence yet, but the current belief is scenario (a).
  • The probability distribution of disease recurrence in recovered individuals. This applies if you want to factor in scenario (b) above.
  • The probability distribution of asymptomatic individuals who don't show symptoms but are carriers of the disease. Some data is available.
  • The disease thresholds: herd immunity and the herd immunity threshold
  • The probability distribution of disease immunity in a population, i.e., people who may never contract the disease. This distribution is not currently known and will become known only once a large number of studies have been done.
  • Overall disease prevalence relative to population in the geographic area under consideration. This data varies from country to country and state to state, and is available from the local/national health bodies. This matters because the infection curves are believed to be in different stages in different countries (e.g., Singapore vs. India vs. China vs. New Zealand vs. the US).
  • Detectability of the disease: how accurate a negative test result is, and the probability that an individual with a false negative result attends the event
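To make the family-unit bullet above more concrete, here is a minimal Python sketch of the "partitioning with constraints" count. It counts only the possible family sizes (for example 6 + 4 + 3 + 2), ignoring who belongs to which family and the age distribution. The function names are hypothetical, and the choice of 4 families is taken from the question rather than from this answer.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_partitions(total: int, parts: int) -> int:
    """Number of ways to split `total` attendees into exactly `parts`
    non-empty family units, where only the unit sizes matter."""
    if parts == 0:
        return 1 if total == 0 else 0
    if total < parts:
        return 0
    # Either at least one unit has size 1 (remove that unit), or every
    # unit has size >= 2 (remove one person from each unit).
    return count_partitions(total - 1, parts - 1) + count_partitions(total - parts, parts)


def count_partitions_at_most(total: int, max_parts: int) -> int:
    """Number of ways to split `total` attendees into at most `max_parts` units."""
    return sum(count_partitions(total, k) for k in range(1, max_parts + 1))


if __name__ == "__main__":
    print(count_partitions(15, 4))          # exactly 4 families
    print(count_partitions_at_most(15, 4))  # between 1 and 4 families
```

Each size combination could then be weighted by how likely it is, which is where the age and household-size distributions would come in.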

There are probably other factors that you want to consider as well in your model. Since the disease is still evolving, we may have to operate with some uncertainty or incompleteness in the model.

The general technique involves determining the probability distribution for the various events and then combining them using tools such as:

  • Inclusion-Exclusion principle and counting
  • Bayes' theorem for conditional probability
  • Bernoulli trials and Poisson sampling

References:

Rothman KJ, Lash T, Greenland S. Modern Epidemiology (3rd ed.), Lippincott Williams & Wilkins, 2013

Coggon D, Rose G, Barker DJP. Epidemiology for the uninitiated, fourth edition, The BMJ. URL: https://www.bmj.com/about-bmj/resources-readers/publications/epidemiology-uninitiated