What is the relation between the infection rate of a virus and the probability that it will die out?

53 Views Asked by At

Problem

Consider the following scenario:

A virus has broken out. Initially, one person is infected. Each day, each infected person recovers but infects $k$ other people, where $k$ is sampled from a geometric distribution with a parameter $p$ and domain $\{0,1,2,...\}$. What is the relation between $p$ and the number of days after which the virus dies out?

In this case, I am particularly interested in scenarios with $p < 0.5$, as in such scenarios the virus tends to not die out. This leads to interesting results.

Mathematics

This process can be formulated as a sequence of integers $\{n_1,n_2,...\}$, where each $n_i$ denotes the number of infected people on the $i$th day. Then, the process is specified by:

$$n_1=1$$ $$n_i \sim \textrm{NegativeBinomial}(n_{i-1},p) \textrm{ for } i>1$$

The negative binomial distribution can be used as it is the sum of iid geometric distributions.

Results

I ran a simulation (code below). In the case that $p < 0.5$, a pattern emerges: the virus is likely to either die out in the first few iterations or not at all. This plot shows the trajectories of the simulations (crosses indicating the virus died out):

enter image description here

Then, I plotted the distribution of the number of iterations after the virus died out for 10000 runs with each a maximum of 100 iterations. The right spike results from the runs where the virus does not naturally die out and is instead automatically halted.

enter image description here

Finally, I was curious about the relation between the value of $r$ and the relative frequency where the virus did not die out. I plotted the survival rate for various values of $p$ and with 2000 runs (each with a maximum of 100 iterations) each:

enter image description here

Code

My bare-bones (without plots) Python simulation code is:

import numpy as np
from scipy.stats import nbinom
    
RUNS = 2000
RUN_ITERS = 100
NR_MAX = 6e10
p=0.4

for _ in range(RUNS):
   nr=1
   nrs=[nr]
   for i in range(RUN_ITERS):
      #nbinom is the sum of iid geometric distributions
      nr = nbinom.rvs(nr,p)
      nrs.append(nr)
      nr = min(nr,NR_MAX) #Avoid overflowing
      if (nr==0): break #Terminate if noone left
   nrs_values.append(nrs)

Questions

I have multiple questions:

  1. Is the probability distribution over the number of iterations before termination a valid distribution? Can the domain of a probability distribution contain 'infinity'?
  2. Does the probability that a virus does not die out converge to a number? Does an algebraic formula for this number exist?

I can't find any resources online that deal with such a problem. This question considers the scenario with a limited number of people, but I am interested in the scenario with unlimited people. This question deals with the scenario where the number of infected people does not diverge.

1

There are 1 best solutions below

2
On BEST ANSWER

Basically you are correct. If $p<0.5$ i.e. if the expectation is at least $1$, then (iff) there is non-zero ppty that the infection wont end. This is a well known type of processes, called (linear) Galton-Watson process (with geometric distribution). There are many articles about that, many lectures notes... I send some here, but maybe googleing will get you more comfortable notes. The result is unfortunately not pretty number as you would like, at least not for geometric distribution if I am not mistaken. Hope that helps :)

https://www.researchgate.net/publication/38350532_The_age_of_a_Galton-Watson_population_with_a_geometric_offspring_distribution

https://www.uni-muenster.de/Stochastik/lehre/WS1011/SpezielleStochastischeProzesse/Ch_1.pdf