Distribute a population according to given fractions using the random number generator drand48()


I have a population of, say, 5000 people. Every person belongs to exactly one of the categories A, B, C or D. I want to split the population according to fractions provided by the user, for example 99% A, 0.4% B, 0.3% C and 0.3% D (total = 100%). I just want to confirm whether my approach is correct. I use random numbers to do this. Let

fa = fraction of A, fb = fraction of B, fc = fraction of C and fd = fraction of D.

My algorithm is as follows. I write a function:

 function( returns a category type A,B,C or D)
 {
   double r = drand48();  // gives me random number between 0 and 1.0 (uniform dist)
   if( r < fa )
      return A;
   if( fa < r < (fa+fb) )
      return B;
   if( (fa+fb) < r < (fa+fb+fc) )
      return C;
   if( (fa+fb+fc) < r < (fa+fb+fc+fd) )
      return D;
 }

I just want to make sure that the population is distributed correctly according to the fraction given for each type.

I am not sure which category to tag this as (sorry for the trouble).


There are 2 best solutions below

1
On BEST ANSWER

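In principle your code is right, except that there is a minuscule probability that r equals a boundary such as fa+fb exactly, in which case none of the ifs triggers. You can drop the "lower bound < r" part of each condition anyway, because the earlier ifs have already ruled those cases out. (A chained comparison such as fa < r < (fa+fb) may also not be valid in your programming language of choice at all; in C it compiles, but it is evaluated as (fa < r) < (fa+fb), which is not what you mean.) If it is given that fa+fb+fc+fd equals 1, you can also just return D if the third if fails. So much for potential coding problems.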
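To make the coding remarks concrete, here is a minimal C sketch of the selection logic with cumulative thresholds and D as the fall-through case. The names `category_from_uniform`, `random_category` and the `CAT_*` enum are made up for illustration, and the sketch assumes fa+fb+fc+fd equals 1, so fd never needs to be passed in.

```c
#define _XOPEN_SOURCE 600  /* exposes drand48() under strict -std=c99 on glibc */
#include <assert.h>
#include <stdlib.h>

/* Hypothetical names, chosen for this sketch only. */
enum category { CAT_A, CAT_B, CAT_C, CAT_D };

/* Map a uniform value r in [0, 1) to a category using cumulative thresholds.
 * Assumes fa, fb, fc >= 0 and fa + fb + fc <= 1; D takes the remainder. */
enum category category_from_uniform(double r, double fa, double fb, double fc)
{
    assert(r >= 0.0 && r < 1.0);
    if (r < fa)
        return CAT_A;
    if (r < fa + fb)          /* r >= fa is already known here */
        return CAT_B;
    if (r < fa + fb + fc)     /* r >= fa + fb is already known here */
        return CAT_C;
    return CAT_D;             /* everything left over, boundary values included */
}

/* Draw one category; drand48() returns a double in [0.0, 1.0). */
enum category random_category(double fa, double fb, double fc)
{
    return category_from_uniform(drand48(), fa, fb, fc);
}
```

Splitting off `category_from_uniform` is only a convenience: it lets you check the thresholds with fixed values of r before plugging in the random number generator. Remember to seed the generator (e.g. with `srand48`) if you want different populations on different runs.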

As for the distribution: are you aware that the resulting population will not match the given proportions exactly, but will instead look like a sample from an infinite population in which those proportions hold? For example, with a size of $5000$ and a fraction of $0.003$ for D, the actual number of Ds produced need not be $15$; it can easily turn out to be, e.g., $11$ or $19$.
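To put a rough number on that spread: assuming the draws are independent, the number of Ds, call it $N_D$ (a symbol introduced here only for illustration), is binomially distributed with $n = 5000$ and $p = 0.003$, so

$$\mathbb{E}[N_D] = np = 15, \qquad \sigma_{N_D} = \sqrt{np(1-p)} = \sqrt{14.955} \approx 3.87 .$$

Counts such as $11$ or $19$ are therefore only about one standard deviation away from the mean.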

1
On

Your algorithm gives the requested distribution only on average; it doesn't guarantee it. I guess the following is enough: set $a$, $b$, $c$, $d$ to the number of elements wanted in each category. For each person, generate a random number $x$ uniformly between 0 and 1. If $x \le a / (a + b + c + d)$, add the person to group $A$ and decrease $a$ by one; otherwise do the same for the other groups, testing $x$ against the cumulative thresholds $(a + b) / (a + b + c + d)$ and so on. This ensures that a group that is full won't get more members. My random-fu isn't up to proving that this will select a distribution uniformly, however.
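Here is a minimal C sketch of that quota idea, in case it helps to see it spelled out. The names `draw_with_quota`, `assign_population` and `NUM_CATEGORIES` are made up for illustration, and the sketch assumes the target counts have already been rounded to integers that sum to the population size (for the question's example: 4950, 20, 15 and 15).

```c
#define _XOPEN_SOURCE 600  /* exposes drand48() under strict -std=c99 on glibc */
#include <assert.h>
#include <stdlib.h>

#define NUM_CATEGORIES 4   /* A, B, C, D as indices 0..3 */

/* Pick a category with probability proportional to its remaining quota,
 * given a uniform x in [0, 1), and decrement that quota.
 * A category whose quota has reached zero can no longer be chosen. */
int draw_with_quota(double x, long remaining[NUM_CATEGORIES])
{
    long total = 0;
    for (int i = 0; i < NUM_CATEGORIES; i++)
        total += remaining[i];
    assert(total > 0);                 /* at least one slot must be left */

    double target = x * (double)total; /* position among the remaining slots */
    long cumulative = 0;
    int last_nonempty = -1;
    for (int i = 0; i < NUM_CATEGORIES; i++) {
        if (remaining[i] == 0)
            continue;
        last_nonempty = i;
        cumulative += remaining[i];
        if (target < (double)cumulative) {
            remaining[i]--;
            return i;
        }
    }
    /* Only reachable through floating-point rounding at the top end. */
    remaining[last_nonempty]--;
    return last_nonempty;
}

/* Label n people; quota[] is consumed in place and must sum to at least n. */
void assign_population(long quota[NUM_CATEGORIES], int *labels, long n)
{
    for (long k = 0; k < n; k++)
        labels[k] = draw_with_quota(drand48(), quota);
}
```

After `assign_population` has run with quotas summing to exactly n, every quota is zero, so the final counts match the requested ones exactly; only the order in which the categories appear is random.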