Explain the following behavior


I generated a list of 8000000 distinct random integers in Python, drawn from 1 up to (but not including) 100000000000000.

import random

random_numbers = random.sample(range(1, 100000000000000), 8000000)

# Initialize ten counters to zero, one per possible leading digit (0-9)
counters = [0] * 10

for item in random_numbers:
    first_digit = str(item)[0]  # first (most significant) digit, as a string
    counters[int(first_digit)] += 1  # increment the counter for that digit

for i, c in enumerate(counters):
    print("Number of elements starting with %d is %d" % (i, c))

My result was

Number of elements starting with 0 is 0
Number of elements starting with 1 is 890472
Number of elements starting with 2 is 889404
Number of elements starting with 3 is 887623
Number of elements starting with 4 is 888416
Number of elements starting with 5 is 889126
Number of elements starting with 6 is 888614
Number of elements starting with 7 is 889199
Number of elements starting with 8 is 888683
Number of elements starting with 9 is 888463

This seems a little odd, because it does not follow Benford's law. Could anyone explain why?

Thanks in advance!

Best answer

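You are effectively looking at numbers in one decade: about 90% of the integers below 100000000000000 have 14 digits, so a uniform sample is dominated by the top order of magnitude. Benford's law applies to numbers that are spread across many decades.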

From the Wikipedia article:

It tends to be most accurate when values are distributed across multiple orders of magnitude.
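
The "one decade" point can be checked with a little arithmetic. The sketch below computes what share of the integers in range(1, 10**14) has each digit length; the helper name share_with_digits is made up for this illustration and is not part of the question's code.

```python
# Share of the integers 1 .. 10**14 - 1 that have a given number of digits.
upper = 10**14
total = upper - 1  # how many integers are in range(1, 10**14)

def share_with_digits(n_digits):
    # Integers with exactly n_digits digits run from 10**(n-1) to 10**n - 1,
    # which is 9 * 10**(n-1) values.
    count = 9 * 10**(n_digits - 1)
    return count / total

for n in (14, 13, 12):
    print(f"{n} digits: {share_with_digits(n):.4%}")
```

Since nearly all samples share the same digit length, the leading digit is close to uniform over 1-9, which matches the roughly equal counts in the question.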

Furthermore, your data was generated to have a uniform distribution. This does not represent a natural, scale-independent distribution of numbers.
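
For contrast, here is a minimal sketch of a sample that is spread evenly across orders of magnitude. It assumes a log-uniform distribution (exponent drawn uniformly from 0 to 14), which is just one convenient scale-independent choice and not something taken from the question; the function name leading_digit_shares and the fixed seed are likewise only for illustration.

```python
import math
import random

def leading_digit_shares(values):
    # Fraction of values whose decimal representation starts with each digit 0-9.
    counts = [0] * 10
    for v in values:
        counts[int(str(v)[0])] += 1
    n = len(values)
    return [c / n for c in counts]

rng = random.Random(0)  # fixed seed so the run is repeatable

# Log-uniform sample: pick the exponent uniformly, then take 10**exponent.
# Truncating to int keeps the leading digit unchanged.
samples = [int(10 ** rng.uniform(0, 14)) for _ in range(200_000)]

shares = leading_digit_shares(samples)
for d in range(1, 10):
    benford = math.log10(1 + 1 / d)  # Benford's predicted share for digit d
    print("digit %d: observed %.4f, Benford %.4f" % (d, shares[d], benford))
```

With this kind of sample the observed shares should sit close to the Benford values, with digit 1 the most common and digit 9 the least, unlike the flat counts produced by random.sample over a plain range.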