Why aren't these two random processes the same? How can I extend this to longer arrays?


I want to create a random array of probabilities (all combinations equally likely). In the case of an array of length two, I can do the following (code in Python):

import numpy as np

results = []
for _ in range(100000):
    results.append(np.random.random())
print(np.mean(results), np.std(results))
Output: 0.5010405458900358 0.2889116594183084

In this particular case, the variable assigned to the 1st element of the array has a mean of 0.5 and a standard deviation of around 0.288 (as expected for a uniform variable on [0, 1], whose standard deviation is 1/sqrt(12) ≈ 0.2887). I can then assign the 2nd element of the array simply as 1 minus the 1st, so that the two sum to 1.
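For concreteness, here is a minimal sketch of the length-2 construction I describe above (variable names are my own):

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw the 1st element uniformly on [0, 1); the 2nd element is
# forced to be 1 minus the 1st, so the pair sums to 1.
u = rng.random()
pair = np.array([u, 1.0 - u])
print(pair)  # two probabilities summing (up to floating point) to 1
```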

However, I don't know how to do this for arrays of greater length. I tried drawing random numbers from a uniform distribution (as many as the length of the array) and dividing them by their sum. But that doesn't work. An example:

import numpy as np

results = []
for _ in range(100000):
    arr = np.random.random(2)
    results.append(arr[0] / np.sum(arr))
print(np.mean(results), np.std(results))
Output: 0.49934817136203025 0.23833927694747173

Clearly, the two processes are not equivalent. Why? Why is the standard deviation lower in the 2nd case? (Perhaps dividing by the sum of two uniform variables alters the distribution.) And how can I extend the first approach (first code example) to arrays of greater length?
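To check my suspicion that dividing by the sum changes the distribution, this sketch (my own comparison, not from any reference) estimates both standard deviations side by side:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Process 1: a plain uniform draw for the 1st element.
direct = rng.random(n)

# Process 2: the 1st of two uniforms, divided by their sum.
arr = rng.random((n, 2))
ratio = arr[:, 0] / arr.sum(axis=1)

# Both have mean 0.5, but the ratio concentrates around 0.5,
# giving a smaller standard deviation.
print(direct.std())  # close to 1/sqrt(12), i.e. about 0.2887
print(ratio.std())   # noticeably smaller, about 0.238
```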

Thank you everyone.