how many unique numbers will appear in simulation?

Question

how many unique numbers will appear in simulation?

89 Views Asked by Bumbble Comm At 09 Apr 2026 - 5:11

I am from ukraine and I start taking course on the computer programming. My classmate tells me about this website for interesting discussion and information about mathematics.

Today we learn about problem where there is 100 unique numbers. We pick a number, record this number and put this number back ... then repeat this process 100 times (i think it is called sampling with replacement).

We are interested in knowing on the average, for this experiment: how many unique numbers we will get, how many non-unique numbers we will get and how many numbers will be unselected.

Here is the python code we were given for this experiment - this experiment is now repeated 500 times :

import numpy as np

# Set the number of numbers, the number of samples, and the number of repetitions
n = 100
k = 100
reps = 500

# Initialize arrays to store the results
n_unique_arr = np.zeros(reps)
n_non_unique_arr = np.zeros(reps)
n_unselected_arr = np.zeros(reps)

# Repeat the sampling process reps times
for i in range(reps):
    # Generate k random samples with replacement from the sequence 1:n
    samples = np.random.choice(range(1, n+1), k, replace=True)
   
    # Calculate the number of unique numbers in the sample
    n_unique = len(np.unique(samples))
   
    # Calculate the number of non-unique numbers in the sample
    n_non_unique = k - n_unique
   
    # Calculate the number of unselected numbers
    n_unselected = n - n_unique
   
    # Store the results
    n_unique_arr[i] = n_unique
    n_non_unique_arr[i] = n_non_unique
    n_unselected_arr[i] = n_unselected

# Calculate the average number of unique, non-unique, and unselected numbers
avg_n_unique = np.mean(n_unique_arr)
avg_n_non_unique = np.mean(n_non_unique_arr)
avg_n_unselected = np.mean(n_unselected_arr)

# Print the results
print("Average number of unique numbers:", avg_n_unique)
print("Average number of non-unique numbers:", avg_n_non_unique)
print("Average number of unselected numbers:", avg_n_unselected)

Average number of unique numbers: 63.37
Average number of non-unique numbers: 36.63
Average number of unselected numbers: 36.63

In general, is there some mathematics formula to understand where this numbers come from? How I can understand how to calculate these numbers in general without simulation results? Maybe there is some mathematics formula to determine this?

Original Q&A

There are 1 best solutions below

**Bumbble Comm** · Accepted Answer

Suppose we pick $n$ times uniformly randomly from among $n$ numbers (in our particular case, $n = 100$). The of times a given number is picked follows a binomial distribution, $\operatorname{B}(n, p)$, where in our case $p = \frac{1}{n}$: Explicitly, the probability that a given number is picked exactly $k$ times $$\Bbb P(k) = {n \choose k} \left(\frac{1}{n}\right)^k \left(1 - \frac{1}{n}\right)^{n - k} = {n \choose k} \frac{(n - 1)^{n - k}}{n^{n}} .$$ For $k = 0$, we have $$\Bbb P(0) = \left(1 - \frac{1}{n}\right)^n ,$$ and for $n = 100$, $$\Bbb P(0) = \left(1 - \frac{1}{100}\right)^{100}\approx 0.366032\ldots .$$ So, we expect that $100 \Bbb P(0) = \boxed{36.6032\ldots}$ of the $100$ numbers are not picked, in close agreement with your numerical simulation. Notice that for large $n$, $$\Bbb P(0) \approx \lim_{n \to \infty} \left(1 - \frac{1}{n}\right)^n = \frac{1}{e} = 0.367879\ldots,$$ which for $n = 100$ carries a relative error of about $1$ part in $200$. Asymptotically, $\left(1 - \frac{1}{n}\right)^n - \frac{1}{e} \in O\left(\frac{1}{n}\right)$.

Correspondingly, we expect that any given number has a probability $\Bbb P(k > 0) = 1 - \Bbb P(0) = 1 - \left(1 - \frac{1}{n}\right)^{n}$ of being picked at least once. For $n = 100$ this is $$\Bbb P(k > 0) = 0.633967\ldots,$$ so we expect that about $\boxed{63.3967\ldots}$ of the $100$ numbers are picked at least once. Again for large $n$, the probability that a given number is picked at least once is $\Bbb P(k > 0) \approx 1 - \frac{1}{e}$.

The probability that a chosen number is picked exactly once is $$\Bbb P(1) = \left(1 - \frac{1}{n}\right)^{n - 1}.$$ For $n = 100$, we have $$\Bbb P(1) = 0.369729\ldots ,$$ just slightly greater (by a factor of $\frac{100}{99}$) than the probability that a chosen number is not picked at all, hence we expect that $\boxed{36.9729\ldots}$ of the $100$ numbers will be picked once, again in close agreement with the simulation. Again for large $n$ we can approximate this probability by $$\lim_{n \to \infty} \left(1 - \frac{1}{100}\right)^{99} \approx \left(1 - \frac{1}{100}\right)^{100} \approx \frac{1}{e} .$$

When $n$ is large and $p$ is small, and the binomial distribution is well-approximated by the Poisson distribution $\operatorname{Pois}(np)$ of mean $\lambda := n p$, which is somewhat less cumbersome than the binomial distribution. This approximation yields $$\Bbb P(k) \approx \frac{e^{-\lambda} \lambda^k}{k!},$$ and for our case, wherein $\lambda = np = n \left(\frac{1}{n}\right) = 1$, this approximation simplifies to $$\Bbb P(k) \approx \frac{1}{k! e} .$$ Asymptotically, $\Bbb P(k) - \frac{1}{k! e} \in O\left(\frac{1}{n}\right)$. In particular, we see that $\Bbb P(k)$ depends only weakly on $n$ once $n$ is large.

So, for example, a given number has a probability $\Bbb P(2) \approx \frac{1}{2e} = 0.183939\ldots$ of being chosen exactly $2$ times, so we expect that $18.3939\ldots$ of the $100$ numbers to be chosen exactly $2$ times.

how many unique numbers will appear in simulation?

There are 1 best solutions below

Related Questions in PROBABILITY

Related Questions in POISSON-DISTRIBUTION

Related Questions in BINOMIAL-DISTRIBUTION

Related Questions in SIMULATION

Related Questions in DERANGEMENTS

Trending Questions

Popular # Hahtags

Popular Questions