Two dependent uniform random variables

36 Views Asked by At

I have the following (very simple) probability problem which I cannot solve (I suspect that this question has already been asked, but I searched Google and Approach0 and could not find any such question, nor in the "similar questions" suggested list that appears when authoring a new question – please forgive me if this is a duplicate). The question is this

Let $n \in \mathbb{N}$, and $P \sim U(n)$ and $Q \sim U(P)$ be two random variables. What is the probability that $Q = q$ for some $q \in [n]$.

I thought the best way to solve this would be using the law of total probability;

$$ \begin{align} \Pr[Q = q] &= \sum_{1 \leq p \leq n} \Pr[Q = q | P = p] \times \Pr[P = p] \\ &= \sum_{q \leq p \leq n} \Pr[Q = q | P = p] \times \Pr[P = p] \\ &= \sum_{q \leq p \leq n} \frac{1}{p} \times \frac{1}{n} \end{align} $$

and I think this working is correct (to exclude all the terms in the sum for which $p$ is less than $q$ as if $q$ is greater than the chosen $p$ then the probability of obtaining it is zero) and the random variables are uniform. However, this seems like a very messy result, and I though that therefore my answer would be wrong.

I ran a simulation (this code)

import random

buckets = {}

for i in range(1, 100_000):
  x = random.randrange(1, 25_000)
  y = random.randrange(1, x+1)
  buckets[y] = buckets.get(y, 0) + 1;

for (key, value) in buckets.items():
  buckets[key] = value / 25_000


and the percentage of the time that $1$ appeared was 0.00164 whereas Wolfram states that

sum_(q=1)^25000 1/(25000 q)≈0.0004281546707447414931420002048228646545868

so my answer seems to be (maybe) an order of magnitude off.

Any help would be really appreciated.

2

There are 2 best solutions below

0
On BEST ANSWER

Your math is correct, but you have bugs in your Python code. You meant to write the code below, which gives results agreeing with Wolfram.

import random

buckets = {}

for i in range(1, 100_000):
    x = random.randrange(1, 25_001)
    y = random.randrange(1, x+1)
    buckets[y] = buckets.get(y, 0) + 1;

for (key, value) in buckets.items():
    buckets[key] = value / 100_000
0
On

Your correct $\Pr[Q = q]= \sum_{q \leq p \leq n} \frac{1}{p} \times \frac{1}{n}=\frac1n(H_n-H_{q-1}$) using harmonic numbers.

$H_n\approx \log_e(n)+\gamma+ \frac{1}{2n}$ with $\gamma \approx 0.5772156649$. For $q=1$, you would have $H_{q-1}=0$

so with $n=25000$ you would get $\Pr[Q = 1]\approx \frac{10.7}{25000}=0.00428.$