Combinatorial proof of the expected minimum of IID uniform random variables (first order statistic)


Question: Suppose that $X_1, \ldots, X_n$ are IID uniform random variables in $[0, 1]$. Prove that $\mathbb{E}[ \min(X_1, \ldots, X_n) ] = \frac{1}{n+1}$. Can you prove it with a "combinatorial proof" (e.g., using a symmetry argument or similar)?

Context: I am aware of a simple proof that first derives the CDF of $\min(X_1, \ldots, X_n)$ and then integrates to obtain its expectation. While simple, I would like to find a proof of the fact without using any calculus (or calculations at all, if possible), as such proofs tend to generalize more easily.


There are 3 best solutions below

BEST ANSWER

This is a special case of Bayes' billiards. Suppose you throw $n$ (white) balls independently and uniformly at random on $[0,1]$ to get $X_1,\dots,X_n$, and then throw one more (red) ball, again independently and uniformly at random. For any given outcome $X_1,\dots,X_n$, the probability that the red ball lands to the left of all the white ones is precisely $\min(X_1,\dots,X_n)$, so averaged over all random trials this probability is $E[\min(X_1,\dots,X_n)]$. But you have just thrown $n+1$ balls i.i.d. on the unit interval. The probability that any particular one of them lands to the left of all the others is the same no matter which ball you consider, and is therefore $1/(n+1)$.
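The argument is easy to sanity-check numerically. The sketch below (my own illustration, not part of the answer) estimates both $E[\min(X_1,\dots,X_n)]$ directly and the probability that the extra red ball lands leftmost; both should be close to $1/(n+1)$.

```python
import random

def estimate_min_and_red_leftmost(n, trials, seed=0):
    """Monte Carlo check of the Bayes billiards argument.

    Estimates E[min(X_1,...,X_n)] directly, and also the probability
    that an extra (n+1)-th "red" ball lands left of the other n balls.
    Both should be close to 1/(n+1).
    """
    rng = random.Random(seed)
    min_sum = 0.0
    red_leftmost = 0
    for _ in range(trials):
        xs = [rng.random() for _ in range(n)]
        min_sum += min(xs)
        red = rng.random()          # the extra "red" ball
        if red < min(xs):
            red_leftmost += 1
    return min_sum / trials, red_leftmost / trials

n = 4
est_min, est_red = estimate_min_and_red_leftmost(n, 200_000)
print(est_min, est_red, 1 / (n + 1))  # both estimates should be near 0.2
```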


You can also do this geometrically. Conditional on the coordinates coming out in a particular order, the distribution of $(X_1, \dots, X_n)$ is uniform on one of the $n!$ simplices into which the orderings of the coordinates divide the $n$-dimensional hypercube. Over each of these $n$-simplices, the expectation of $\min(X_1,\dots,X_n)$ equals the volume of an $(n+1)$-dimensional simplex (the one with the original $n$-simplex as base and height over each point equal to $\min(X_1,\dots,X_n)=X_i$, where $i$ is the index of the coordinate that is minimal on that particular $n$-simplex) divided by the $n$-dimensional volume of the base. Any such $(n+1)$-simplex has height $1$, so the conditional expectation is $1/(n+1)$, irrespective of the ordering of $X_1,\dots,X_n$. Thus the overall expectation of $\min(X_1,\dots,X_n)$ is $1/(n+1)$ as well.


Here the extra dimension plays a role similar to the extra red ball in Bayes' billiards; with a little extra effort, the two approaches can be made to correspond exactly: the orderings of $X_1, \dots, X_n, X_{\text{red}}$ partition the $(n+1)$-dimensional hypercube into $(n+1)!$ congruent simplices, and we are computing the total volume of the $n!$ of them on which $X_{\text{red}}$ is the smallest coordinate, which is $1/(n+1)$ of the whole.

ANSWER

The points $X_1,\dots,X_n$ split $[0,1]$ into $n+1$ pieces (almost surely); their lengths are called spacings. These spacings are identically distributed$^*$ and add up to $1$, so the expectation of each of them is $1/(n+1)$.


$^*$This is hard to argue rigorously without computation. But maybe the following argument is convincing.

Let $X_{(1)}<\dots < X_{(n)}$ be the ordered values. Conditionally on $X_{(1)}$, the numbers $X_{(2)},\dots,X_{(n)}$ are the ordered values of $n-1$ points chosen uniformly at random from $[X_{(1)},1]$. Similarly, conditionally on $X_{(n)}$, the numbers $X_{(1)},\dots,X_{(n-1)}$ are the ordered values of $n-1$ points chosen uniformly at random from $[0,X_{(n)}]$. The lengths of these two intervals, $1-X_{(1)}$ and $X_{(n)}$, have the same distribution thanks to symmetry, so the last $n$ spacings, taken jointly, have the same distribution as the first $n$ spacings. In particular, any two consecutive spacings have the same distribution.
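A quick Monte Carlo sketch (my own illustration, under the setup above) supports the claim: the empirical mean of each of the $n+1$ spacings comes out near $1/(n+1)$.

```python
import random

def mean_spacings(n, trials, seed=1):
    """Estimate E[I_j] for each of the n+1 spacings cut out of [0,1]
    by n i.i.d. Uniform(0,1) points; every mean should be near 1/(n+1)."""
    rng = random.Random(seed)
    totals = [0.0] * (n + 1)
    for _ in range(trials):
        xs = sorted(rng.random() for _ in range(n))
        pts = [0.0] + xs + [1.0]          # endpoints plus the ordered sample
        for j in range(n + 1):
            totals[j] += pts[j + 1] - pts[j]
    return [t / trials for t in totals]

print(mean_spacings(3, 100_000))  # each entry near 1/4
```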

ANSWER

Here's an attempt to expand on the "symmetry" argument:

  • Given any random sample of $X_1, ..., X_n$, let $(X_{(1)}, ..., X_{(n)})$ be the ordered set, and let $I_j$ be the random variable representing the length of the $j^{th}$ interval: $$I_j = \begin{cases} X_{(j)}, & j = 1 \\ X_{(j)} - X_{(j-1)}, & 1 < j \le n \\ 1 - X_{(j-1)}, & j = n+1. \end{cases}$$

Then we can rewrite $E[\min(X_1, ..., X_n)] = E[I_1]$.

$\underline{\text{Claim}}$: $E[I_1] = E[I_2] = \dots = E[I_{n+1}]$, with the corollary that since $I_1 + \dots + I_{n+1} = 1$, the intervals have common expected length $E[I_j] = \frac{1}{n+1}$, and in particular, $\boxed{E[I_1] = \frac{1}{n+1}}$.

$\underline{\text{Proof (?)}}$: Consider any realized sample $x_1, \dots, x_n$ of the random variables $X_1, \dots, X_n$. This partitions $[0, 1]$ into $n+1$ ordered segments with lengths $I_j = i_j$ as defined above. Now consider the set of all $(n+1)!$ permutations of the values $(i_1, \dots, i_{n+1})$; each permutation $(i_{\pi_k(1)}, \dots, i_{\pi_k(n+1)})$ yields another, equally likely, ordered partition of $[0, 1]$, i.e. another sample of $X_1, \dots, X_n$. Call this set of $(n+1)!$ samples/partitions $A$. Within this set, each value $i_k$ occupies the $j^{th}$ slot in exactly $n!$ of the permutations, so the expected length of the $j^{th}$ interval for any $j$ is $$E[I_j\mid A] = \sum_{k = 1}^{n+1} \frac{n!}{(n+1)!} \, i_k = \frac{n!}{(n+1)!}\left(i_1 + \dots + i_{n+1}\right) = \frac{1}{n+1}.$$

All that's left is to argue that $E[I_j] = \frac{1}{n+1}$ holds more generally over the whole sample space $[0,1]^n$. But the "permutation sets" $A_i$ described above should partition the sample space (equivalence relation: $(a_1, \dots, a_{n+1}) \sim (b_1, \dots, b_{n+1})$ if $a$ is a permutation of $b$), so by the law of iterated expectations, $$E[I_j] = \sum P(A_i) E[I_j \mid A_i] = \sum P(A_i) \frac{1}{n+1} = \frac{1}{n+1} \sum P(A_i) = \frac{1}{n+1}.$$
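The within-class computation can be verified exactly with rational arithmetic. The sketch below (my own illustration; the particular spacing vector is arbitrary) enumerates all $(n+1)!$ permutations of one spacing vector and confirms that the average of the $j^{th}$ entry is $1/(n+1)$ for every $j$.

```python
from fractions import Fraction
from itertools import permutations

# Fix one spacing vector (i_1, ..., i_{n+1}) summing to 1, enumerate all
# (n+1)! equally likely permutations, and average the j-th entry for each j.
spacings = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]  # n = 3
n_plus_1 = len(spacings)

perms = list(permutations(spacings))          # all (n+1)! arrangements
for j in range(n_plus_1):
    avg = sum(p[j] for p in perms) / len(perms)
    assert avg == Fraction(1, n_plus_1)       # E[I_j | A] = 1/(n+1) exactly
print("conditional expectation of every spacing:", Fraction(1, n_plus_1))
```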

  • $\underline{\text{Disclaimer}}$: I would appreciate it if someone with more measure-theoretic knowledge could verify/rigorize this; my background is limited. However, this logic should at least carry over to a discrete setting. Suppose we were sampling uniformly from the discrete set $\{0, 1/k, 2/k, \dots, 1\}$ for some integer $k$ and defined the variables $I_j$ in the same way.

Now, it's certainly more likely that the sample of points is $\{1/k, 2/k, 3/k, 4/k, 5/k\}$ than $\{1/k, 1/k, 1/k, 1/k, 1/k\}$, since the former has more distinct permutations. However, that doesn't affect the core argument that inside each "permutation set" the expected lengths of the intervals are all equal; all we need is that the sample space is partitioned, so that the probabilities $P(A_i)$ sum to $1$. (One caveat: when a spacing vector contains zeros, i.e. when some sampled values coincide, its distinct permutations are not all equally likely, so the within-set symmetry can fail; in the continuous case such ties occur with probability zero, so they do not affect the expectation.)
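A tie-free discrete analogue of this symmetry does hold exactly: if $n$ distinct values are drawn uniformly without replacement from $\{1,\dots,N\}$, then $E[\min] = \frac{N+1}{n+1}$, by essentially the same "$n+1$ equal gaps" argument. A brute-force check (this variant is my own illustration, not part of the argument above):

```python
from fractions import Fraction
from itertools import combinations

def expected_min_without_replacement(N, n):
    """Exact E[min] when n distinct values are drawn uniformly from {1,...,N}.

    This tie-free discrete analogue satisfies E[min] = (N+1)/(n+1).
    """
    subsets = list(combinations(range(1, N + 1), n))  # all equally likely samples
    return Fraction(sum(min(s) for s in subsets), len(subsets))

for N, n in [(10, 3), (7, 2), (12, 5)]:
    assert expected_min_without_replacement(N, n) == Fraction(N + 1, n + 1)
print("E[min] = (N+1)/(n+1) verified for the sampled cases")
```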