Continuous uniform distribution - Proving set size


I'm aware that the title might be a bit off; I'm unsure how to describe this.

For $n\in \mathbb{N}$, define $n+1$ independent random variables $X_0, \ldots , X_n$, each uniformly distributed over the interval $[0,1]$. We focus on this set: $$S=\{X_i \mid i\geq 1 ,\ X_i<X_0\}$$ For all $0\leq k\leq n$, show that: $$P(|S|=k)=\int_0^1 {n\choose k}\cdot x^k\cdot (1-x)^{n-k}\,dx$$

I've reduced this to the following: $$P(|S|=k)=P(\text{exactly } k \text{ of } X_1,\ldots,X_n \text{ are smaller than } X_0)$$ Due to independence, we can write: $$=P(X_i<X_0)\cdot \ \ldots \ \cdot P(X_{i+k}<X_0) , \quad i\in\{1, \ldots ,n-k\}$$

I'm stuck here: I can't find how to calculate $P(X_i<X_0)$, which prevents me from proving the statement.

I'm aware that ${n\choose k}$ appears because we're 'checking' every $k$-sized group out of the $n$ available RVs, that $x^k$ corresponds to $P(X_i<X_0)$ for each RV in the group, and that $(1-x)^{n-k}$ 'disables' the other RVs from being smaller than $X_0$.


There are 2 best solutions below


The random variables $Y_\ell=\mathbf{1}_{[0,X_0)}(X_\ell),\,1\leq \ell \leq n$ indicate if $X_\ell<X_0$. Then by total expectation: $$P(|S|=k)=E\bigg[P\bigg(\sum_{1\leq \ell \leq n}Y_\ell=k\bigg|X_0\bigg)\bigg]=\int_{[0,1]}\underbrace{\binom{n}{k}x^{k}(1-x)^{n-k}}_{P(\sum_{1\leq \ell \leq n}Y_\ell=k|X_0=x)}dx$$


To see that $P(\sum_{1\leq \ell \leq n}Y_\ell=k|X_0=x)=\binom{n}{k}x^{k}(1-x)^{n-k}$, recall that if $W,Z$ are independent, then $E[f(W,Z)|W]=E[f(w,Z)]|_{w=W}$ for admissible $f$ (a standard property of conditional expectation). In our case, $Z=(X_1,\ldots,X_n)$, $W=X_0$ and $f(w,z)=\mathbf{1}_{\{k\}}(\sum_{1\leq \ell \leq n}\mathbf{1}_{[0,w)}(z_\ell))$. So $$E\bigg[\mathbf{1}_{\{k\}}\bigg(\sum_{1\leq \ell \leq n}\mathbf{1}_{[0,x)}(X_\ell)\bigg)\bigg]=P\bigg(\sum_{1\leq \ell \leq n}\mathbf{1}_{[0,x)}(X_\ell)=k\bigg)=\binom{n}{k}x^k(1-x)^{n-k}$$ because the $\mathbf{1}_{[0,x)}(X_\ell)$ are independent Bernoulli rvs with probability of success $x$.
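As a side check that goes beyond what the question asks, the integral can be evaluated in closed form. Assuming the standard Beta integral $\int_0^1 x^{a}(1-x)^{b}\,dx=\frac{a!\,b!}{(a+b+1)!}$ for non-negative integers $a,b$, taking $a=k$ and $b=n-k$ gives: $$\int_0^1\binom{n}{k}x^{k}(1-x)^{n-k}\,dx=\frac{n!}{k!\,(n-k)!}\cdot\frac{k!\,(n-k)!}{(n+1)!}=\frac{n!}{(n+1)!}=\frac{1}{n+1}$$ So $|S|$ would be uniform on $\{0,1,\ldots,n\}$, which matches the symmetry intuition that $X_0$ is equally likely to occupy any of the $n+1$ ranks among $X_0,\ldots,X_n$.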


This is the answer I've been able to work out, using the law of total probability: $$ \mathbb{P}(X\in A)=\int_{-\infty}^{\infty}\mathbb{P}(X\in A |Y=y)f_Y(y)dy$$

Since $X_0$ only takes values in $[0,1]$, the integral only needs to run over $[0,1]$.

Therefore, we're looking to find $\int_{0}^{1}\mathbb{P}(|S|=k|X_0=x)\cdot f_{X_0}(x)dx$

Since the RVs are independent, we're interested in $(\mathbb{P}(X_0>X_i))^k\cdot (\mathbb{P}(X_0\leq X_j))^{n-k}$, where the specific indices $i,j$ don't matter, since all the RVs have the same PDF and CDF.

Since $\mathbb{P}$ is calculated using the CDF, and we need to account for every $k$-sized subset of $X_1, \ldots , X_n$, we multiply the above by ${n\choose k}$ to get $${n\choose k}\cdot (\mathbb{P}(X_0>X_i))^k\cdot (\mathbb{P}(X_0\leq X_j))^{n-k}$$
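The conditional step can be checked numerically. The following is only a rough sketch, assuming $X_0$ is held fixed at a value `x`; the function names `binomial_pmf` and `simulate_conditional` are made up for illustration. It compares the binomial expression with a simulation of $X_1,\ldots,X_n$.

```python
import math
import random


def binomial_pmf(n: int, k: int, x: float) -> float:
    """C(n, k) * x^k * (1 - x)^(n - k): the claimed value of P(|S| = k | X_0 = x)."""
    return math.comb(n, k) * x**k * (1 - x) ** (n - k)


def simulate_conditional(n: int, k: int, x: float,
                         trials: int = 100_000, seed: int = 0) -> float:
    """Estimate P(|S| = k | X_0 = x) by sampling X_1..X_n with X_0 fixed at x."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = 0
    for _ in range(trials):
        # Count how many of the n uniform draws fall below the fixed value x.
        count = sum(1 for _ in range(n) if rng.random() < x)
        if count == k:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    n, k, x = 5, 2, 0.3
    print("formula   :", binomial_pmf(n, k, x))
    print("simulation:", simulate_conditional(n, k, x))
```

With enough trials the two printed numbers should be close, which is what one expects if each $X_i$ falls below `x` independently with probability `x`.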

Conditioning on $X_0=x_0$ and weighting by the density of $X_0$, we can rewrite this product as:$${n\choose k}\cdot (\mathbb{P}(x_0>X_i))^k\cdot (\mathbb{P}(x_0\leq X_j))^{n-k}\cdot f_{X_0}(x_0)$$ We note that $f_{X_0}(x_0)=\frac{1}{1-0}=1$. Now we integrate over $x_0$ to remove the conditioning, which gets us to:$$\int_{0}^{1}{n\choose k}\cdot (\mathbb{P}(x_0>X_i))^k\cdot (\mathbb{P}(x_0\leq X_j))^{n-k}dx_0$$

Writing $x$ for $x_0$, $\mathbb{P}(x>X_i)$ is given by the uniform CDF on $[a,b]=[0,1]$, namely $\frac{x-a}{b-a}=\frac{x}{1}=x$, and the complement is $\mathbb{P}(x\leq X_i)=1-\mathbb{P}(x>X_i)=1-x$. Plugging these into the integral above, we obtain: $$\int_{0}^{1}{n\choose k}\cdot x^k\cdot (1-x)^{n-k}\,dx$$

Even though I did not come up with this answer by myself, I tried my best to understand and implement it. If there are any mistakes in my work, please point them out. Thank you for all your assistance!
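For anyone who wants to sanity-check the final formula numerically, here is a minimal sketch. It is not part of the proof, and the function names `integral_formula` and `simulate` are hypothetical. It approximates the integral with a midpoint rule, simulates $|S|$ directly, and compares both with $\frac{1}{n+1}$, the closed form that the Beta-function identity gives for this integral.

```python
import math
import random


def integral_formula(n: int, k: int, steps: int = 10_000) -> float:
    """Midpoint-rule approximation of the integral of C(n,k) x^k (1-x)^(n-k) over [0, 1]."""
    h = 1.0 / steps
    total = 0.0
    for j in range(steps):
        x = (j + 0.5) * h  # midpoint of the j-th subinterval
        total += math.comb(n, k) * x**k * (1 - x) ** (n - k)
    return total * h


def simulate(n: int, k: int, trials: int = 200_000, seed: int = 1) -> float:
    """Estimate P(|S| = k) by drawing X_0, ..., X_n and counting the X_i below X_0."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = 0
    for _ in range(trials):
        x0 = rng.random()
        size = sum(1 for _ in range(n) if rng.random() < x0)
        if size == k:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    n = 4
    for k in range(n + 1):
        print(k, integral_formula(n, k), simulate(n, k), 1 / (n + 1))
```

For each `k` the three printed values should roughly agree, up to Monte Carlo noise in the simulated column.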