Is there an explicit probability density function for $\frac{x_1}{\sum_{i=1}^{n} x_i}$, where the $x_i$ are independent random variables, each uniformly distributed between 0 and 1?
331 Views. Asked by Bumbble Comm (https://math.techqa.club/user/bumbble-comm/detail). There are 2 best solutions below.
An explicit CDF was already worked out in the other answer, but I get the feeling you may be more interested in a good, simple approximation when $n$ is large. For this purpose, consider the random variable $Y=\sum_{i=2}^n x_i$, the sum of $n-1$ iid uniform $[0,1]$ random variables. Then $Y$ has mean $\frac{n-1}{2}$ and variance $\frac{n-1}{12}$, and for large $n$ the central limit theorem lets you replace $Y$ with a normal random variable $Z$ having the same mean and variance. Writing $U=x_1$, an approximate CDF is then obtained by computing the probability $$ \mathbb P\bigl(\frac{U}{U+Z}<t\bigr), $$ which can be done explicitly.
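The quality of this normal approximation for $Y$ is easy to check by simulation. A minimal Python sketch (the helper names and the sample choice $m=n-1=9$ are mine, for illustration only):

```python
import random
from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    """CDF of N(mu, sigma^2), via the error function."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

def empirical_cdf_of_sum(s, m, trials=100_000, seed=0):
    """Empirical P(Y <= s) where Y is the sum of m iid U(0,1) variables."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(trials)
               if sum(rng.random() for _ in range(m)) <= s)
    return hits / trials

m = 9                          # m = n - 1, chosen arbitrarily
mu, sigma = m / 2, sqrt(m / 12)
approx = normal_cdf(5.0, mu, sigma)
estim = empirical_cdf_of_sum(5.0, m)
# the two values should agree to within Monte Carlo / CLT error
```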
Indeed, we can start by computing the conditional probability $$ \mathbb P\bigl(\tfrac{U}{U+Z}<t\bigm| Z\bigr),$$ which equals $0$ if $Z$ is negative (and $Z<0$ happens only with tiny probability when $n$ is large). For $Z>0$ and $0<t<1$, the event $\frac{U}{U+Z}<t$ is the same as $U<\frac{t}{1-t}Z$, and since $U\sim U(0,1)$, $$\mathbb P\bigl(\tfrac{U}{U+Z}<t\bigm| Z\bigr)=\min\Bigl(1,\frac{t}{1-t}Z\Bigr)\cdot 1[Z>0], $$ which reduces to $\frac{t}{1-t}Z\cdot 1[Z>0]$ whenever $\frac{t}{1-t}Z\le 1$. Taking the expectation in that regime gives $$ \mathbb P\bigl(\tfrac{U}{U+Z}<t\bigr)\approx\frac{t}{1-t}\,\mathbb E[Z;Z>0], $$ valid for small $t$; for larger $t$ the $\min$ must be kept inside the expectation.
If you want to make the $\mathbb E[Z;Z>0]$ more explicit, note that if we set $m=n-1$ for convenience of notation, then $$ \mathbb E[Z;Z>0]=\sqrt{\frac{6}{\pi m}}\int_{0}^{\infty}xe^{-6(x-m/2)^2/m}\ dx $$ is a constant depending only on $m$ (or equivalently, $n$).
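This truncated expectation also has a standard closed form: for $Z\sim N(\mu,\sigma^2)$, $\mathbb E[Z;Z>0]=\mu\,\Phi(\mu/\sigma)+\sigma\,\varphi(\mu/\sigma)$, where $\Phi$ and $\varphi$ are the standard normal CDF and density. A minimal sketch evaluating it (the function name and the sample value $m=9$ are illustrative):

```python
from math import erf, exp, pi, sqrt

def truncated_mean(mu, sigma):
    """E[Z; Z>0] = mu*Phi(mu/sigma) + sigma*phi(mu/sigma) for Z ~ N(mu, sigma^2)."""
    a = mu / sigma
    Phi = 0.5 * (1.0 + erf(a / sqrt(2.0)))       # standard normal CDF at a
    phi = exp(-a * a / 2.0) / sqrt(2.0 * pi)     # standard normal density at a
    return mu * Phi + sigma * phi

# For the sum of m = n-1 uniforms: mu = m/2, sigma = sqrt(m/12).
m = 9
e = truncated_mean(m / 2, sqrt(m / 12))
```

For moderate $m$ the truncation barely matters: $\mu/\sigma$ is already large, so $\mathbb E[Z;Z>0]$ is very close to $\mu=m/2$.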
Since $\ \sum_{i=1}^{n} x_i>0\ $ almost surely, $$ \frac{x_1}{\sum_{i=1}^{n} x_i} \le y \iff \left(y-1\right)x_1 + y\sum_{i=2}^n x_i \ge 0\ ,$$ so $$ \mathbb{P}\left(\frac{x_1}{\sum_{i=1}^{n} x_i} \le y \right) = \mathbb{P}\left(\left(y-1\right)x_1 + y\sum_{i=2}^n x_i \ge 0\right)\ , $$ and if the joint distribution of $\ x_1, x_2, \dots, x_n\ $ is known, the probability on the right-hand side can, in theory at least, be calculated from it.
If $\ x_1, x_2, \dots, x_n\ $ are independent, then the distribution of $\ \left(y-1\right)x_1 + y\sum_{i=2}^n x_i\ $ is the convolution of those of $\ \left(y-1\right)x_1, y\,x_2, y\,x_3, \dots,\mathrm{\ and\ } y\,x_n\ $. The random variable $\ \left(y-1\right)x_1\ $ is uniformly distributed over the interval $\ [0, y-1]\ $ if $\ y> 1\ $, or over the interval $\ [y-1, 0]\ $ if $\ y<1\ $, while $\ y\,x_i\ $ is uniformly distributed over the interval $\ [0,y]\ $ for every $\ i\ $ whenever $\ y>0\ $.
So the short answer to your question is "yes", although in the case when $\ x_1, x_2, \dots, x_n\ $ are independent, the explicit expression for the distribution function of $\ \left(y-1\right)x_1 + y\sum_{i=2}^n x_i\ $ will be a rather complicated spline, which I have no inclination whatever to try and calculate.
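The event equivalence underlying this approach is easy to sanity-check by simulation. A minimal sketch (the parameter choices $n=5$, $y=0.3$ are arbitrary): for each sample the two conditions are algebraically the same, so the two empirical frequencies should coincide.

```python
import random

def check_identity(n=5, y=0.3, trials=200_000, seed=0):
    """Empirically compare P(x1/sum <= y) with P((y-1)*x1 + y*sum_{i>=2} >= 0)
    for n iid U(0,1) variables; the two fractions should agree."""
    rng = random.Random(seed)
    lhs = rhs = 0
    for _ in range(trials):
        xs = [rng.random() for _ in range(n)]
        if xs[0] / sum(xs) <= y:
            lhs += 1
        if (y - 1) * xs[0] + y * sum(xs[1:]) >= 0:
            rhs += 1
    return lhs / trials, rhs / trials
```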
Addendum
It turns out that calculating an explicit expression for the distribution is not quite as tedious as I had thought. As I might have guessed, the most tedious part, calculating the convolution of many uniform distributions (the Irwin–Hall distribution), has already been done, and there is a reasonably concise expression for it.
It is obvious that $ \mathbb{P}\left(\frac{x_1}{\sum_{i=1}^{n} x_i} \le y \right)= 0\ $ if $\ y \le 0\ $ and $ \mathbb{P}\left(\frac{x_1}{\sum_{i=1}^{n} x_i} \le y \right)= 1\ $ if $\ y \ge 1\ $, so to specify the entire distribution function of $\ \frac{x_1}{\sum_{i=1}^{n} x_i}\ $ we only need to determine its values for $\ 0<y<1\ $. We have \begin{eqnarray} \mathbb{P}\left(\frac{x_1}{\sum_{i=1}^{n} x_i} \le y \right) &=& \mathbb{P}\left(\sum_{i=2}^{n} x_i\ge \frac{(1-y)x_1}{y}\right)\\ &=& 1-\mathbb{P}\left(\sum_{i=2}^{n} x_i\le \frac{(1-y)x_1}{y}\right)\\ &=&1-\int_{0}^{1}\mathbb{P}\left(\left.\sum_{i=2}^{n} x_i\le \frac{(1-y)x}{y}\right\vert x_1=x\right)dx\\ &=& 1-\int_{0}^{1}\mathbb{P}\left(\sum_{i=2}^{n} x_i\le \frac{(1-y)x}{y}\right)dx\ , \end{eqnarray} from the independence of the $\ x_i\ $ and the fact that $\ x_1 \sim U(0,1)\ $. Using the known CDF of the sum of $\ n-1\ $ iid $\ U(0,1)\ $ random variables (the Irwin–Hall distribution), we have \begin{eqnarray} \mathbb{P}\left(\sum_{i=2}^{n} x_i\le \frac{(1-y)x}{y}\right)&=&\\ &&\hspace{-4em}\frac{1}{(n-1)!}\sum_{k=0}^{n-2}(-1)^k{n-1\choose k}\left(\frac{(1-y)x}{y}-k\right)_+^{n-1}\ , \end{eqnarray} provided $\ \frac{(1-y)x}{y}\le n-1\ $, and where $\ \left(X\right)_+= \max(0,X)\ $. If $\ \frac{(1-y)x}{y}> n-1\ $ instead, then $\ \mathbb{P}\left(\sum_{i=2}^{n} x_i\le \frac{(1-y)x}{y}\right)=1\ $. If $\ 1\ge y \ge \frac{1}{n}\ $, then $\ \frac{(1-y)x}{y}\le n-1\ $ for all $\ x\in \left[0,1\right]\ $, so for those values of $\ y\ $, we have \begin{eqnarray} \int_{0}^{1}\mathbb{P}\left(\sum_{i=2}^{n} x_i\le \frac{(1-y)x}{y}\right)dx &&\\ && \hspace{-8em}=\frac{1}{(n-1)!}\sum_{k=0}^{n-2}(-1)^k{n-1\choose k}\int_{0}^{1}\left(\frac{(1-y)x}{y}-k\right)_+^{n-1}dx\\ && \hspace{-8em}=\frac{y}{(n-1)!(1-y)}\sum_{k=0}^{n-2}(-1)^k{n-1\choose k}\int_{0}^{\frac{1-y}{y}}\left(z-k\right)_+^{n-1}dz \\ && \hspace{-8em}=\frac{y}{n!(1-y)}\sum_{k=0}^{n-2}(-1)^k{n-1\choose k}\left(\frac{(1-y)}{y}-k\right)_+^n\ . \end{eqnarray} On the other hand, if $\ 0<y<\frac{1}{n}\ $ then the integrand in the expression on the left of the above series of equations is identically $1$ for $\ x\ge\frac{(n-1)y}{1-y}\ $, so we get \begin{eqnarray} \int_{0}^{1}\mathbb{P}\left(\sum_{i=2}^{n} x_i\le \frac{(1-y)x}{y}\right)dx &&\\ &&\hspace{-8em}= \int_{0}^{\frac{(n-1)y}{1-y}}\mathbb{P}\left(\sum_{i=2}^{n} x_i\le \frac{(1-y)x}{y}\right)dx + 1-\frac{(n-1)y}{1-y}\\ &&\hspace{-8em}=\frac{1}{(n-1)!}\sum_{k=0}^{n-2}(-1)^k{n-1\choose k}\int_{0}^{\frac{(n-1)y}{1-y}}\left(\frac{(1-y)x}{y}-k\right)_+^{n-1}dx\\ &&\hspace{-6em}+\frac{1-ny}{1-y}\\ &&\hspace{-8em}=\frac{y}{(n-1)!(1-y)}\sum_{k=0}^{n-2}(-1)^k{n-1\choose k}\int_{0}^{n-1}\left(z-k\right)_+^{n-1}dz\\ &&\hspace{-6em}+\frac{1-ny}{1-y}\\ &&\hspace{-8em}=\frac{1-ny}{1-y}+\frac{y}{n!(1-y)}\sum_{k=0}^{n-2}(-1)^k{n-1\choose k}\left(n-1-k\right)^n\ . \end{eqnarray} Putting all this together, we have \begin{eqnarray} \mathbb{P}\left(\frac{x_1}{\sum_{i=1}^{n} x_i} \le y \right) &&\\ &&\hspace{-5em}=\cases{1-\frac{1-ny}{1-y}-\frac{y}{n!(1-y)}\sum_{k=0}^{n-2}(-1)^k{n-1\choose k}\left(n-1-k\right)^n& for $\ 0\le y\le \frac{1}{n}$\\ 1-\frac{y}{n!(1-y)}\sum_{k=0}^{n-2}(-1)^k{n-1\choose k}\left(\frac{(1-y)}{y}-k\right)_+^n& for $\ \frac{1}{n}<y\le 1\ $.} \end{eqnarray} This distribution function is continuous, and differentiable everywhere except at the point $\ y=0\ $, and possibly the points $\ y=\frac{1}{k}\ $ for $\ k=1,2,\dots, n\ $, so its density function can be obtained by differentiating it.
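The piecewise CDF above translates directly into code. A minimal sketch (the function name `ratio_cdf` is mine):

```python
from math import comb, factorial

def ratio_cdf(y, n):
    """CDF of x1 / (x1 + ... + xn) for n iid U(0,1) variables,
    using the piecewise formula derived above."""
    if y <= 0:
        return 0.0
    if y >= 1:
        return 1.0
    if y <= 1 / n:
        t = sum((-1)**k * comb(n - 1, k) * (n - 1 - k)**n
                for k in range(n - 1))
        return 1 - (1 - n * y) / (1 - y) - y * t / (factorial(n) * (1 - y))
    s = sum((-1)**k * comb(n - 1, k) * max((1 - y) / y - k, 0.0)**n
            for k in range(n - 1))
    return 1 - y * s / (factorial(n) * (1 - y))
```

As a sanity check, for $n=2$ the first branch reduces to the classical $\frac{y}{2(1-y)}$ on $\left(0,\tfrac12\right]$, so for instance `ratio_cdf(0.25, 2)` should equal $\tfrac16$.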