Random sample generated for i.i.d variables


My attempt: Since $f(y;a) = \frac{1}{2a}\exp(\frac{-y}{a})$ for $y>0$, and $f(y;a) = \frac{1}{2a}\exp(\frac{y}{a})$ for $y<0$, it is easy to see that $A_n$ is a sufficient statistic for the family $T$ of measures $\left\{f(y;a): y\in\mathbb{R},\ a > 0\right\}$. This means that when generating a sample that is equivalent to $Y_1, Y_2,\ldots, Y_n$, we only care about the information in $S_1+\ldots + S_n = A_n$.

On BEST ANSWER

Your last question is answered by saz in the comments.

Here is one approach to the problem. It's not really related to the hints though.

I take the problem to be: sample from the conditional distribution of $Y_1,\dots,Y_n$ given the sum $A_n=\sum_{i=1}^nY_i.$ I will treat this as a practical problem, where integrating an $(n-1)$-dimensional conditional pdf is too slow.

First recall (or read on Wikipedia) that $Y_i$ has the same distribution as $X_i-X'_i$ where $X_i,X'_i$ are independent $\mathrm{Exp}(1/a)$ variables (rate $1/a$, i.e. mean $a$). The sums $B_n=\sum_{i=1}^n X_i$ and $B'_n=\sum_{i=1}^n X'_i$ are then independent, each with an $\mathrm{Erlang}(n, 1/a)$ distribution, which is a gamma distribution with integer shape.
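As a quick sanity check of this representation, the sketch below simulates $X_i-X'_i$ and compares its empirical quantiles with the exact quantiles of a Laplace distribution with scale $a$ (density $\frac{1}{2a}e^{-|y|/a}$). The value `a = 2.0`, the sample size, the seed, and the helper name `laplace_quantile` are arbitrary choices for illustration, not part of the original problem.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed
a = 2.0                          # illustrative scale parameter
m = 200_000                      # number of simulated differences

# Exp(1/a) means rate 1/a, which is scale (mean) a in NumPy's parametrization.
x = rng.exponential(scale=a, size=m)
x_prime = rng.exponential(scale=a, size=m)
y = x - x_prime

def laplace_quantile(p, scale):
    """Exact quantile of the centered Laplace distribution with the given scale."""
    p = np.asarray(p, dtype=float)
    return np.where(p < 0.5, scale * np.log(2 * p), -scale * np.log(2 * (1 - p)))

probs = np.array([0.05, 0.25, 0.5, 0.75, 0.95])
empirical = np.quantile(y, probs)
exact = laplace_quantile(probs, a)
print(np.round(empirical, 3))    # simulated quantiles of X - X'
print(np.round(exact, 3))        # Laplace(0, a) quantiles
```

The two printed rows should agree up to Monte Carlo error.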

Since $A_n=B_n-B'_n,$ we can think of $A_n$ as a difference of two independent $\mathrm{Erlang}(n, 1/a)$ variables. Given $A_n$ we have $B'_n=B_n-A_n$, so the conditional pdf of $B_n$ given $A_n$ is

$$f_{B_n|A_n}(x)\propto x^{n-1}e^{-x/a} \cdot (x-A_n)^{n-1}e^{-(x-A_n)/a} $$

for $x\geq \max(A_n,0),$ since both $B_n=x$ and $B'_n=x-A_n$ must be nonnegative.

This is not a common distribution. For $A_n\geq 0$ (the case $A_n<0$ follows by swapping the roles of $B_n$ and $B'_n$), one way to sample from it is to use the binomial expansion

$$x^{n-1}(x-A_n)^{n-1}=(x-A_n+A_n)^{n-1}(x-A_n)^{n-1}=\sum_{j=0}^{n-1}\binom{n-1}{j}(x-A_n)^{n-1+j}A_n^{n-1-j}$$

which, after multiplying by the exponential factors, reduces to sampling $x-A_n$ from a mixture of Erlang (i.e. gamma) distributions. (More explanation below.)
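To make the mixture components explicit, here is the same computation written out with the substitution $u=x-A_n$ (the symbol $u$ is introduced here for convenience), assuming $A_n\geq 0$. The exponential factors combine as $e^{-x/a}e^{-(x-A_n)/a}=e^{-A_n/a}e^{-2u/a}$, and $e^{-A_n/a}$ is a constant, so

$$f_{B_n|A_n}(u+A_n)\propto (u+A_n)^{n-1}u^{n-1}e^{-2u/a}=\sum_{j=0}^{n-1}\binom{n-1}{j}A_n^{n-1-j}\,u^{n-1+j}e^{-2u/a},\qquad u\geq 0.$$

The $j$-th term is an unnormalized $\mathrm{Erlang}(n+j,\,2/a)$ density in $u$, and its integral is

$$\int_0^\infty \binom{n-1}{j}A_n^{n-1-j}\,u^{n-1+j}e^{-2u/a}\,du=\binom{n-1}{j}A_n^{n-1-j}\,(n+j-1)!\,\left(\frac{a}{2}\right)^{n+j}.$$

These integrals, normalized to sum to one, are the mixture weights.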

We can then sample $X_1,\dots,X_n$ conditioned on $B_n,$ which is much easier (see below). Similarly for $X'_1,\dots,X'_n$ conditioned on $B'_n=B_n-A_n.$ The differences $Y_i=X_i-X'_i$ will then have the correct distribution conditioned on $A_n.$

Sampling from i.i.d. exponentials conditioned on their sum

As mentioned in the comments, the joint distribution of $X_1,\dots,X_n$ conditioned on their sum $B_n$ is uniform on the simplex $\{x_i\geq 0,\ x_1+\cdots+x_n=B_n\}$. This uniform distribution is homogeneous in $B_n$ (it doesn't depend on $B_n$ except for scaling), so you can just sample i.i.d. exponential variables $X_1,\dots,X_n$ and rescale them to have sum $B_n$.
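A minimal sketch of this rescaling step, assuming NumPy; the function name `exponentials_given_sum` and the example values are made up for illustration. The rate of the exponentials does not matter because it cancels in the normalization, so unit-rate variables are used.

```python
import numpy as np

def exponentials_given_sum(n, total, rng):
    """Sample n i.i.d. exponentials conditioned on their sum being `total`.

    Normalized i.i.d. exponentials are uniform on the unit simplex
    (a Dirichlet(1, ..., 1) vector), so rescaling by `total` gives the
    uniform distribution on {x_i >= 0, x_1 + ... + x_n = total}.
    """
    e = rng.exponential(scale=1.0, size=n)  # unit rate; the rate cancels below
    return total * e / e.sum()

rng = np.random.default_rng(0)              # arbitrary seed
x = exponentials_given_sum(n=5, total=10.0, rng=rng)
print(x, x.sum())                           # the sum equals 10 up to rounding
```

An equivalent one-liner would be `total * rng.dirichlet(np.ones(n))`.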

Sampling from a mixture of distributions

It is possible to efficiently sample from a pdf of the form

$$f(x) \propto \sum_{j=0}^{n-1} f_j(x)$$

as long as the $f_j$ are nonnegative and the integrals $Z_j=\int f_j(x)\,dx$ can be computed efficiently. For the case in question $Z_j$ can be expressed in terms of factorials and simple expressions of $a$ and $A_n.$ First sample the component index $j$, whose marginal probability is $Z_j/(Z_0+\dots+Z_{n-1}).$ This is a custom discrete distribution, but it has only $n$ values, so it can be sampled quickly. Given $j$, the conditional distribution of $x$ has pdf $f(x\mid j)=f_j(x)/Z_j,$ which here is a (shifted) gamma distribution.
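Putting the pieces together, the following is a sketch of the whole procedure under stated assumptions, not a definitive implementation. It uses the weights $Z_j\propto\binom{n-1}{j}|A_n|^{n-1-j}(n+j-1)!\,(a/2)^{n+j}$, obtained by integrating each term of the binomial expansion against $e^{-2u/a}$ with $u=x-A_n$, so that component $j$ is a gamma distribution with shape $n+j$ and scale $a/2$. The function names are hypothetical, the weights are computed in log space to avoid overflow for large $n$, and the case $A_n<0$ is handled by swapping the roles of $B_n$ and $B'_n$.

```python
import math
import numpy as np

def sample_smaller_sum(n, a, abs_A, rng):
    """Draw U = min(B_n, B'_n) given |B_n - B'_n| = abs_A.

    U has density proportional to (u + abs_A)^(n-1) u^(n-1) exp(-2u/a), u >= 0,
    a mixture of Gamma(shape=n+j, scale=a/2) components for j = 0..n-1.
    """
    if abs_A == 0.0:
        j = n - 1  # only the j = n-1 term has nonzero weight when A_n = 0
    else:
        # log Z_j = log C(n-1,j) + (n-1-j) log|A| + log (n+j-1)! + (n+j) log(a/2)
        log_z = np.array([
            math.lgamma(n) - math.lgamma(j + 1) - math.lgamma(n - j)
            + (n - 1 - j) * math.log(abs_A)
            + math.lgamma(n + j)
            + (n + j) * math.log(a / 2.0)
            for j in range(n)
        ])
        w = np.exp(log_z - log_z.max())      # subtract the max before exponentiating
        j = rng.choice(n, p=w / w.sum())     # component index
    return rng.gamma(shape=n + j, scale=a / 2.0)

def split_sum(total, n, rng):
    """n i.i.d. exponentials conditioned on their sum: rescale to `total`."""
    e = rng.exponential(size=n)
    return total * e / e.sum()

def sample_y_given_sum(n, a, A, rng):
    """Sample (Y_1, ..., Y_n) conditioned on Y_1 + ... + Y_n = A."""
    u = sample_smaller_sum(n, a, abs(A), rng)
    # For A >= 0: B'_n = U and B_n = U + A; for A < 0 the roles are swapped.
    b, b_prime = (u + A, u) if A >= 0 else (u, u - A)
    return split_sum(b, n, rng) - split_sum(b_prime, n, rng)

rng = np.random.default_rng(1)               # arbitrary seed
y = sample_y_given_sum(n=5, a=2.0, A=3.0, rng=rng)
print(y, y.sum())                            # the sum matches A up to rounding
```

Each call costs $O(n)$ work: one draw of $j$ from $n$ weights, one gamma draw, and two rescaled exponential vectors.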