Getting useful distributions out of distributions with unknown probabilities


Suppose I have biased coins that all land heads with probability $p\in(0,1)$. Then I can simulate a fair coin by flipping pairs of biased coins until I get heads-tails or tails-heads, associating heads to the first outcome and tails to the second. Since each of these outcomes has probability $p(1-p)$, I have extracted a fair coin from a pair of biased ones.

This is a well-known trick that is often introduced to undergraduates in probability. What I want to know is: Are there other ways to extract "fair" distributions out of "biased" ones?
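For concreteness, the pair-flipping trick can be sketched as follows (a minimal sketch; the function names are illustrative):

```python
import random

def biased_flip(p):
    """A biased coin: returns 1 (heads) with probability p, else 0 (tails)."""
    return 1 if random.random() < p else 0

def fair_flip(p):
    """Von Neumann's trick: flip pairs of biased coins until they disagree.
    Heads-tails and tails-heads each occur with probability p(1-p), so the
    output bit is fair regardless of p."""
    while True:
        a, b = biased_flip(p), biased_flip(p)
        if a != b:
            return a  # heads-tails -> 1, tails-heads -> 0
```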

Two specific examples of what I am asking:

  1. I have a countably infinite sequence of i.i.d. Poisson random variables $X_i$ with mean $\lambda\in\mathbb R_{>0}$. Can I simulate a Poisson random variable $X$ with mean $1$?

  2. I have an uncountable collection of i.i.d. normal random variables $X_t$ with mean $\mu\in\mathbb R$ and variance $\sigma^2\in\mathbb R_{>0}$. Can I simulate a normal random variable $X$ with mean $0$ and variance $1$?

I am not sure if this is important, but I do not mind if the process to generate a "fair" distribution takes infinite time, so long as it converges in some reasonable sense.


Best Answer

I'm not sure about the Poisson example, but when you have a random variable with a continuous distribution, you can simulate any other distribution.

If $X$ has a continuous distribution with CDF $F$, then $F(X)$ is distributed uniformly in $[0, 1]$. From a uniform random variable you can generate any distribution via inverse transform sampling.
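As a sketch of this two-step recipe (the exponential source and target distributions here are arbitrary choices for illustration, not part of the answer):

```python
import math
import random

def simulate_from(target_inv_cdf, source_cdf, draw_source):
    """Given a draw from a continuous source distribution with CDF
    source_cdf, produce a draw from the distribution whose inverse CDF is
    target_inv_cdf."""
    x = draw_source()
    u = source_cdf(x)          # probability integral transform: U ~ Uniform(0,1)
    return target_inv_cdf(u)   # inverse transform sampling

# Illustrative example: source is Exponential(1), target is Exponential(2).
exp1_cdf = lambda x: 1 - math.exp(-x)
exp2_inv = lambda u: -math.log(1 - u) / 2
draw = lambda: random.expovariate(1.0)
sample = simulate_from(exp2_inv, exp1_cdf, draw)
```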

Specifically for your Gaussian example there's an even simpler method. Just note that for any Gaussian random variable $X \sim \mathcal{N}(\mu, \sigma^2)$ we have $aX + b \sim \mathcal{N}(a\mu + b, a^2 \sigma^2)$.
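When $\mu$ and $\sigma$ are known, this affine rule standardizes directly, with $a = 1/\sigma$ and $b = -\mu/\sigma$. A sketch with illustrative parameter values:

```python
import random

MU, SIGMA = 3.0, 2.0   # illustrative known parameters

def standard_normal():
    """Standardize: if X ~ N(MU, SIGMA^2), then (X - MU)/SIGMA ~ N(0, 1)."""
    x = random.gauss(MU, SIGMA)
    return (x - MU) / SIGMA
```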

There is also a neat way to create any unfair coin from multiple throws of a fair coin. Assume you want to "simulate" a coin that shows heads with probability exactly $p$. To do this, flip the fair coin repeatedly until it shows heads for the first time and let $n$ be the number of coin flips you just did. If the $n$-th decimal place in the binary representation of $p$ is a 1, your unfair coin shows heads - otherwise tails.

Edit: The Gaussian problem is also solvable if $\mu$ and $\sigma^2$ are unknown. First note that $X_1 - X_2 \sim \mathcal{N}(0, 2\sigma^2)$, so we can create random variables with mean zero and unknown variance. For example, we could then compute $\frac{X_1 - X_2}{X_3 - X_4}$, which has the same distribution as the quotient of two independent standard Gaussian variables, namely a standard Cauchy distribution (note that forming a quotient eliminates the need to know the variance). In theory we could calculate its CDF and then apply inverse transform sampling again.

More practically, we could use the fact that for independent random variables $X \sim \chi^2(m)$, $Y \sim \chi^2(n)$ we have $\frac{X}{X + Y} \sim \beta(m/2, n/2)$. Since the $\beta(1, 1)$-distribution is the uniform distribution on $[0, 1]$, and the common factor $2\sigma^2$ cancels in the ratio, we can conclude $$\frac{(X_1 - X_2)^2 + (X_3 - X_4)^2}{(X_1 - X_2)^2 + (X_3 - X_4)^2 + (X_5 - X_6)^2 + (X_7 - X_8)^2} \sim U([0, 1]).$$ From here on, we can again use inverse transform sampling.
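A sketch of this pipeline: eight draws with unknown parameters are combined into a uniform variate, which is then pushed through the inverse normal CDF. The parameter values below are assumptions used only to simulate the draws; the algorithm itself never reads them:

```python
import random
from statistics import NormalDist

MU, SIGMA = 1.7, 0.9   # unknown to the algorithm; used only to simulate draws

def draw():
    return random.gauss(MU, SIGMA)

def uniform_from_gaussians():
    """Eight i.i.d. N(mu, sigma^2) draws -> one Uniform(0,1) variate,
    without using mu or sigma."""
    d = [draw() - draw() for _ in range(4)]   # each difference ~ N(0, 2 sigma^2)
    a = d[0] ** 2 + d[1] ** 2                 # ~ 2 sigma^2 * chi^2(2)
    b = d[2] ** 2 + d[3] ** 2
    return a / (a + b)                        # ~ Beta(1, 1) = Uniform(0, 1)

def standard_normal():
    """Inverse transform sampling through the standard normal CDF."""
    return NormalDist().inv_cdf(uniform_from_gaussians())
```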

Another Answer

The trick of transforming random variables into "fair" ones works for any pair of absolutely continuous random variables X and Y, even if X and Y are dependent on each other or not identically distributed, as long as the two variables are statistically indifferent (Montes Gutiérrez 2014; De Schuymer et al. 2003); equivalently, their probabilistic index (Acion et al. 2006) is 1/2, or their net benefit is 0. In our case this means P(X < Y) = P(X > Y). In particular, two independent normal random variables X and Y are statistically indifferent if and only if they have the same mean; they remain so even if their standard deviations differ (Montes Gutiérrez 2014).

In this case, if you have a "black box" to generate i.i.d. variates from an unknown distribution, you can proceed as follows (Morina et al. 2019):

  1. Draw two variates from the "black box", call them X and Y.
  2. If X is less than Y, output 1. If X is greater than Y, output 0. Otherwise, go to step 1.

Because X and Y are i.i.d., they are statistically indifferent, so that this method will output 1 or 0 with equal probability (see the appendix in my Note on Randomness Extraction).
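The two steps above can be sketched as follows; the exponential "black box" is an arbitrary stand-in for the unknown distribution (ties have probability zero here, but the redraw handles them anyway):

```python
import random

def fair_bit(draw):
    """Extract a fair bit by comparing two i.i.d. draws from `draw`.
    By statistical indifference P(X < Y) = P(X > Y); ties trigger a redraw."""
    while True:
        x, y = draw(), draw()
        if x < y:
            return 1
        if x > y:
            return 0
        # x == y: go back to step 1

# Illustrative "black box" with an unknown parameter.
blackbox = lambda: random.expovariate(0.37)
bit = fair_bit(blackbox)
```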

This method generates i.i.d. random bits, so along with a method to build uniform random numbers from those bits and a method to sample from the Poisson distribution using them, it solves example 1.
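One way to finish that pipeline, sketched under the assumption that fair bits are available (here simulated with `random.getrandbits` as a stand-in for the extractor above), is to assemble a finite-precision uniform variate and invert the Poisson(1) CDF; Morina et al. (2019) discuss exact approaches:

```python
import math
import random

def fair_bit():
    # Stand-in for a bit from the comparison extractor above.
    return random.getrandbits(1)

def uniform_from_bits(precision=53):
    """Assemble an (approximately) Uniform(0,1) variate from fair bits."""
    return sum(fair_bit() * 2.0 ** -i for i in range(1, precision + 1))

def poisson1():
    """Poisson(1) by CDF inversion: return the smallest k with U <= P(X <= k)."""
    u = uniform_from_bits()
    k, pmf, cdf = 0, math.exp(-1.0), math.exp(-1.0)
    while u > cdf:
        k += 1
        pmf /= k       # P(X = k) = e^-1 / k!, built incrementally
        cdf += pmf
    return k
```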

REFERENCES:

  • Montes Gutiérrez, I., "Comparison of alternatives under uncertainty and imprecision", doctoral thesis, Universidad de Oviedo, 2014.
  • De Schuymer, Bart, Hans De Meyer, and Bernard De Baets. "A fuzzy approach to stochastic dominance of random variables", in International Fuzzy Systems Association World Congress 2003.
  • Morina, G., Łatuszyński, K., et al., "From the Bernoulli Factory to a Dice Enterprise via Perfect Sampling of Markov Chains", arXiv:1912.09229 [math.PR], 2019.
  • Acion, Laura, John J. Peterson, Scott Temple, and Stephan Arndt. "Probabilistic index: an intuitive non‐parametric approach to measuring the size of treatment effects." Statistics in medicine 25, no. 4 (2006): 591-602.