I am attempting to produce a normal/Gaussian distribution from draws from a uniform distribution. The approach I am using (rejection method from this answer) is the following:
- Let PDF(X) be the Probability density function of a random variable X on the range $[L_0,L_1]$.
- Draw a random value $x_0$ from a uniform distribution on the range $[L_0,L_1]$.
- Compute $P_0 = $ PDF($x_0$)
- Draw a second random value $q$ from a uniform distribution on the range $[L_0,L_1]$.
- if $q \leq P_0$, select $x_0$. Otherwise, reject $x_0$. (I answered my own question, the problem is here: this needs to be $ |q| \leq |P_0| $)
To plot the selected values, divide the range $[L_0,L_1]$ into $N$ points $x_i = i\Delta x + L_0 | i \in [0,N-1]$. Correspond with each point except the last one (i.e., not $i = N-1$) a bin $b_i | i \in [0,N-2] $ with count $c_i = 0$.
Let $x$ be in $b_i$ if $i\Delta x \leq x - L_0 < (i+1) \Delta x$, where $\Delta x = (L_1-L_0)/(N-1)$. If a given $x_0$ is selected, increase the count $c_i$ of its bin $b_i$ by 1 ($ count(b_i(x_0)) \equiv c_i(x_0) = c_i + 1$).
Make some number of selections.
Then, scale the resultant bin counts by the total area of the bins,
- Scaled bin counts: $ s_i = c_i/A$
- Histogram Area: $ A = \Delta x \sum_{i=0}^{N-2} c_i$
and plot the scaled bins $s_i$, where $x(s_i) = i\Delta x + L_0$.
This normalizes the histogram, such that total area of the scaled bins is 1 (I think? At least that's my impression of what's supposed to be happening), or tends to 1 as the number of bins goes to infinity and their width $\Delta x$ goes to 0.
I wrote this algorithm into python, and tested it on the exponential distribution, for which it seems to work:
- PDF(X) $ = f(x;\lambda) = \lambda e^{-\lambda x}$ (Exponential Distribution)
And the Gaussian distribution,
- PDF(X) $ = f(x; \mu,\sigma) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$ (Gaussian Distribution)
for which it seems to mostly work on positive ranges ($0 \leq L_0$):
But it fails for Gaussian distributions on negative ranges ($L_0 < 0 $). I have no idea why.
Can anyone help me understand what I'm doing wrong, or missing?
Here is the python code:
import matplotlib
import numpy as np
from numpy import e
from numpy import pi
import matplotlib.pyplot as plt
import random
import sys
def map_range(x,x0,x1,y0,y1):
return (x-x0)*((y1 - y0)/(x1 - x0)) + y0
def gaussian_PDF(x,mu,sigma):
phi = lambda z: (1/np.sqrt(2*pi))*e**((z**2)/(-2)) # Standard/Unit normal distribution
return (1/sigma)*phi((x-mu)/sigma) # Scaled normal distribution
N = 24
L0 = -2
L1 = 2
L = L1 - L0
Dx = (L1-L0)/(N-1)
# roll = lambda: random.uniform(L0,L1)
roll = lambda: map_range(random.random(),0,1,L0,L1)
x_values = [i*Dx+L0 for i in range(N)]
mu = 0
sigma = 1
y_values = [gaussian_PDF(x,mu,sigma) for x in x_values]
bin_count = [0 for _i in range(N-1)]
bin_index = lambda x: [i for i in range(N-1) if i*Dx <= x - L0 < (i+1)*Dx][0]
selections = 100000
for i in range(selections):
x0 = roll()
P_x0 = gaussian_PDF(x0,mu,sigma)
q = roll()
if(q <= P_x0):
bin_count[bin_index(x0)] += 1
areas = [count*Dx for count in bin_count]
hist_area = sum([count*Dx for count in bin_count])
scaled_bin_count = [count/hist_area for count in bin_count]
matplotlib.pyplot.close('all')
fig, ax = plt.subplots()
ax.plot(x_values, y_values)
ax.plot(x_values[0:N-1], scaled_bin_count,'o')
for i,x in enumerate(x_values[0:N-1]):
ax.text(x,scaled_bin_count[i],bin_count[i])
ax.set(xlabel='x', ylabel='f(x)',title=f'Guassian Distribution on [{L0},{L1}]; {N-2} bins, {selections} selections')
ax.grid()
fig.savefig("gaussian_distribution.png")
EDIT: Ah well I seem to have found the problem. The problem occurs at the step $ q \leq P_0 $. It appears that this needs to be $ |q| \leq |P_0| $.

![Gaussian distribution on [0 4]](https://i.stack.imgur.com/Z8xub.png)
![Gaussian Distribution on [-2,2]](https://i.stack.imgur.com/c2LrO.png)
The problem occurs at the step $ q \leq P_0 $. This needs to be $ |q| \leq |P_0| $.