Problem producing a Gaussian distribution on a negative range

40 Views Asked by At

I am attempting to produce a normal/Gaussian distribution from draws from a uniform distribution. The approach I am using (rejection method from this answer) is the following:

  • Let PDF(X) be the Probability density function of a random variable X on the range $[L_0,L_1]$.
  • Draw a random value $x_0$ from a uniform distribution on the range $[L_0,L_1]$.
  • Compute $P_0 = $ PDF($x_0$)
  • Draw a second random value $q$ from a uniform distribution on the range $[L_0,L_1]$.
  • if $q \leq P_0$, select $x_0$. Otherwise, reject $x_0$. (I answered my own question, the problem is here: this needs to be $ |q| \leq |P_0| $)

To plot the selected values, divide the range $[L_0,L_1]$ into $N$ points $x_i = i\Delta x + L_0 | i \in [0,N-1]$. Correspond with each point except the last one (i.e., not $i = N-1$) a bin $b_i | i \in [0,N-2] $ with count $c_i = 0$.

Let $x$ be in $b_i$ if $i\Delta x \leq x - L_0 < (i+1) \Delta x$, where $\Delta x = (L_1-L_0)/(N-1)$. If a given $x_0$ is selected, increase the count $c_i$ of its bin $b_i$ by 1 ($ count(b_i(x_0)) \equiv c_i(x_0) = c_i + 1$).

Make some number of selections.

Then, scale the resultant bin counts by the total area of the bins,

  • Scaled bin counts: $ s_i = c_i/A$
  • Histogram Area: $ A = \Delta x \sum_{i=0}^{N-2} c_i$

and plot the scaled bins $s_i$, where $x(s_i) = i\Delta x + L_0$.

This normalizes the histogram, such that total area of the scaled bins is 1 (I think? At least that's my impression of what's supposed to be happening), or tends to 1 as the number of bins goes to infinity and their width $\Delta x$ goes to 0.

I wrote this algorithm into python, and tested it on the exponential distribution, for which it seems to work:

  • PDF(X) $ = f(x;\lambda) = \lambda e^{-\lambda x}$ (Exponential Distribution)

exponential distribution

And the Gaussian distribution,

  • PDF(X) $ = f(x; \mu,\sigma) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$ (Gaussian Distribution)

for which it seems to mostly work on positive ranges ($0 \leq L_0$):

Gaussian distribution on [0 4]

But it fails for Gaussian distributions on negative ranges ($L_0 < 0 $). I have no idea why.

Gaussian Distribution on [-2,2]

Can anyone help me understand what I'm doing wrong, or missing?

Here is the python code:

import matplotlib
import numpy as np 
from numpy import e
from numpy import pi
import matplotlib.pyplot as plt
import random
import sys 

def map_range(x,x0,x1,y0,y1):
    return (x-x0)*((y1 - y0)/(x1 - x0)) + y0

def gaussian_PDF(x,mu,sigma):
    phi = lambda z: (1/np.sqrt(2*pi))*e**((z**2)/(-2)) # Standard/Unit normal distribution 
    return (1/sigma)*phi((x-mu)/sigma) # Scaled normal distribution 

N = 24

L0 = -2
L1 = 2

L = L1 - L0 

Dx = (L1-L0)/(N-1)

# roll = lambda: random.uniform(L0,L1)
roll = lambda: map_range(random.random(),0,1,L0,L1)

x_values = [i*Dx+L0 for i in range(N)]

mu = 0
sigma = 1 

y_values = [gaussian_PDF(x,mu,sigma) for x in x_values]

bin_count = [0 for _i in range(N-1)]

bin_index = lambda x: [i for i in range(N-1) if i*Dx <= x - L0 < (i+1)*Dx][0]


selections = 100000

for i in range(selections): 
    x0 = roll()
    
    P_x0 = gaussian_PDF(x0,mu,sigma)

    q = roll()

    if(q <= P_x0):
        bin_count[bin_index(x0)] += 1


areas = [count*Dx for count in bin_count]
hist_area = sum([count*Dx for count in bin_count])

scaled_bin_count = [count/hist_area for count in bin_count]

matplotlib.pyplot.close('all')

fig, ax = plt.subplots()

ax.plot(x_values, y_values)

ax.plot(x_values[0:N-1], scaled_bin_count,'o')

for i,x in enumerate(x_values[0:N-1]):
    ax.text(x,scaled_bin_count[i],bin_count[i])



ax.set(xlabel='x', ylabel='f(x)',title=f'Guassian Distribution on [{L0},{L1}]; {N-2} bins, {selections} selections')
ax.grid()
fig.savefig("gaussian_distribution.png")

EDIT: Ah well I seem to have found the problem. The problem occurs at the step $ q \leq P_0 $. It appears that this needs to be $ |q| \leq |P_0| $.

1

There are 1 best solutions below

0
On

The problem occurs at the step $ q \leq P_0 $. This needs to be $ |q| \leq |P_0| $.