Histogram normalization

I need to generate random numbers from a Gaussian distribution and draw a normalized ("equalized") histogram of them. I generated the numbers in Matlab using the Box-Muller transform. Since I wasn't sure how to normalize the histogram, I searched the internet and found that the right way to do it is (in Matlab code):

Vnorm = V / (max(V) * sqrt(2*pi));

where V is a row vector of random numbers generated from a Gaussian distribution. Why is it divided by max(V)*sqrt(2*pi)? Is that the area covered by the (non-normalized) histogram?

The complete code is at this link (I don't know how to format it properly here, so I put it on Google Drive).

Best answer

The Box–Muller transform is a numerical method to generate standard-normally distributed random numbers, i.e. realizations of $N(0,1)$.
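For reference, the basic form of the transform can be sketched in a few lines. This is only a minimal illustration, not the asker's code; the variable names (N, u1, u2, r, x, y) are chosen here for the example.

```matlab
% Minimal Box-Muller sketch (all variable names are illustrative).
N  = 1e5;                  % number of samples to generate
u1 = rand(1, N);           % uniform random numbers on (0,1)
u2 = rand(1, N);           % independent uniform random numbers on (0,1)
r  = sqrt(-2 * log(u1));   % radius from the first uniform variable
x  = r .* cos(2*pi*u2);    % one set of N(0,1) samples
y  = r .* sin(2*pi*u2);    % a second, independent set of N(0,1) samples
```

A quick sanity check is that mean(x) is close to 0 and std(x) is close to 1.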

The density of the standard normal distribution is $$ f(x) = \frac{1}{\sqrt{2\pi}} e^{ -\frac{x^2}{2} }. $$ This density reaches its maximal value of $1 / \sqrt{2 \pi}$ at $x = 0$.

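So it appears that whoever gave you that rule intends to rescale the histogram so that its empirical maximal value max(V) is mapped to the maximal density 1 / sqrt(2*pi). This reading treats V as the vector of histogram bin counts rather than the raw random numbers.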
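A small sketch of what that rule does in practice, under some assumptions: randn stands in for the Box-Muller output, the older hist function is used for binning, and 50 bins is an arbitrary choice.

```matlab
% Sketch of the max-based rule from the question.
x = randn(1, 1e5);                  % assumed stand-in for the Box-Muller samples
[V, centers] = hist(x, 50);         % V: bin counts, centers: bin centers
dV = centers(2) - centers(1);       % bin width (hist uses equal-width bins)
Vnorm = V / (max(V) * sqrt(2*pi));  % rule from the question
peak = max(Vnorm);                  % equals 1/sqrt(2*pi) by construction
area = sum(Vnorm) * dV;             % nothing in this rule forces the area to be 1
fprintf('peak = %.4f, area = %.4f\n', peak, area);
```

The tallest bar is pinned to the peak of the standard normal density, while the total area only comes out near 1 if the data really are standard normal.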


The difference between the density and the histogram is that the former is a continuous function whose integral over the whole range of values is 1 (it is normalized), while the latter is a sequence of counts. The above procedure tries to turn the histogram into a density estimate by adjusting it based on the empirical maximum. That procedure assumes that the correct distribution is known, and even if that's true it is not necessarily optimal (a maximum is sensitive to outliers).
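To make the dependence on the assumed distribution concrete, here is a rough calculation. The assumptions are added for illustration only: $n$ samples drawn from $N(0, \sigma^2)$, narrow bins of width $\Delta V$, and bin counts $V_i$. The tallest bin then holds about $n \, \Delta V / (\sigma \sqrt{2\pi})$ samples, so the area under the max-scaled histogram is approximately

$$ \sum_i \frac{V_i}{\max_j V_j \, \sqrt{2\pi}} \, \Delta V = \frac{n \, \Delta V}{\sqrt{2\pi} \, \max_j V_j} \approx \frac{n \, \Delta V}{\sqrt{2\pi}} \cdot \frac{\sigma \sqrt{2\pi}}{n \, \Delta V} = \sigma. $$

So the max-scaled histogram integrates to roughly 1 only when $\sigma = 1$, i.e. when the data really are standard normal.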

A better procedure is to turn the histogram into a density estimate in a distribution-independent way, by numerically replicating the normalization process:

Vnorm = V / (sum(V) * dV)

where V holds the histogram bin counts and dV is the histogram bin width. Here sum(V) * dV is the discrete analogue of the integral, and the histogram is normalized by scaling it such that the "integral" becomes 1.
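A minimal sketch of this normalization, again with assumptions that are not from the question: randn stands in for the Box-Muller samples, the older hist function does the binning, and 50 bins is an arbitrary choice.

```matlab
% Sketch of the area-based normalization.
x = randn(1, 1e5);                 % assumed stand-in for the Box-Muller samples
[V, centers] = hist(x, 50);        % V: bin counts, centers: bin centers
dV = centers(2) - centers(1);      % bin width (hist uses equal-width bins)
Vnorm = V / (sum(V) * dV);         % scale so the discrete "integral" is 1
area = sum(Vnorm) * dV;            % equals 1 up to rounding
bar(centers, Vnorm, 1); hold on    % normalized histogram
plot(centers, exp(-centers.^2 / 2) / sqrt(2*pi), 'r');  % N(0,1) density for comparison
hold off
```

In recent MATLAB releases, histogram(x, 'Normalization', 'pdf') should give the same scaling without the manual step.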