I have this histogram of data...what is the most proper way to prepare it for consumption in a neural network? I know how to normalize/standardize other types of data, but I'm wondering what to do with this kind of distribution.
2026-03-26 12:05:05.1774526705
How to normalize this exponentially distributed data?
11.6k Views Asked by Bumbble Comm https://math.techqa.club/user/bumbble-comm/detail At
There are 2 best solutions below
This might be a somewhat naive method, but it will certainly work. We will use the Probability Integral Transformation.
If $X$ is a continuous random variable with cdf $F_X(x)$, then the random variable $U = F_X(X) \sim \mathrm{Uniform}(0,1)$. Similarly, $X = F_X^{-1}(U) \sim F_X$.
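Step 1, convert your data to Uniform.
Use the maximum likelihood estimator of the exponential mean (scale) parameter, $\lambda = \bar{X}$, and apply the exponential cdf to each observation:
\begin{equation*} u_i = 1 - e^{-x_i/\lambda} \end{equation*}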
Step 2, convert the Uniform data to Standard Normal data. Let $\Phi(z)$ be the standard normal CDF.
\begin{equation*} z_i = \Phi^{-1}(u_i) \end{equation*}
It works like so...
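The two steps can be sketched in Python using only the standard library. This is a minimal sketch, not a definitive implementation: the function name `exponential_to_normal`, the clipping constant `eps`, and the synthetic sample (mean 50) are assumptions made for illustration, and the exponential cdf is taken as $1 - e^{-x/\lambda}$ with $\lambda$ set to the sample mean.

```python
import math
import random
from statistics import NormalDist, fmean, pstdev


def exponential_to_normal(xs, eps=1e-12):
    """Map roughly exponential data to roughly standard normal scores."""
    lam = fmean(xs)  # MLE of the exponential mean (scale) parameter
    std_normal = NormalDist()  # standard normal: mu=0, sigma=1
    zs = []
    for x in xs:
        u = 1.0 - math.exp(-x / lam)  # Step 1: exponential cdf, u in [0, 1]
        # Assumption: clip u away from 0 and 1, where the inverse
        # normal cdf is undefined.
        u = min(max(u, eps), 1.0 - eps)
        zs.append(std_normal.inv_cdf(u))  # Step 2: z = Phi^{-1}(u)
    return zs


# Synthetic data for illustration only (not the asker's data).
random.seed(0)
sample = [random.expovariate(1 / 50.0) for _ in range(10_000)]
z = exponential_to_normal(sample)
print(round(fmean(z), 2), round(pstdev(z), 2))
```

If the data really are close to exponential, the printed mean and standard deviation should be near 0 and 1. When applying this to new inputs at prediction time, reuse the $\lambda$ estimated on the training data rather than re-estimating it.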

When normalising inputs to a Neural net, you want the numbers to be in a similar range across different inputs, so inputs which tend to have much larger absolute values don't dwarf the contributions from smaller ones. You need to preserve the Y values (frequencies) but can change the X scale with various transforms (potentially non-linear ones).
Here you want to "squash" the distribution along the x axis, so the larger X values are "squashed" more, and so the range of X is of the same order of magnitude as the other inputs, without destroying the frequency information.
Taking a log (base 2 or 10) of X is the obvious way to do that. If you use log10, your data will range from 0 to about 2.4. You can get away with this in your distribution because the lowest value is 1. It's trickier if your minimum value is zero or close to zero, since the log is undefined at zero and takes large negative values near it.
You may then want to normalise further by subtracting the mean and dividing by the standard deviation, so that the mean is 0 and the variance is 1. This is the most common "standard" NN normalisation technique.
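As a rough sketch of the log-then-standardise approach, again using only the Python standard library. The helper name `log_standardize` and the `values` list are made up for illustration; the values were chosen only so that they span 1 to 250, which gives a log10 range of 0 to about 2.4.

```python
import math
from statistics import fmean, pstdev


def log_standardize(xs, base=10.0):
    """Log-transform strictly positive values, then scale to mean 0, variance 1."""
    if min(xs) <= 0:
        raise ValueError("log transform needs strictly positive values")
    logged = [math.log(x, base) for x in xs]  # squash the long right tail
    mu = fmean(logged)
    sigma = pstdev(logged)  # population standard deviation of the logged data
    return [(v - mu) / sigma for v in logged]


# Hypothetical values, not the asker's data.
values = [1, 2, 3, 5, 8, 20, 60, 250]
scaled = log_standardize(values)
print([round(v, 2) for v in scaled])
```

For data that can contain zeros, one common workaround is to use `math.log1p(x)`, i.e. $\log(1 + x)$, in place of the plain log. As with any fitted scaling, `mu` and `sigma` would normally be computed on the training set and reused for validation and test inputs.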