How to Make a PDF 'Look' Uniform?


Let $X$ be a normally-distributed random variable with mean zero and variance $\sigma^2$: $X \sim N(0,\sigma^2)$. Let $Y$ be a mapping from $X$ onto the interval $(0,1)$ using the sigmoid function: $Y=\text{sig}(X)=\frac{1}{1+e^{-X}}$.

It can be shown that $Y$ has the following probability density function:

$$f_Y(y) = \frac{1}{y(1-y)} \frac{1}{\sqrt{2 \pi \sigma^2}} \exp\left(-\frac{\left(\ln\frac{y}{1-y}\right)^2}{2\sigma^2}\right), \qquad 0<y<1.$$

My question is: for what value of $\sigma$ is $f_Y(y)$ the flattest (i.e., the most similar to a uniform distribution over $(0,1)$)?

Is a closed-form solution to this problem possible?

PS: I am leaving the definition of "the flattest" open, because I am OK with any one that makes sense. For instance, a simple one could be the one that minimizes the integral of $\left[f_Y(y)-1\right]^2$ over $(0,1)$.
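A minimal sketch of that example criterion, assuming the squared-error definition above is the one adopted. Substituting $x=\ln\frac{y}{1-y}$ turns $\int_0^1 f_Y^2\,dy$ into an integral over the real line, and working it through appears to give $\int_0^1 [f_Y(y)-1]^2\,dy = \frac{1+e^{\sigma^2/4}}{\sigma\sqrt{\pi}}-1$. The code evaluates the integral numerically and only uses that expression as a cross-check. The names `l2_flatness`, `closed_form`, and `golden_section_min` are illustrative and not from any library.

```python
import math

def l2_flatness(sigma, n=4000, span=10.0):
    """Midpoint-rule estimate of the integral of (f_Y(y) - 1)^2 over (0, 1).

    With x = ln(y / (1 - y)), the integral of f_Y^2 becomes the integral of
    phi_sigma(x)^2 * (2 + e^x + e^-x) over the real line, and the integral
    of (f_Y - 1)^2 equals that value minus 1.
    """
    half_width = span * sigma          # integrand is negligible beyond this
    h = 2.0 * half_width / n
    total = 0.0
    for k in range(n):
        x = -half_width + (k + 0.5) * h
        phi = math.exp(-x * x / (2.0 * sigma**2)) / math.sqrt(2.0 * math.pi * sigma**2)
        total += phi * phi * (2.0 + math.exp(x) + math.exp(-x))
    return total * h - 1.0

def closed_form(sigma):
    """Candidate closed form for the same quantity (used only as a check)."""
    return (1.0 + math.exp(sigma**2 / 4.0)) / (sigma * math.sqrt(math.pi)) - 1.0

def golden_section_min(f, lo, hi, tol=1e-6):
    """Minimise a unimodal function f on [lo, hi] by golden-section search."""
    r = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - r * (b - a)
    d = a + r * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:                    # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - r * (b - a)
            fc = f(c)
        else:                          # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + r * (b - a)
            fd = f(d)
    return 0.5 * (a + b)

sigma_best = golden_section_min(l2_flatness, 0.5, 4.0)
print(sigma_best, l2_flatness(sigma_best), closed_form(sigma_best))
```

Under this particular criterion the search settles near $\sigma\approx 1.72$; other definitions of flatness will generally give other values.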


There are 3 best solutions below

5
On BEST ANSWER

Consider the Taylor series expansion for $f(x) = \frac{1}{1+e^{-x}}$:

$$f(x) \approx \frac12 + \frac{x}{4} - \frac{x^3}{48} + \frac{x^5}{480} + \cdots.$$

We see that aside from the constant term, only odd terms are in the expansion.
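As a quick sanity check of the expansion, the sketch below compares the sigmoid with its truncated series at a few points. The helper names `sigmoid` and `sigmoid_taylor5` are made up for this illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_taylor5(x):
    """Taylor polynomial of the sigmoid about 0, truncated after x^5."""
    return 0.5 + x / 4.0 - x**3 / 48.0 + x**5 / 480.0

for x in (0.1, 0.5, 1.0):
    # The gap grows with |x|, as expected for a truncated series.
    print(x, sigmoid(x), sigmoid_taylor5(x), sigmoid(x) - sigmoid_taylor5(x))

# Only odd powers appear beyond the constant, so sigmoid(x) - 1/2 is odd.
print((sigmoid(0.3) - 0.5) + (sigmoid(-0.3) - 0.5))
```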

Now, let's look at the Wiener-Askey polynomial chaos representation of a uniform random variable using a standard normal random variable.

Let $z \sim U(0,1)$ and $\zeta \sim \mathcal{N}(0,1)$. Gaussian random variables belong to the Gauss-Hermite polynomial chaos. Let $\Phi_i(\zeta)$ represent the $i$th Hermite polynomial.

Then, we can say that

$$z = \sum_{i=0}^\infty z_i\Phi_i(\zeta).$$

The $z_i$ coefficients are deterministic, and we can compute them using the Galerkin method:

$$z_i = \frac{\left\langle z\Phi_i(\zeta)\right\rangle}{\left\langle \Phi_i^2\right\rangle} = \frac{1}{\left\langle\Phi_i^2\right\rangle} \int_{-\infty}^\infty z\Phi_i(\zeta) w(\zeta) d\zeta$$ where $w(\zeta)$ is the weighting function that comes from orthogonality of the polynomials. Of course, for the Hermite polynomials, the weighting function is (to within a scaling of the independent variable) the normal distribution PDF!

The presence of $z$ in the integrand is problematic, but we can get rid of it by mapping both $\zeta$ and $z$ to a common uniform random variable $u$ via inverse transform sampling. I'll spare the details unless you want them, but in the end we get a series of coefficients.

As it turns out, when $i$ is even and greater than zero, the resulting integrand is an odd function, so those coefficients vanish. We're left with the odd-order coefficients plus the constant ($i=0$) term.

This constant term always represents the mean of the variable. Since the mean of $z$ is 0.5, we expect $z_0 = 0.5$.


Now, notice the Taylor expansion. The $x^0$ coefficient is 0.5, the $x^1$ coefficient is 0.25, etc.

Now, here are the coefficients for the Gauss-Hermite representation of a uniform distribution:

$$\{ 0.5,\ 0.282,\ 0,\ -0.0576,\ 0,\ 0.01934,\ \ldots\}.$$

(Note: these coefficients are for scaled Hermite polynomials such that $\langle \Phi_i\Phi_j \rangle = \delta_{ij}.$)
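The coefficients can be reproduced numerically. The sketch below assumes the uniform variable is obtained as $z=\Phi_N(\zeta)$, with $\Phi_N$ the standard normal CDF (one way to realize the inverse-transform step), and uses probabilists' Hermite polynomials scaled by $1/\sqrt{i!}$ so that they are orthonormal under the standard normal weight. The function names are illustrative only.

```python
import math

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def hermite_orthonormal(i, x):
    """Probabilists' Hermite polynomial He_i(x) divided by sqrt(i!)."""
    prev, curr = 0.0, 1.0              # He_{-1} := 0, He_0 = 1
    for n in range(i):
        # Recurrence: He_{n+1}(x) = x * He_n(x) - n * He_{n-1}(x)
        prev, curr = curr, x * curr - n * prev
    return curr / math.sqrt(math.factorial(i))

def chaos_coefficient(i, n=20000, half_width=10.0):
    """z_i = E[ Phi_N(zeta) * He_i(zeta) / sqrt(i!) ], by the midpoint rule."""
    h = 2.0 * half_width / n
    total = 0.0
    for k in range(n):
        x = -half_width + (k + 0.5) * h
        total += normal_cdf(x) * hermite_orthonormal(i, x) * normal_pdf(x)
    return total * h

coeffs = [chaos_coefficient(i) for i in range(6)]
print([round(c, 4) for c in coeffs])
```

The first coefficient is the mean, the even-order ones come out as zero to within rounding, and the odd-order ones should agree with the list above to about three decimal places.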

These are the coefficients of the Hermite polynomials rather than of powers of $\zeta$, but if you substitute the polynomials in, you will get something quite close to what shows up in the Taylor expansion.

The more terms you take in the truncated polynomial chaos expansion, the flatter you get. At about 12 terms, the result is almost perfectly uniform. So, $\sigma \sim \sqrt{2}/2$ seems like a pretty good choice!

0
On

Here is another "reasonable" approach. Since the median and the mean are correct automatically, let's match some other parameter, say, the 25th percentile (equivalently, by symmetry, the 75th). We should have $P(X>\log 3)=\frac 14$, whence $P(Z>\frac{\log 3}{\sigma})=\frac 14$, where $Z$ is the standard normal. Now, looking at the normal distribution tables, we get $\sigma\approx 1.6$. It doesn't look too bad on the graph, and I doubt that any higher precision makes much sense without an exact definition of flatness...
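The table lookup can be replaced by a short computation. This sketch uses `statistics.NormalDist` from the Python standard library; the function name `sigma_from_quantile` is just for illustration.

```python
import math
from statistics import NormalDist

def sigma_from_quantile(p):
    """Sigma for which the p-quantile of Y = sig(X) equals p, as for U(0,1).

    Requires p != 0.5: the median matches for every sigma, so it carries
    no information (and would give 0/0 here).
    """
    x_p = math.log(p / (1.0 - p))      # logit(p): the x with sig(x) = p
    z_p = NormalDist().inv_cdf(p)      # standard normal p-quantile
    return x_p / z_p

sigma_q = sigma_from_quantile(0.75)    # quartile matching, log(3) / z_0.75
print(sigma_q)
```

Matching a different percentile gives a slightly different $\sigma$, which is one way to see that no single value makes the distribution exactly uniform.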

0
On

If $X$ had a standard logistic distribution, $Y$ would be exactly uniform. Since the standard deviation of a standard logistic distribution is $\pi/\sqrt{3} \approx 1.8138$, I would have guessed that $\sigma=\pi/\sqrt{3}$ would be a good answer.

But empirically, something like $\sigma=1.6638$ seems to produce a cumulative distribution function close to that of a uniform distribution on the unit interval. Here "close" means minimizing the integral over $(0,1)$ of the square of the vertical gap between the two cumulative distribution functions.

In the following graph, the green line uses $\sigma=\pi/\sqrt{3}$ and the blue line uses $\sigma=1.6638$, while the red line is straight, i.e. the cumulative distribution function of a uniform distribution on the unit interval.

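[Graph: cumulative distribution functions of $Y$ for $\sigma=\pi/\sqrt{3}$ (green) and $\sigma=1.6638$ (blue), with the uniform CDF (red).]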
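The empirical search described in this answer can be sketched as follows, assuming the criterion is the integral over $(0,1)$ of $\left[F_Y(y)-y\right]^2$ with $F_Y(y)=\Phi_N\!\left(\frac{1}{\sigma}\ln\frac{y}{1-y}\right)$, where $\Phi_N$ is the standard normal CDF. The helper names and the search interval $[1, 2.5]$ are choices made for this illustration, and the ternary search assumes the objective has a single minimum on that interval.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def cdf_gap(sigma, n=2000):
    """Midpoint-rule estimate of the integral of (F_Y(y) - y)^2 over (0, 1)."""
    total = 0.0
    for k in range(n):
        y = (k + 0.5) / n
        f_y = _STD_NORMAL.cdf(math.log(y / (1.0 - y)) / sigma)
        total += (f_y - y) ** 2
    return total / n

def ternary_min(f, lo, hi, tol=1e-5):
    """Minimise f on [lo, hi], assuming a single minimum there."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

sigma_cdf = ternary_min(cdf_gap, 1.0, 2.5)
print(sigma_cdf, cdf_gap(sigma_cdf), cdf_gap(math.pi / math.sqrt(3.0)))
```

The printed gaps allow a direct comparison between the fitted $\sigma$ and the logistic-variance guess $\pi/\sqrt{3}$.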