Why is this true? (uniform distributions)


388 Views Asked by Bumbble Comm At 06 Apr 2026 - 6:12

If $Z=F(X)$, then $Z$ has a uniform distribution on $[0,1]$.

I understand the proof using functions, but visually it doesn't make sense.

If $X$ has a normal distribution, then its cdf $F(x)$ looks like:

If we take $Z=F(X)$, $Z$ does not look like a uniform distribution (no flat top; it's curved here). What am I missing?


Bumbble Comm On 21 Jul 2021 - 11:58

I assume that $F$ is the cumulative distribution function associated to the given random variable $X$.

In general, the claim that $Z = F\circ X$ is uniformly distributed over $[0,1]$ is false unless we impose a condition on $F$: we require that $F$ be continuous.

The following is an obvious counter-example: Let $X$ be a constant random variable $X=0$. Then $F(x) = 0$ if $x<0$ and $F(x)=1$ if $x\geq 0$. In this case $F\circ X$ can only take values 0 or 1, so $F\circ X$ cannot be uniformly distributed over $[0,1]$.
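The counter-example can be checked numerically. The following is only a sketch in R: the name F_const is a hypothetical helper for the CDF of the constant variable, and the Bernoulli(0.3) case is an extra illustration that is not part of the argument above.

```r
# Degenerate case: X is the constant 0, so F(x) = 0 for x < 0 and 1 for x >= 0.
F_const = function(x) as.numeric(x >= 0)   # hypothetical helper name
x_const = rep(0, 1000)
z_const = F_const(x_const)
unique(z_const)            # only the value 1 occurs

# Extra illustration (an assumption, not from the answer): X ~ Bernoulli(0.3).
set.seed(1)
x_bern = rbinom(1000, size = 1, prob = 0.3)
z_bern = pbinom(x_bern, size = 1, prob = 0.3)
length(unique(z_bern))     # at most two distinct values, F(0) and F(1)
```

In both cases $F\circ X$ is concentrated on a handful of points, so it cannot be uniform on $[0,1]$.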

**Bumbble Comm** · Accepted Answer

If you transform a random sample by the (continuous) CDF of its population, it becomes standard uniform.

First, I find the notation to be unfortunate and it may be confusing you. Let's say $U = F_X(X) \sim \mathsf{Unif}(0,1),$ for a continuous random variable $X.$
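For reference, here is the standard argument in this notation, sketched under the extra assumption that $F_X$ is strictly increasing as well as continuous, so that the inverse $F_X^{-1}$ exists. For $0<u<1$:

$$P(U \le u) = P\big(F_X(X) \le u\big) = P\big(X \le F_X^{-1}(u)\big) = F_X\big(F_X^{-1}(u)\big) = u.$$

So the CDF of $U$ is the straight line $F_U(u)=u$ on $(0,1)$, and its density is the constant $1$ there. The curve $F_X$ is the map applied to $X$; it is not the density of $U$.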

Second, let's investigate a couple of specific examples.

Example 1: Suppose $X \sim \mathsf{Beta}(2,2),$ with density $f_X(x) = 6x(1-x),$ for $0<x<1,$ a parabola. Also, $F_X(x) = 3x^2 - 2x^3,$ for $0 \le x \le 1.$

We can use R statistical software to sample $n=500$ observations in vector x from $\mathsf{Beta}(2,2).$ In R, the density is denoted dbeta and the CDF is denoted pbeta (each with appropriate parameters).

```r
set.seed(721)
x = rbeta(500, 2, 2)
u = pbeta(x, 2, 2)
```

The expression for u amounts to $u =3x^2 - 2x^3.$
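As a sanity check on that formula, one can compare pbeta with the polynomial directly. This is a small sketch; the name u_poly is introduced here for illustration only.

```r
set.seed(721)
x = rbeta(500, 2, 2)
u = pbeta(x, 2, 2)           # CDF transform via R's built-in Beta CDF
u_poly = 3 * x^2 - 2 * x^3   # the same transform written as a polynomial
all.equal(u, u_poly)         # agreement up to floating-point tolerance
range(u)                     # all transformed values lie inside (0, 1)
```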

A histogram (blue) provides a crude estimate of the corresponding density function (red), and an empirical CDF (ECDF) plot provides an estimate of the corresponding CDF.

An ECDF uses the actual data, while a histogram uses binned data (with some loss of information). So ECDFs are ordinarily better estimates of CDFs than histograms are of densities. The number of observations $n = 500$ is a compromise: large enough that the histograms are not too rough, yet small enough that the ECDF can (barely) be distinguished from its corresponding CDF.
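To put a rough number on how close the ECDF is to the CDF for this sample size, one can measure the largest vertical gap between them on a fine grid. This is a sketch only; the grid spacing and the names Fn, grid and max_gap are choices made here, not part of the original analysis.

```r
set.seed(721)
x = rbeta(500, 2, 2)
Fn = ecdf(x)                          # ecdf() returns a step function
grid = seq(0, 1, by = 0.001)          # evaluation grid on the support
max_gap = max(abs(Fn(grid) - pbeta(grid, 2, 2)))
max_gap                               # largest ECDF-vs-CDF gap on the grid
```

For $n = 500$ this gap is typically a few hundredths, which is why the two curves are only barely distinguishable in a plot.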

As you suspected, the 'action' takes place in the ECDF plot. Below we show how nine particular points in 'x' (out of 500) get transformed to the corresponding values in 'u'.

Example 2: Here is a similar example in which we transform 500 points 'x' from the distribution $\mathsf{Norm}(\mu=100,\sigma=15)$ to standard uniform by using the CDF of this normal distribution.

```r
set.seed(2021)
x = rnorm(500, 100, 15)
u = pnorm(x, 100, 15)
```
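One way to check numerically that 'u' behaves like a standard uniform sample is to compare its mean and variance with the theoretical values $1/2$ and $1/12,$ and to run a Kolmogorov-Smirnov test against $\mathsf{Unif}(0,1).$ This is a sketch; the K-S test is a choice made here and does not appear in the original answer.

```r
set.seed(2021)
x = rnorm(500, 100, 15)
u = pnorm(x, 100, 15)

c(mean = mean(u), var = var(u))   # compare with 1/2 and 1/12
ks = ks.test(u, "punif")          # H0: u is a sample from Unif(0,1)
ks$p.value                        # a small p-value would cast doubt on uniformity
```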

Note: In case you are interested in R code for the figures, here is the code for the first two. (The third uses minor modifications of the first.)

```r
# Figure 1: three panels for the Beta(2,2) sample x and its transform u
par(mfrow = c(1,3))
hdr1 = "BETA(2,2): Histogram and Density"
hist(x, prob=TRUE, col="skyblue2", main=hdr1)
curve(dbeta(x,2,2), add=TRUE, col="red", lwd=2)   # true density
hdr2 = "BETA(2,2), ECDF and CDF"
plot(ecdf(x), col="blue", lty="dashed", main=hdr2)
curve(pbeta(x,2,2), add=TRUE, col="red")          # true CDF
hdr3 = "UNIF(0,1): Histogram and Density"
hist(u, prob=TRUE, col="skyblue2", main=hdr3)
curve(dunif(x), add=TRUE, col="red", lwd=2)       # flat density on (0,1)
par(mfrow=c(1,1))
```

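```r
# Figure 2: trace nine order statistics of the Beta(2,2) sample x to their u values
sx = sort(x);  X = sx[seq(50,450, by=50)]   # 50th, 100th, ..., 450th smallest
U = pbeta(X, 2,2)
plot(ecdf(x), col="blue", main="ECDF of Beta Sample")
for(i in 1:9) {
  # horizontal segment at height U[i], then vertical drop to the axis at X[i]
  lines(c(-.2, X[i], X[i]), c(U[i],U[i], 0), col="green2")
}
```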

[The R procedure curve requires the argument x, regardless of what is being plotted.]
