How can the CDF always be a uniform distribution if some values are more likely than others?


I can prove symbolically that, given a random variable $X$ with a strictly increasing continuous CDF $F$, the variable $Y = F(X)$ has a Uniform(0,1) distribution:

$$P(Y \leq y) = P(F(X) \leq y) = P(X \leq F^{-1}(y)) = F(F^{-1}(y)) = y$$
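The result is also easy to confirm by simulation; here is a minimal sketch in Python, taking $X \sim \mathsf{Exp}(1)$ (so $F(x) = 1 - e^{-x}$) as a concrete example:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=100_000)   # X ~ Exp(1)
y = 1.0 - np.exp(-x)                           # Y = F(X), with F(x) = 1 - e^{-x}

# Uniform(0,1) has mean 1/2 and variance 1/12
print(y.mean(), y.var())

# Each decile of (0,1) should catch about 10% of the samples
counts, _ = np.histogram(y, bins=10, range=(0.0, 1.0))
print(counts / len(y))
```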

But I can't reconcile this with my intuition from looking at the plot of a PDF:

[Figure: plot of a PDF with low density near $x=a$ and higher density near $x=b$]

This PDF tells us that values around $x=a$ are less common than values around $x=b$. Values around $x=b$ also correspond to higher values of the CDF. If we're more likely to plug values near $x=b$ into $F$ than values near $x=a$, shouldn't $Y = F(X)$ fail to be uniform? Shouldn't $Y$ take on values near $F(b)$ more often than values near $F(a)$?

There are 3 answers below.

From the start, I think you may be mixing up the CDF with the inverse CDF (also called the quantile function).

Suppose $X \sim \mathsf{Beta}(2,1),$ so that the density function is $f_X(x) = 2x,$ for $0 < x < 1$ (and $0$ otherwise). Then the CDF is $F_X(x) = x^2,$ for $0 < x < 1.$

Now let $U = F_X(X) = X^2$; then $U \sim \mathsf{Unif}(0,1)$, and conversely $X = F_X^{-1}(U) = \sqrt{U}$. This result is widely used in simulation to sample realizations of $X$: if one generates a random sample from $\mathsf{Unif}(0,1)$ and takes square roots of the values, one has a random sample from $\mathsf{Beta}(2,1)$.

Here are histograms of 20,000 realizations of $U \sim \mathsf{Unif}(0,1)$ and of the corresponding $X = \sqrt{U} \sim \mathsf{Beta}(2,1)$. Bars of the same color in the two histograms contain the same simulated values, so each histogram bar represents about 2,000 values.

[Figure: side-by-side histograms of $U \sim \mathsf{Unif}(0,1)$ and $X \sim \mathsf{Beta}(2,1)$, with matching rainbow-colored bars]

u = runif(20000);  x = sqrt(u)   # U ~ Unif(0,1);  X = sqrt(U) ~ Beta(2,1)
par(mfrow=c(1,2))
cutp.u = seq(0, 1, by=.1)        # equal-width bins for U
hist(u, prob=T, ylim=c(0,2), br=cutp.u, col=rainbow(12), main="UNIF(0,1)")
curve(dunif(x), add=T, n=10001, lwd=2)
cutp.x = sqrt(cutp.u)            # the same bins mapped through sqrt, for X
hist(x, prob=T, ylim=c(0,2), br=cutp.x, col=rainbow(12), main="BETA(2,1)")
curve(dbeta(x,2,1), add=T, n=10001, lwd=2)
par(mfrow=c(1,1))

You're forgetting the chain rule. If $X = g(Y)$ with $g$ differentiable and monotone, then $Y$ has pdf $f(g(y))\,|g'(y)|$, whose graph differs in shape from that of $f$ because of the $g'$ factor. When $g = F^{-1}$, we have $g'(y) = 1/f(F^{-1}(y))$, so this factor exactly cancels the $f$-dependence, leaving the constant density $1$ on $(0,1)$.
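The cancellation can be checked symbolically; a small SymPy sketch, using the $\mathsf{Beta}(2,1)$ example from the first answer (so $f(x) = 2x$, $F(x) = x^2$, and $g = F^{-1}(y) = \sqrt{y}$):

```python
import sympy as sp

y = sp.symbols('y', positive=True)
f = lambda t: 2*t                         # Beta(2,1) pdf on (0,1)
g = sp.sqrt(y)                            # g = F^{-1}, since F(x) = x^2
density = f(g) * sp.Abs(sp.diff(g, y))    # pdf of Y = F(X) by the chain rule
print(sp.simplify(density))               # the f-dependence cancels
```

The printed density is the constant $1$: the factor $2\sqrt{y}$ from $f$ is cancelled by $|g'(y)| = 1/(2\sqrt{y})$.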


This might help with the intuition ...

[Figure: the action of $F$ on $X$, mapping probability from the $x$-axis to the $y$-axis]

Consider $Y=F(X)$ where $F$ is the CDF of $X$, i.e. $F(x)=P(X\le x)$. The action of $F$ on $X$ can be seen as moving the probability distributed on the $x$-axis and re-distributing it on the $y$-axis. Notice that as it does so, if $A$ and $B$ are equal-length $x$-intervals with $P(X\in A)<P(X\in B)$, then $A$ gets mapped into a shorter interval than does $B$ -- i.e., the re-distribution spreads out the probability on the $y$-axis more just where it's more concentrated on the $x$-axis. An analytical proof shows that in fact this effect results in a perfectly uniform re-distribution of the probability.
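A small numerical sketch of this effect, again taking the $\mathsf{Beta}(2,1)$ CDF $F(x) = x^2$ as an assumed example: two equal-length $x$-intervals, one low-probability and one high-probability, are mapped by $F$ to $y$-intervals whose lengths equal exactly the probability they carry.

```python
F = lambda x: x**2   # CDF of Beta(2,1)

# Two x-intervals of equal length 0.2
for name, (lo, hi) in [("A (low density)", (0.1, 0.3)),
                       ("B (high density)", (0.7, 0.9))]:
    mass = F(hi) - F(lo)   # P(X in [lo, hi]) -- also the length of F([lo, hi])
    print(f"{name}: probability {mass:.2f}, "
          f"image [{F(lo):.2f}, {F(hi):.2f}] of length {mass:.2f}")
```

Interval $A$ carries probability 0.08 and lands in a $y$-interval of length 0.08; interval $B$ carries 0.32 and lands in one of length 0.32. Mass per unit length on the $y$-axis is therefore the same everywhere, which is exactly uniformity.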