Why do we like sticking random variables into their own distributions?

Question


113 Views Asked by Bumbble Comm At 06 Apr 2026 - 1:20

Let $X$ be a random variable taking values in the set $S$, with distribution (probability mass or density function) $f(s)$. Often in statistics, we are interested in the real-valued random variable $f(X)$. Here are some examples:

- The value ${\bf E}(\log \frac{1}{f(X)})$ is called the entropy and is very important in information theory.
- If we have a family of distributions $f(s,\theta)$, where $\theta \in \Theta$ is a parameter, then ${\bf E}[(d_{\theta} \log f(X,\theta))^2]$ is called the Fisher information metric, which shows up in the Cramér-Rao bound and other places.
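To make the two examples above concrete, here is a minimal sketch for a Bernoulli family, which is my own choice of illustration and not part of the original question. The helper names (`bernoulli_pmf`, `expectation`, `entropy`, `fisher_information`) are hypothetical, and the derivative in the score is approximated by a central finite difference. Both quantities are computed literally as expectations of a function of $f(X)$.

```python
import math

def bernoulli_pmf(s, theta):
    """f(s, theta) for s in {0, 1} (assumed example family)."""
    return theta if s == 1 else 1.0 - theta

def expectation(g, theta):
    """E[g(X)] for X ~ Bernoulli(theta), summing over the support {0, 1}."""
    return sum(bernoulli_pmf(s, theta) * g(s) for s in (0, 1))

def entropy(theta):
    # E[log(1 / f(X))]: plug X into its own distribution, then average.
    return expectation(lambda s: math.log(1.0 / bernoulli_pmf(s, theta)), theta)

def fisher_information(theta, h=1e-6):
    # E[(d/dtheta log f(X, theta))^2], with the derivative approximated
    # by a central finite difference of step h.
    def score(s):
        up = math.log(bernoulli_pmf(s, theta + h))
        down = math.log(bernoulli_pmf(s, theta - h))
        return (up - down) / (2 * h)
    return expectation(lambda s: score(s) ** 2, theta)
```

For this family the closed forms are $-\theta\log\theta - (1-\theta)\log(1-\theta)$ and $1/(\theta(1-\theta))$, so the sketch can be checked against them.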

These examples are enough to convince me that $f(X)$ is extremely fundamental, but without foresight, $f(X)$ seems like a strange object to consider.

Is there any way to motivate $f(X)$, where $f$ is the distribution of $X$, without saying things like "noisy channel coding theorem" or "Cramér-Rao bound"?


**Bumbble Comm** · Answer 1 · 2016-04-20 23:05:12

OK, based on the links in Bey's comments, it seems that there is not an obvious answer to this question.

I have a few ideas which I want to record.

First, let's consider the case of a random variable $X \in S$ where $S$ is a finite set. Let $f : S \to \mathbb{R}$ be the distribution. If we write $S = \{ s_1,\dots,s_n \}$, then we can write $f = (p_1,\dots,p_n)$, where $p_i = f(s_i)$. The distribution of $f(X)$ is $$ \sum_{i=1}^{n} p_i \delta_{p_i}, $$ since $f(X)$ takes the value $p_i$ exactly when $X = s_i$, which happens with probability $p_i$. Notice that this distribution does not depend on how we label the elements of $S$, but it remembers all the probabilities. You can think of $X \rightsquigarrow f(X)$ as forgetting the way we "coordinatized" $S$, but remembering which probabilities occurred.
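The finite case can be sketched in a few lines of Python. The function name `law_of_f_of_X` and the two example distributions are hypothetical, chosen only for illustration. Exact fractions are used so that equal probabilities merge into a single atom without floating-point noise.

```python
from collections import defaultdict
from fractions import Fraction

def law_of_f_of_X(pmf):
    """Given a pmf as {label: probability}, return {value: P(f(X) = value)}.

    Each outcome s_i contributes mass p_i to the atom located at p_i,
    which is the measure sum_i p_i * delta_{p_i}.
    """
    law = defaultdict(Fraction)
    for _label, p in pmf.items():
        law[p] += p  # X = s_i has probability p_i, and then f(X) = p_i
    return dict(law)

# Two distributions that differ only in how the outcomes are labeled.
pmf = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
relabeled = {"x": Fraction(1, 4), "y": Fraction(1, 2), "z": Fraction(1, 4)}

# The labels are forgotten, so both give the same law for f(X).
same_law = law_of_f_of_X(pmf) == law_of_f_of_X(relabeled)
```

Here $f(X)$ equals $1/2$ with probability $1/2$ and equals $1/4$ with probability $1/2$, because the two outcomes of probability $1/4$ land on the same atom.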

For continuous random variables things are more subtle, but you can still say the following. Let $M^d$ be a manifold and $X \in M$ a continuous random variable. The distribution of $X$ is some $d$-form $\eta \in \Gamma(\wedge^d T^*M)$, and $\eta(X)$ is a random variable taking values in the line bundle $\wedge^d T^*M$. The density function of $\eta(X)$ is the $d$-form $\eta$ supported on the subset $\eta(M) \subseteq \wedge^d T^* M$. Said another way, the distribution of $\eta(X)$ is the pushforward of $\eta$ along itself.
