"Self-referential" probability mass functions


I am currently self-studying information theory from "Quantum Information Theory" by Mark M. Wilde. He uses a kind of notation that I don't understand at all. I will explain the problem using quotations from the book:

Let $p_X(x)$ be the probability mass function associated with random variable $X$, so that the probability of realization $x$ is $p_X(x)$...

So far, so good. The random variable $X$ can produce different numbers, and if it produces (say) $0.5$ with probability $0.25$, then $p_X(0.5)=0.25$.

I start to fall to pieces a little later, when the author begins to write things like $p_X(X)$. I'm not sure how to read this notation. It seems to be saying that the random variable takes itself as an input? The author goes on to write:

There is nothing wrong mathematically here with having a random variable $X$ as the argument to the density function $p_X$, though this expression may seem self-referential at first.

And later, when introducing the same notation again:

It may seem strange at first glance that $X$, the argument of the probability mass function $p_X$ is itself a random variable, but this type of expression is perfectly well-defined mathematically.

But how is it defined? That's my question. I'm not looking for a rigorous answer, but a wordy explanation of how I can read/interpret such an expression, and maybe a simple example, would be truly appreciated.


There are 3 best solutions below

0
On BEST ANSWER

The important thing to note is that $p_X$ depends on $X$ only through its distribution. So $p_X$ is just a function, and as with any* function, like $f(x) = x^2 +3$, say, we can make sense of a function of a random variable, so that $f(X) = X^2 + 3$ is also a random variable, and so is $p_X(X)$.

Your confusion might arise because the symbol $X$ appears in the notation twice. It may help to introduce another random variable $\widetilde{X}$ with the same distribution as $X$. Then this new random variable's probability mass function, $p_{\widetilde{X}}(x)$, must be the same function as $p_X(x)$, since $X$ and $\widetilde{X}$ have the same distribution. Since $p_{\widetilde{X}}$ is just a function, we can make sense of $p_{\widetilde{X}}(X)$. But this is just the same as $p_X(X)$.

*modulo measure-theoretic requirements - if you haven't come across measure theory, don't worry about this.
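To make the "function of a random variable" reading concrete in the discrete case, here is the standard rule for the distribution of $f(X)$, written out as a sketch (the symbols $f$ and $y$ are introduced only for this illustration): $f(X)$ takes the value $f(x)$ whenever $X$ takes the value $x$, so

$$\mathbb{P}(f(X)=y)=\sum_{x\,:\,f(x)=y}p_X(x).$$

Choosing $f=p_X$ gives

$$\mathbb{P}(p_X(X)=y)=\sum_{x\,:\,p_X(x)=y}p_X(x),$$

so $p_X(X)$ is an ordinary random variable whose possible values are the probabilities that $p_X$ assigns.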

0
On

This is not an answer but an exploration.


The author claims that $p_X(X)$ is 'well-defined'.

Let's look at an example. Let's say I have $X\sim\text{Geom}(p)$ on $\{1,2,3,\dotsc\}$.

Then $$p_X(k) = (1-p)^{k-1}p.$$ I guess you could say this is just a function. And (recall) you could have things like $g(X) = 2X$ or $h(X) = X^2$. In other words, functions of random variables can be well-defined.

Similarly, since $p_X(k)$ is just a function, $$p_X(X) = (1-p)^{X-1}p$$ is just some function of $X$, like $g(X)$ or $h(X)$ above. The author claims that this is well-defined. And it seems to be true.
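One way to see this numerically is to draw samples of $X$ and plug each one into the mass function. The following is a minimal sketch, not anything from the book: the helper names `geom_pmf` and `sample_geom`, the seed, and the choice $p = 0.3$ are all assumptions made for illustration.

```python
import random

def geom_pmf(k, p):
    """PMF of Geom(p) on {1, 2, 3, ...}: (1 - p)^(k - 1) * p."""
    return (1 - p) ** (k - 1) * p

def sample_geom(p, rng):
    """Draw one sample: count Bernoulli(p) trials up to and including the first success."""
    k = 1
    while rng.random() >= p:  # a trial fails with probability 1 - p
        k += 1
    return k

rng = random.Random(0)  # fixed seed, chosen arbitrarily
p = 0.3                 # illustrative parameter

xs = [sample_geom(p, rng) for _ in range(5)]  # realizations of X
values = [geom_pmf(x, p) for x in xs]         # matching realizations of p_X(X)

for x, v in zip(xs, values):
    print(f"X = {x}  ->  p_X(X) = {v:.4f}")
```

Each printed value is just the mass function evaluated at whatever $X$ happened to be, which is all that $p_X(X)$ means.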

0
On

Let's take an example. $X$ is a random variable which lives on the set $\{1,2,3\}$, with the following probabilities: $$\mathbb{P}(X=1)=0.5,\quad\mathbb{P}(X=2)=0.25,\quad\mathbb{P}(X=3)=0.25$$

Then, in this case $$p_X(x)=\begin{cases} 0.5 & \text{if } x=1,\\ 0.25 & \text{if } x=2 \text{ or } x=3. \end{cases}$$

Thus we can make a new random variable $p_X(X)$ living on the set $\{0.25,0.5\}$ with the following probabilities:

$$\begin{aligned}\mathbb{P}(p_X(X)=0.5)&=\mathbb{P}(X=1)=0.5,\\ \mathbb{P}(p_X(X)=0.25)&=\mathbb{P}(X=2\text{ or }X=3)=\mathbb{P}(X=2)+\mathbb{P}(X=3)=0.25+0.25=0.5.\end{aligned}$$
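The same bookkeeping can be checked mechanically. Below is a small sketch using exact fractions; the helper name `pushforward` is made up for this illustration and is not standard notation from the book.

```python
from collections import defaultdict
from fractions import Fraction

# PMF from the example: P(X=1) = 1/2, P(X=2) = P(X=3) = 1/4
pmf = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 4)}

def pushforward(pmf, f):
    """Return the PMF of f(X), given the PMF of X as a dict {value: probability}."""
    out = defaultdict(Fraction)  # missing keys start at Fraction(0)
    for x, prob in pmf.items():
        out[f(x)] += prob        # outcomes with the same f(x) pool their mass
    return dict(out)

# p_X(X): the function applied to X is the mass function itself
dist_of_pX = pushforward(pmf, lambda x: pmf[x])

for value, prob in sorted(dist_of_pX.items()):
    print(f"P(p_X(X) = {value}) = {prob}")
```

The outcomes $x=2$ and $x=3$ both map to the value $0.25$, so their probabilities are added, exactly as in the hand computation above.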