What is the expected value of a probability density function (PDF) itself?


The expected value of a function can be found by integrating the product of the function with the probability density function (PDF). What if I want to find the expected value of the PDF itself? This is probably stupidly simple but I am lacking an insight.

Let me explain why I am asking this. In Monte Carlo integration, the expected value of the following term, $F$, gives us the integral $\int_a^b f(x)dx$:

$F = \frac{1}{N}\sum_{i=1}^{N}\frac{f(x_i)}{p(x_i)}$, where $p(x)$ is a PDF from which we are drawing the samples $x_i$. We use this to estimate the value of an otherwise difficult-to-compute integral by averaging samples drawn from that PDF.
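The estimator can be sketched in a few lines of Python (the function names and the test integrand $f(x)=x^2$ with the uniform PDF on $[0,1]$ are my own illustrative choices, not from the question):

```python
import random

def mc_integrate(f, sample_p, pdf_p, n=100_000):
    """Monte Carlo estimate F = (1/N) * sum f(x_i) / p(x_i), x_i drawn from p."""
    total = 0.0
    for _ in range(n):
        x = sample_p()            # draw x_i ~ p
        total += f(x) / pdf_p(x)  # accumulate f(x_i) / p(x_i)
    return total / n

# Illustrative example: integrate f(x) = x^2 over [0, 1] with the uniform PDF p(x) = 1.
random.seed(0)
est = mc_integrate(lambda x: x * x,
                   sample_p=lambda: random.random(),
                   pdf_p=lambda x: 1.0)
# est ≈ 1/3, the exact value of the integral
```

With the uniform PDF the weights $1/p(x_i)$ are all 1, so this reduces to plain sample averaging; the weighting only matters for non-uniform $p$.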

In the proof of this, the book I am following goes as follows:

$E[F] = E\left[\frac{1}{N}\sum_{i=1}^{N}\frac{f(x_i)}{p(x_i)}\right] \quad(1)\\ = \frac{1}{N}\sum_{i=1}^{N}E\left[\frac{f(x_i)}{p(x_i)}\right] \quad(2)\\ = \frac{1}{N}\sum_{i=1}^{N}\dfrac{\int_a^b f(x)p(x)dx}{p(x)} \quad(3)\\ = \frac{1}{N}\sum_{i=1}^{N}\int_a^b f(x)dx \quad(4)\\ = \int_a^b f(x)dx \quad(5)$

The place where I am lost is how we transition from step 2 to step 3. I am assuming that we are distributing the expected value over the numerator and the denominator of the fraction. In the numerator, we then have $E[f(x_i)]$, which can be equated to $E[f(x)]$ since $x_i$ is a random variable (we can do this, right?). And we know that $E[f(x)] = \int_a^b f(x)p(x)dx$, so I see where the numerator in step 3 comes from. But I cannot see how we obtain $E[p(x_i)] = p(x)$. That's why I asked how we can determine the expected value of a PDF.

At this point in my edit of the question, I realized that it may be incorrect to distribute the expected value over the ratio. In that case, the problem I asked about does not arise. Does the following make sense instead?

$E\left[\frac{f(x_i)}{p(x_i)}\right] = E[g(x_i)]\\ = \int_a^b g(x)p(x)dx\\ = \int_a^b \frac{f(x)}{p(x)}p(x)dx\\ = \int_a^b f(x)dx$

If this is correct, I guess I have answered my own question but it would be great to get confirmation.
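As a quick numerical sanity check of this derivation (a sketch; the choices $f(x) = 3x^2$ and $p(x) = 2x$ on $[0,1]$ are mine, picked so the exact integral equals 1):

```python
import random

# E[f(X)/p(X)] should equal ∫_0^1 f(x) dx = 1 for f(x) = 3x^2,
# even though the samples come from the non-uniform PDF p(x) = 2x.
# Inverse-CDF sampling: the CDF of p is x^2, so X = sqrt(U), U ~ Uniform(0, 1).
random.seed(1)
f = lambda x: 3.0 * x * x
p = lambda x: 2.0 * x
n = 200_000
acc = 0.0
for _ in range(n):
    x = random.random() ** 0.5  # X ~ p via inverse CDF
    acc += f(x) / p(x)
estimate = acc / n  # ≈ 1.0
```

The average converges to the integral of $f$ alone, because the $p(x)$ in the expectation's integrand cancels the $p(x)$ in the denominator — exactly the cancellation in the derivation above.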


There are 4 best solutions below


Random events have two characteristics: a value and a probability. It's clear in the discrete case; a fair die has a 1/6 probability of rolling each of one through six. You can find the expected value of one roll: it's $\frac{1+2+3+4+5+6}{6} = 3.5$. But you can't find the expected value of the probabilities, because that's just not a meaningful question.

The same is true for continuous random events. "the function" is the value of the event, and the PDF is the probability. So you can find the expected value of the event, with the understanding that its values all have probability given by the PDF.

To expand a little bit, you can think of the pdf as representing values instead, but then you would need to specify a probability function for those values to occur, so you would need a different PDF.


A random variable $X$ may have an expectation, which - if $X$ has a PDF $f$ - can be found as: $$\mathbb EX=\int xf(x)dx$$

Further, for a suitable function $g:\mathbb R\to\mathbb R$, we can find the expectation of $g(X)$ by: $$\mathbb Eg(X)=\int g(x)f(x)dx\tag1$$ You want to find the "expectation of the PDF". However, for that to make sense the PDF would itself have to be a random variable, so the question as stated is not well-defined.

It is a fact, though, that $f(X)$ is a random variable. Applying $(1)$, its expectation can be found as: $$\int f(x)f(x)dx=\int f(x)^2dx$$ Is this what you want, maybe? Btw, I have never come across this quantity in any meaningful context.


I guess it's just the integral $\int_D p(x)^2 dx$, where $D$ is the domain of your PDF. The answer will, of course, vary from PDF to PDF. For example, in the case of the exponential distribution with rate $\lambda$, $\mathbb{E}[p(x)]=\frac{\lambda}{2}$. However, I do not know if anyone has ever found this quantity useful. The closest thing I can think of is the entropy $S=\mathbb{E}[-\ln(p(x))]$.
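That exponential value can be checked by simulation (a sketch; the rate $\lambda = 2$ is an arbitrary choice of mine):

```python
import math
import random

# For the exponential PDF p(x) = λ e^{-λx} on [0, ∞),
# E[p(X)] = ∫_0^∞ p(x)^2 dx = λ/2.
random.seed(2)
lam = 2.0
pdf = lambda x: lam * math.exp(-lam * x)
n = 200_000
# Average the PDF evaluated at its own samples.
mean_pdf = sum(pdf(random.expovariate(lam)) for _ in range(n)) / n
# mean_pdf ≈ lam / 2
```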


Without touching the context of the question, I just want to give a meaningful example of "expected value of a PDF".

Let $p(x)$ be a PDF. Then by marginalization, the definition of conditional probability, and the definition of expectation, we have:

$$ p(x) = \int_y \mathrm dy\, p(x, y) = \int_y \mathrm dy\, p(x \mid y) p(y) = \mathbb E_{y \sim p(y)}[p(x \mid y)] $$

Why does it hold? For a fixed $x$, $p(x \mid y)$ is a deterministic function of $y$. This is just the same pattern as $\mathbb E [g(y)] \equiv \int_y\mathrm dy\, g(y) p(y)$, with $g(y) = p(x \mid y)$.
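A minimal simulation of this identity (the Gaussian model $y \sim \mathcal N(0,1)$, $x \mid y \sim \mathcal N(y,1)$ is my own example; marginally $x \sim \mathcal N(0,2)$, so the exact $p(x)$ is known):

```python
import math
import random

def norm_pdf(x, mu, var):
    """Density of N(mu, var) evaluated at x."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# p(x) = E_{y ~ p(y)}[p(x | y)]: average the conditional density over samples of y.
random.seed(3)
x0 = 1.0
n = 200_000
mc = sum(norm_pdf(x0, random.gauss(0.0, 1.0), 1.0) for _ in range(n)) / n
exact = norm_pdf(x0, 0.0, 2.0)  # exact marginal density at x0
# mc ≈ exact
```

So the "expected value of a PDF" here is perfectly meaningful: it is the conditional density, averaged over the conditioning variable, and it recovers the marginal density.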

You may find similar constructions in, e.g., this VAE tutorial.