Is there a way to bound expected value with limited information of the CDF?


Suppose I want to evaluate $E[X]$, where $X$ is a univariate random variable and takes values in $\mathcal{X}$, where the smallest element of $\mathcal{X}$ is 0 and the largest element of $\mathcal{X}$ is $\overline{X}$.

The problem is that I don't have the PDF or CDF of $X$. Instead, suppose that I know the exact value of the CDF at finitely many (but never all) values of the support. So, for example, I know $\Pr(X\leq x_1)=0.1$, $\Pr(X\leq x_2)=0.2$, ..., $\Pr(X\leq \overline{X})=1$.

Is there a way to bound $E[X]$? In other words, given this partial information what is the highest possible value of the expectation and the lowest possible value and how can I compute it?

Will the characterization of the solutions in discrete and continuous random variable cases differ a lot?

Intuitively, it seems that as the number of points over which I know the CDF increases (even if finitely many), I should have a good idea of the shape of the CDF and be able to bound the expectation. I am not sure how to formalize this intuition or whether it is correct.


There are 3 answers below.

On BEST ANSWER

For a non-negative random variable, the expectation equals the area between the CDF and $1$: $$ \mathbb E[X] = \int_0^{\overline X} (1-F_X(t))\, dt. $$ So when you know the CDF at a number of points, you can draw two nondecreasing step functions through these points, and the expected values for these CDFs bound your expectation from above and from below.

Say, for given values $F_X(x_1)=0.1$, $F_X(x_2)=0.2$, $F_X(\overline X)=1$, the lower function can be $$ F_l(t)=\begin{cases}0,& t< x_1\cr 0.1, & x_1\leq t < x_2, \cr 0.2, & x_2\leq t<\overline X\cr 1, & t\geq \overline X.\end{cases} $$ The expected value for this distribution is $$ 0.1 x_1+0.1 x_2+0.8 \overline X. $$ The upper function is $$ F_u(t)=\begin{cases}0,& t< 0\cr 0.1, & 0\leq t < x_1, \cr 0.2, & x_1\leq t<x_2,\cr 1, & t\geq x_2.\end{cases} $$ The expected value for this distribution is $$ 0.1x_1+0.8x_2. $$ So $\mathbb E[X]$ is bounded by $$ 0.1x_1+0.8x_2 \leq \mathbb E[X] \leq 0.1 x_1+0.1 x_2+0.8 \overline X. $$ The solution for absolutely continuous distributions does not differ from the discrete case, since in the continuous case you can draw continuous CDFs that are as close to these step functions as you wish.

And your intuition is right: as the number of points increases, the lower and upper CDFs squeeze together, and so do the bounds on the expectation.
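This construction is easy to compute. Here is a minimal Python sketch; the function name `expectation_bounds` and the numeric values $x_1=1$, $x_2=2$, $\overline X=10$ are illustrative choices, not from the post:

```python
# Bounds on E[X] from CDF values known at finitely many points.
# Assumes X is supported on [0, x_max] and cdf_points lists (x, F(x)) pairs
# in increasing order, ending with (x_max, 1.0).

def expectation_bounds(cdf_points):
    """Return (lower, upper) bounds on E[X] given exact CDF values."""
    lower = 0.0   # mass pushed to the left end of each interval
    upper = 0.0   # mass pushed to the right end of each interval
    prev_x, prev_F = 0.0, 0.0
    for x, F in cdf_points:
        mass = F - prev_F          # probability forced into (prev_x, x]
        lower += mass * prev_x     # place it at the interval's left end
        upper += mass * x          # place it at the interval's right end
        prev_x, prev_F = x, F
    return lower, upper

# Worked example: F(1) = 0.1, F(2) = 0.2, F(10) = 1
lo, hi = expectation_bounds([(1.0, 0.1), (2.0, 0.2), (10.0, 1.0)])
# lo = 0.1*1 + 0.8*2 = 1.7,  hi = 0.1*1 + 0.1*2 + 0.8*10 = 8.3
```

This reproduces the bounds $0.1x_1+0.8x_2 \leq \mathbb E[X] \leq 0.1 x_1+0.1 x_2+0.8 \overline X$ from the answer above.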


Hint: consider the best lower and upper bounds for a random variable satisfying the given conditions. For example, suppose you know $X \ge 0$ with $P(X \le x_1) = 0.1$, $P(X \le x_2) = 0.2$ and $P(X \le \overline{X}) = 1$. Then $L \le X \le U$, where: when $0 \le X \le x_1$ (an event of probability $0.1$), $L = 0$ and $U = x_1$; when $x_1 < X \le x_2$ (again with probability $0.1$), $L = x_1$ and $U = x_2$; and when $x_2 < X \le \overline{X}$ (with probability $0.8$), $L = x_2$ and $U = \overline{X}$. Then $\mathbb E[L] \le \mathbb E[X] \le \mathbb E[U]$, where $\mathbb E[L] = 0.1 x_1 + 0.8 x_2$ and $\mathbb E[U] = 0.1 x_1 + 0.1 x_2 + 0.8 \overline{X}$.
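The coupling $L \le X \le U$ can be checked by simulation. A sketch, assuming illustrative values $x_1=1$, $x_2=2$, $\overline X=10$ and one particular $X$ consistent with the constraints (uniform within each interval); none of these numbers come from the original post:

```python
import random

x1, x2, x_max = 1.0, 2.0, 10.0
edges = [0.0, x1, x2, x_max]

def round_to_interval(x):
    """Return (L, U): x rounded down/up to the endpoints of its interval."""
    for lo, hi in zip(edges, edges[1:]):
        if x <= hi:
            return lo, hi
    raise ValueError("x outside support")

random.seed(0)
samples = []
for _ in range(100_000):
    u = random.random()
    # Draw X uniform within each interval, with the prescribed masses
    # 0.1, 0.1, 0.8 — one admissible distribution among many.
    if u < 0.1:
        x = random.uniform(0.0, x1)
    elif u < 0.2:
        x = random.uniform(x1, x2)
    else:
        x = random.uniform(x2, x_max)
    samples.append(x)

EL = sum(round_to_interval(x)[0] for x in samples) / len(samples)
EU = sum(round_to_interval(x)[1] for x in samples) / len(samples)
EX = sum(samples) / len(samples)
# Since L <= X <= U holds sample-by-sample, EL <= EX <= EU.
```

The sample means approximate $\mathbb E[L] = 1.7$ and $\mathbb E[U] = 8.3$, with $\mathbb E[X]$ landing between them for any admissible choice of $X$.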


For your example, to maximize $E(X)$ take $P(X\lt x_1)=0,\ P(X=x_1)=0.1,\ P(X=x_2)=0.1$, etc. That is, push all the mass to the upper end of each interval.
To minimize $E(X)$, push all the mass into the lower end of each interval.
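A quick numeric check of this recipe, with illustrative values $x_1=1$, $x_2=2$, $\overline X=10$ (hypothetical numbers, chosen only to make the two extremal distributions concrete):

```python
# Extremal discrete distributions for F(x1)=0.1, F(x2)=0.2, F(x_max)=1.
x1, x2, x_max = 1.0, 2.0, 10.0

# Maximizer: each interval's mass sits at the interval's upper end.
max_dist = {x1: 0.1, x2: 0.1, x_max: 0.8}
# Minimizer: each interval's mass sits at the interval's lower end.
min_dist = {0.0: 0.1, x1: 0.1, x2: 0.8}

E_max = sum(p * x for x, p in max_dist.items())  # 0.1*1 + 0.1*2 + 0.8*10
E_min = sum(p * x for x, p in min_dist.items())  # 0.1*0 + 0.1*1 + 0.8*2
```

Both distributions match the given CDF values exactly, and $E_{\min}$ and $E_{\max}$ agree with the bounds derived in the accepted answer.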