How to partition a probability distribution function to find expected value?


Given a probability density function $p(x)$, its cumulative distribution function $F(x)$, and a function $y(x)$ that is computationally expensive to evaluate, I would like to find the expected value of $y$. To finish within a reasonable amount of time, I can only evaluate $y(x)$ at $n$ points. How should I choose "key points" $x_1, x_2, \dots, x_{n}$ so that $\sum_i y(x_i)\,(F(x_i)-F(x_{i-1}))$ best approximates $\int y(x)\,p(x)\,\mathrm{d}x$? I imagine I would have to place more points where $p(x)$ is large.
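Written out, the proposed sum is just a Riemann–Stieltjes sum against $F$. A minimal sketch, assuming a standard normal $p$ and a placeholder sinusoid for $y$ (the constants $3$, $0.5$, and $2$ are made up for illustration, and the even spacing is only a baseline, not the sought-after optimal partition):

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def weighted_sum(y, xs, F):
    """sum_i y(x_i) * (F(x_i) - F(x_{i-1})), anchored at F = 0 to the
    left of the first point (i.e. x_0 is taken at -infinity)."""
    total, prev = 0.0, 0.0
    for x in xs:
        Fx = F(x)
        total += y(x) * (Fx - prev)
        prev = Fx
    return total

# Placeholder integrand (not from the question): y(x) = sin(3x + 0.5) + 2.
# For X ~ N(0,1), E[sin(kX + phi)] = sin(phi) * exp(-k^2 / 2), so the
# exact expectation here is 2 + sin(0.5) * exp(-4.5).
y = lambda x: math.sin(3.0 * x + 0.5) + 2.0
n = 2000
xs = [-8.0 + 16.0 * i / n for i in range(1, n + 1)]
approx = weighted_sum(y, xs, normal_cdf)
```

With many evenly spaced points this converges, but slowly; the question is how to spend a small budget of $n$ evaluations more wisely.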

A general solution would be great, but if that's not possible we could assume $p(x)$ is a Gaussian function and $y(x)$ is the sum of a series of sinusoidal functions.


Update:

If it helps, we can also make the following assumptions:

  1. $-1\leq x\leq 1$, and $p(x)$ is a truncated Gaussian with $\mu=0,\sigma=1/4$
  2. $y(x)=Y_1 \sin(k x+\phi)+Y_0$
  3. The Riemann sum I proposed above is to be replaced with a better discrete integration method $\mathrm{int}((x_1,y_1),(x_2,y_2),\cdots(x_n,y_n))$
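Under these assumptions the pieces are easy to write down. A sketch where assumption 3 is filled in with one candidate, a trapezoidal Riemann–Stieltjes rule in $F$, and where the constants $Y_1$, $k$, $\phi$, $Y_0$ are placeholder values not given in the question:

```python
import math

def std_normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def truncated_gaussian_cdf(x, mu=0.0, sigma=0.25, a=-1.0, b=1.0):
    """Assumption 1: Gaussian with mu = 0, sigma = 1/4, truncated to [-1, 1]."""
    lo = std_normal_cdf((a - mu) / sigma)
    hi = std_normal_cdf((b - mu) / sigma)
    return (std_normal_cdf((x - mu) / sigma) - lo) / (hi - lo)

def y(x, Y1=1.0, k=5.0, phi=0.3, Y0=2.0):
    """Assumption 2; the parameter values are placeholders."""
    return Y1 * math.sin(k * x + phi) + Y0

def stieltjes_trapezoid(xs, ys, F):
    """One possible 'better discrete integration method' for assumption 3:
    sum_i (y_{i-1} + y_i)/2 * (F(x_i) - F(x_{i-1}))."""
    return sum(0.5 * (y0 + y1) * (F(x1) - F(x0))
               for (x0, y0), (x1, y1) in zip(zip(xs, ys), zip(xs[1:], ys[1:])))

# Evenly spaced baseline on [-1, 1].
n = 400
xs = [-1.0 + 2.0 * i / n for i in range(n + 1)]
ys = [y(x) for x in xs]
approx = stieltjes_trapezoid(xs, ys, truncated_gaussian_cdf)
```

Averaging the endpoint values of $y$ makes the error second order in the spacing, whereas the one-sided sum above is only first order.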

Update 2:

I am using a naive strategy: divide $[0,1]$ evenly and map the division points through the inverse CDF, i.e. $u_i=F^{-1}(i/n), \quad i=0,1,\cdots,n$, and then let $x_i=(u_{i-1}+u_i)/2$. This way each subdivision of the interval of integration carries the same probability weight $1/n$. This reaches the same error level as an equally spaced partition with a constant factor fewer evaluations of $y$. However, I cannot prove its optimality, and it does not generalize well to multiple integrals.
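For concreteness, the equal-probability-mass strategy can be sketched as follows. The standard library's `statistics.NormalDist` supplies the Gaussian inverse CDF, the truncation to $[-1,1]$ follows assumption 1 above, and the test integrand reuses the placeholder sinusoid parameters (which are not given in the question):

```python
import math
from statistics import NormalDist

def truncnorm_inv_cdf(q, mu=0.0, sigma=0.25, a=-1.0, b=1.0):
    """Inverse CDF of the truncated Gaussian of assumption 1."""
    nd = NormalDist(mu, sigma)
    lo, hi = nd.cdf(a), nd.cdf(b)
    return nd.inv_cdf(lo + q * (hi - lo))

def expected_value(y, n, inv_cdf=truncnorm_inv_cdf):
    """u_i = F^{-1}(i/n); x_i = (u_{i-1} + u_i)/2; each bin has mass 1/n,
    so the weighted sum collapses to a plain average of the y(x_i)."""
    us = [inv_cdf(i / n) for i in range(n + 1)]
    xs = [0.5 * (us[i - 1] + us[i]) for i in range(1, n + 1)]
    return sum(y(x) for x in xs) / n

# Placeholder integrand: y(x) = sin(5x + 0.3) + 2.
y = lambda x: math.sin(5.0 * x + 0.3) + 2.0
estimate = expected_value(y, n=50)
```

Since every bin carries the same mass, the points automatically cluster where $p$ is large, which is exactly the behavior the question asks for.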

I found here how Mathematica handles this. But none of those methods matches my naive method given the same number of evaluations of $y$. I think that is because they do not fully exploit the information in the cheaply computable $p$.