Suppose we have a scalar, real-valued function of two variables $F(x,y)$, where $(x,y)$ belongs to a discrete domain $D$ that has a finite number of elements. Let $\mid D \mid$ be the number of elements in $D$. One way to define the average of $F$ is as follows:
$$ \text{Average}(F) = \frac{\sum_{\forall (x,y) \in D} F(x,y)}{\mid D \mid} $$
This calculation is tractable as long as $\mid D \mid$ is not very large; however, I'm dealing with a situation where the size of the domain is exponentially large.
My question: how to approximate the average in case of an exponentially large domain?
My thoughts (I hope someone can critique their feasibility):
I'm thinking of a Monte-Carlo approach, where I do $k$ rounds of estimations of the average, and then find the average of the averages.
One round of estimation: pick a random $d$ and generate $d$ uniformly distributed pairs from the domain $D$. Sum the function over these pairs and divide by $d$ to obtain that round's estimate of the average.
After $k$ rounds, we have computed $k$ averages: sum them and divide the result by $k$ to obtain the final estimate of the average.
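To make the procedure concrete, here is a minimal sketch of it in Python. Note two assumptions of mine that differ slightly from the description: I fix the per-round sample count $d$ rather than drawing it at random (with a fixed $d$, the average of the $k$ round averages equals one pooled average over all $k \cdot d$ samples), and I also report the standard error across rounds as a rough, CLT-based measure of the approximation error. The function names are mine, not standard.

```python
import random

def mc_average(f, domain_sampler, k, d):
    """Estimate the average of f over its domain by k rounds of d
    uniform samples each, then average the round estimates.

    domain_sampler() must return one (x, y) pair drawn uniformly from D.
    Returns (estimate, standard_error_across_rounds).
    """
    round_estimates = []
    for _ in range(k):
        total = sum(f(*domain_sampler()) for _ in range(d))
        round_estimates.append(total / d)
    mean = sum(round_estimates) / k
    # Sample variance of the round estimates; the standard error of
    # their mean shrinks like 1/sqrt(k) by the central limit theorem.
    var = sum((r - mean) ** 2 for r in round_estimates) / (k - 1)
    stderr = (var / k) ** 0.5
    return mean, stderr

# Toy check: f(x, y) = x + y on the 10x10 grid {0..9}^2,
# whose exact average is 4.5 + 4.5 = 9.0.
random.seed(0)
sampler = lambda: (random.randrange(10), random.randrange(10))
estimate, stderr = mc_average(lambda x, y: x + y, sampler, k=50, d=100)
```

This works without ever enumerating $D$; all that is required is the ability to draw uniform samples from it, which can itself be nontrivial for structured exponentially large domains.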
Here are my concerns about my approach:
- Is my estimation approach (per round) correct? In other words, is the round average a good-enough estimate of the true average?
- In case this approach is correct, is there a way to measure the approximation error?
Thanks.
This sort of problem can be addressed using the Metropolis-Hastings algorithm, a Markov chain Monte Carlo (MCMC) approach. Care must be taken, however: a sufficiently pathological function combined with a poorly chosen proposal distribution can cause the sampler to miss regions of the domain that contribute heavily to the sum. In general, MCMC methods can yield better results than naive MC methods when the dimensionality is high (or, equivalently, when the sample space is extremely large).
See http://en.wikipedia.org/wiki/Metropolis%E2%80%93Hastings_algorithm for more details and guidance on implementation.