Estimator for $\log \mathbb{E}\left[h(x)\right]$


I have a function $h(x)$ that maps $\mathbb{R}^d$ to $[0,1]$, and $p(x)$ is the probability distribution of $x$ on $\mathbb{R}^d$. I want to estimate the term $\log \mathbb{E}\left[h(x)\right]$, where the expectation is taken over $p(x)$, using an i.i.d. sample $\{x_1,\dots,x_n\}$ drawn from $p(x)$.

I can intuitively build the following estimator: $\log \left( \dfrac{1}{n}\sum\limits_{i=1}^{n}h(x_i)\right)$.

This estimator is biased, as can be seen by taking its expectation over $\{x_1,\dots,x_n\}$:

$$\mathbb{E}\left[\log \left( \dfrac{1}{n}\sum_{i=1}^{n}h(x_i)\right) \right] \leq \log \mathbb{E}\left[ \left( \dfrac{1}{n}\sum_{i=1}^{n}h(x_i)\right) \right] = \log \mathbb{E}\left[h(x)\right]$$

where I used Jensen's inequality. What makes it not a very suitable choice for me is that I can't come up with a way to assess its bias, beyond knowing that it underestimates the target value on average.
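To see this bias concretely, here is a small Monte Carlo sketch (my own illustrative example, not part of the question): I assume $p(x)$ is standard normal on $\mathbb{R}$ and pick the hypothetical choice $h(x) = \operatorname{sigmoid}(x)$, which maps into $[0,1]$ as required. By symmetry, $\mathbb{E}[h(x)] = 1/2$, so the target value is $\log(1/2)$, and the average of the plug-in estimator over many samples falls below it:

```python
import numpy as np

rng = np.random.default_rng(0)

def h(x):
    # Hypothetical h: sigmoid, maps R into (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

n = 20           # sample size per estimate
reps = 100_000   # Monte Carlo repetitions

# Ground-truth log E[h(x)], approximated with one very large sample.
truth = np.log(np.mean(h(rng.standard_normal(10_000_000))))

# Distribution of the plug-in estimator over many i.i.d. samples of size n.
samples = rng.standard_normal((reps, n))
estimates = np.log(h(samples).mean(axis=1))

bias = estimates.mean() - truth
print(f"truth = {truth:.5f}, mean estimate = {estimates.mean():.5f}, "
      f"bias = {bias:.5f}")  # bias comes out negative, as Jensen predicts
```

The bias shrinks as $n$ grows (it is of order $1/n$), which is why the plug-in estimator is still consistent even though it is biased for any fixed $n$.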

My question is: is there any chance that I can find a better estimator for $\log \mathbb{E}\left[h(x)\right]$? For example, like the estimator of the variance, which can be converted into an unbiased estimator by multiplying by $\dfrac{n}{n-1}$. How should I proceed here?
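For reference, the variance example mentioned above can be checked numerically: the plug-in estimator $\frac{1}{n}\sum_i(x_i - \bar x)^2$ has expectation $\frac{n-1}{n}\sigma^2$, and the $\frac{n}{n-1}$ correction removes the bias exactly. A quick sketch with standard normal data ($\sigma^2 = 1$):

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 5, 200_000

samples = rng.standard_normal((reps, n))
biased = samples.var(axis=1, ddof=0)     # divides by n
unbiased = samples.var(axis=1, ddof=1)   # divides by n - 1, i.e. times n/(n-1)

# Averaged over many repetitions: ~ (n-1)/n = 0.8 for the biased
# estimator, ~ 1.0 for the corrected one.
print(biased.mean(), unbiased.mean())
```

No such exact multiplicative correction exists for $\log$ of a mean, because the bias there depends on unknown moments of $h(x)$ rather than on $n$ alone.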


Best answer:

One idea is to use a Taylor expansion of $\log(Y)$ at $\mathbb E[Y]$, where $Y = h(x)$. See here for reference.

A second-order expansion gives $$ \log(\mathbb E[Y]) \approx \mathbb E[\log(Y)] + \frac{\operatorname{Var}(Y)}{2\,\mathbb E[Y]^2} $$ Then plug in sample estimates, with $y_i = h(x_i)$ and $\hat\mu$, $\hat\sigma^2$ the sample mean and variance of the $y_i$: $$ \frac{1}{n}\sum_i\log(y_i) + \frac{\hat \sigma^2}{2\hat \mu ^2} $$ You can add more terms if you want: $$ \log(\mathbb E[Y]) \approx \mathbb E[\log(Y)] + \frac{\operatorname{Var}(Y)}{2\,\mathbb E[Y]^2} - \frac{\mathbb E[(Y - \mathbb E[Y])^3]}{3\,\mathbb E[Y]^3} $$
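A minimal sketch of this second-order estimator, again under the hypothetical setup $p(x)$ standard normal and $h(x) = \operatorname{sigmoid}(x)$ (so the true value is $\log(1/2)$); note it requires $h(x_i) > 0$ so that the logarithms are defined:

```python
import numpy as np

def log_mean_naive(y):
    """Plug-in estimator: log of the sample mean of y_i = h(x_i)."""
    return np.log(np.mean(y))

def log_mean_taylor(y):
    """Second-order Taylor estimator: sample mean of log(y_i) plus the
    correction hat_sigma^2 / (2 * hat_mu^2)."""
    hat_mu = np.mean(y)
    hat_sigma2 = np.var(y, ddof=1)
    return np.mean(np.log(y)) + hat_sigma2 / (2.0 * hat_mu**2)

rng = np.random.default_rng(1)
x = rng.standard_normal(200)
y = 1.0 / (1.0 + np.exp(-x))   # hypothetical h(x) = sigmoid(x)

print(log_mean_naive(y), log_mean_taylor(y))  # both near log(0.5)
```

Note that the Taylor estimator trades the $O(1/n)$ Jensen bias for a truncation error that does not vanish with $n$; it is only accurate when the higher-order terms are small.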

The problem here, however, is that the accuracy of this approximation depends on the moments of $Y = h(x)$: the smaller the ratio $\operatorname{Var}(Y)/\mathbb{E}[Y]^2$, the more accurate it is. So for this approach, the quality of the estimator depends on the actual distribution of $h(x)$.