I am working on a nested integration problem and want to develop an efficient estimator for said problem.
The problem has the form $\mathbb{E}_x\left[F\left(x,\mathbb{E}_y\left[G(y, x)\right]\right)\right]$,
where $F$ is a linear function acting on $x$ and the expectation of $G$. More precisely, $F$ has the simple form $F(a,b) = a-b$.
Naively, one could build an estimator with cost $\mathcal{O}(N \times M)$, where for each of the $N$ outer samples we use $M$ samples to estimate the inner integral. However, this paper (Sec 4.1) says that "it is a well known result" that such an expectation fulfils the relation
$\mathbb{E}_x\left[F\left(x,\mathbb{E}_y\left[G(y, x)\right]\right)\right] = \mathbb{E}_x\left[\mathbb{E}_y\left[F(x, G(y, x))\right]\right]$,
which has the unbiased estimator in $\mathcal{O}(N)$
$\frac{1}{N} \sum_{i=1}^N F(x_i, G(y_i, x_i))$.
Unfortunately, I fail to see why the above equality on the expectations holds. I'm thinking it has to do with the linearity property of $F$, but would be happy if somebody could elaborate. Thanks!
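For fixed $x$, the inner expectation over $y$ treats $x$ as a constant, so linearity of expectation gives
$\mathbb{E}_y(F(x, G(y, x)))=\mathbb{E}_y(x - G(y, x))=\mathbb{E}_y(x)-\mathbb{E}_y(G(y, x))= x - \mathbb{E}_y(G(y, x))=F(x, \mathbb{E}_y(G(y, x)))$.
Taking $\mathbb{E}_x$ of both sides yields the relation in the question. Note that this relies on $F$ being linear in its second argument; for a nonlinear $F$ the two sides generally differ.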
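The identity can also be checked numerically. Below is a minimal sketch, not taken from the paper: the distributions $x \sim \mathcal{N}(1,1)$, $y \sim \mathcal{N}(0,1)$ (independent of $x$) and the integrand $G(y,x) = 0.5x + xy$ are assumptions chosen only so that the exact value is known, namely $\mathbb{E}_x[x - 0.5x] = 0.5$. The function names `nested_estimate` and `single_sample_estimate` are likewise illustrative.

```python
import random


def F(a, b):
    # The linear outer function from the question.
    return a - b


def G(y, x):
    # Hypothetical inner integrand: E_y[G(y, x)] = 0.5 * x when y ~ N(0, 1).
    return 0.5 * x + x * y


def nested_estimate(n_outer, n_inner, rng):
    # O(N * M): estimate the inner expectation with n_inner samples per outer sample.
    total = 0.0
    for _ in range(n_outer):
        x = rng.gauss(1.0, 1.0)
        inner = sum(G(rng.gauss(0.0, 1.0), x) for _ in range(n_inner)) / n_inner
        total += F(x, inner)
    return total / n_outer


def single_sample_estimate(n, rng):
    # O(N): draw one (x_i, y_i) pair per term and average F(x_i, G(y_i, x_i)).
    total = 0.0
    for _ in range(n):
        x = rng.gauss(1.0, 1.0)
        y = rng.gauss(0.0, 1.0)
        total += F(x, G(y, x))
    return total / n


rng = random.Random(0)
nested = nested_estimate(2000, 100, rng)
single = single_sample_estimate(200_000, rng)
print("nested:", nested, "single-sample:", single, "exact:", 0.5)
```

Both estimates should land close to the exact value $0.5$ up to Monte Carlo noise. The single-sample version uses the same number of evaluations of $G$ as the nested one here, but spends all of them on fresh outer samples.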