Sampling Methods Explaination: Pattern Recognition and Machine Learning Bishop

196 Views Asked by At

I am reading chapter 11, sampling methods, from the book 'Pattern Recognition and Machine Learning' by Bishop. In the introduction, in short, he evaluates the expectation of some function $f(z)$ with respect to a probability distribution $P(z)$ where $z$ is the random variable. He writes

\begin{align} E(f) = \int f(z)p(z)dz \end{align}

Now he says that we suppose that such expectations are too complex to be evaluated exactly using analytical techniques, so we use sampling. (Verbatim from the book) And the idea behind sampling methods is to obtain a set of samples $z^{(l)}$, (where $l=1,....,L$) drawn independently from the distribution $P(z)$. This allows the expectation to be written as

\begin{align} \hat{f} = \frac{1}{L} \sum_{l=1}^{L}f(z^{(l)}) \end{align}

Now further in the official solution for exercise 11.1 where he proves that mean of $\hat{f}=E(f)$, he calculates

\begin{align} E[\hat{f}] = \frac{1}{L}\sum_{l=1}^{L}\int f(z^{(l)}) p(z^{(l)})dz^{(l)}\tag 1 \end{align}

Now I did not understand this integral.

My argument

For simplicity, let us assume the underlying distribution $P(z)$ to be a one dimensional standard normal distribution.

  1. When the author samples $z^{1}$, I assume that he drew a fixed $n$ values which would look something like $z^{1}= \{0.2,0.30.2323,... ,0.8\}$.
    Now in that case how would the integral in $(1)$ look like? For example, what would be the limits of integration?
  2. If $z^{1}$ is a discrete set then $f(z^{1})$ would be too, in which case how would the integration be defined in the first place?
1

There are 1 best solutions below

0
On

Let's try to put some order here:

the general Expected Value of a continuous random variable $Z$, defined over a set $S_{Z}$ is given by:

$$\mathbb{E}(Z) = \int_{S_{Z}}zp(z)dz$$

where $p(z)$ is $Z$'s density function. It is immediate to see that the best estimator for such Expected Value is the sample mean:

$$\overline{Z} = \frac{1}{L} \sum_{l = 1}^{L}Z^{l} \Rightarrow \mathbb{E}(\overline{Z}) = \mathbb{E} \left ( \frac{1}{L} \sum_{l = 1}^{L}Z^{l} \right ) = \frac{1}{L} \sum_{l = 1}^{L}\mathbb{E}(Z^{l}) = \mathbb{E}(Z^{l}) = \mathbb{E}(Z)$$

assuming that all the individuals in the sample are drawn from the distribution $p(z)$. Just to not get confused by notation, I have used capital letters since the sample is actually a set of $L$ random variables $Z^{l}$, $l = 1,...,L$ identically distributed: when we instead talk about their realization, we use $z$.

Now, let suppose instead that we are interested not in the Expected Value of $Z$, but of a transformation of it, let say $f(Z)$. We have that, per definition:

$$\mathbb{E}[f(Z)] = \int_{S_{f(Z)}}f(z)p(z)dz$$

Now, how do we estimate such Expected Value? Using once again the Sample Mean, but in this case not the Sample Mean of the $z^{l}$s, but rather of $f(z^{l})$. Why? Sticking to the presented notation:

$$\hat{f} = \frac{1}{L} \sum_{l = 1}^{L}f(Z^{l}) \Rightarrow \mathbb{E}(\hat{f}) = \mathbb{E} \left ( \frac{1}{L} \sum_{l = 1}^{L}f(Z^{l}) \right ) = \frac{1}{L} \sum_{l = 1}^{L}\mathbb{E}[f(Z^{l})] = \mathbb{E}[f(Z^{l})] = \mathbb{E}[f(Z)]$$

where, indeed, since the sample is composed by identically distributed variables:

$$\mathbb{E}[f(Z^{l})] = \int_{S_{f(Z^{l})}}f(z^{l})p(z^{l})dz^{l}, \ \forall l \in \left \{ 1,...,L \right \} $$

Now, let suppose that we have had the realization you indeed suggest.

First of all, notice that, considering a univariate case, $\boldsymbol{z}$ $=$ $\left \{ 0.2,0.3,0.2323,...,0.8 \right \}$ $=$ $\left \{ z^{1},z^{2},z^{3},...,z^{L} \right \}$ is $already$ your entire sample, not a single individual of it, since by univariate assumption $z^{l}$ $\in$ $\mathbb{R}$ for all $l$.

Then, consider, wlog, $f(z)$ $=$ $z^{k}$, $k$ $\in$ $\mathbb{N}$ (example chosen since useful for variance).

Then, for instance, if $S_{Z}$ $=$ $ \left [ 0,2 \right ]$ for some reason, now $S_{f(Z)}$ $=$ $ \left [ 0,2^{k} \right ]$

Finally, what is the best estimator for the Expected Value of $f(z)$ $=$ $z^{k}$? As seen before, the Sample Mean of the extracted $z^{l}$s to the power of $k$.

$$\hat{f} = \frac{1}{L} \sum_{l = 1}^{L}(z^{l})^{k} = \frac{1}{L} \left ( 0.2^{k} + 0.3^{k} + 0.2323^{k} +...+ 0.8^{k} \right )$$