I don't understand why $E[Y]$ approaches $e^{\mu}$, and why $E[Y] - e^{\mu}$ is not just zero. Can someone please help me with this problem?
Suppose $X$ is normal with mean $\mu$ and variance $\sigma^2$, which is small. Let $Y = e^X$. It seems intuitive that $E[Y] \approx e^{\mu}$, because $X$ is likely to be close to $\mu$. Find an approximate expression for $E[Y] - e^{\mu}$ that involves a power of $\sigma$.
Hint: Write $X = \mu + \varepsilon$, where $\varepsilon$ is small (with high probability) and a Gaussian random variable. Expand $e^X$ in a Taylor series of the appropriate order (the lowest order that gives a non-zero correction to $e^{\mu}$) in $\varepsilon$. You will find that $e^{\varepsilon} \approx 1 + \varepsilon$ is not enough.
Remember Jensen's inequality: if $f$ is convex, $$ E[f(X)] \geqslant f(E[X]), $$ so you shouldn't expect $E[e^X]=e^{E[X]}$. What the problem wants is the second-order Taylor expansion of $e^x$ about $\mu$: $$ e^x = e^{\mu}e^{x-\mu} = e^{\mu} \left(1+ (x-\mu) + \frac{1}{2}(x-\mu)^2 + o((x-\mu)^2) \right). $$ Taking expectations term by term gives $$ \begin{aligned} E[e^X] &= e^{\mu}E\left[1+(X-\mu)+\tfrac{1}{2}(X-\mu)^2+o((X-\mu)^2)\right] \\ &= e^{\mu} \left(1+0+\tfrac{1}{2}\sigma^2 + o(\sigma^2)\right), \end{aligned} $$ since $\sigma^2$ is small and the normal distribution has light enough tails that we can push the $E[\cdot]$ into the $o()$. The first three terms follow from the definitions: $E[1]=1$, $E[X-\mu]=E[X]-\mu=0$, $E[(X-\mu)^2]=\sigma^2$. The first-order term vanishes, which is why $e^{\varepsilon} \approx 1+\varepsilon$ is not enough. Hence $$ E[e^X] \approx e^{\mu}\left(1+\tfrac{1}{2}\sigma^2\right), $$ that is, $E[Y] - e^{\mu} \approx \tfrac{1}{2}e^{\mu}\sigma^2$.
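If you want to convince yourself numerically, here is a minimal sketch that integrates $e^x$ against the normal density with a plain trapezoidal rule and compares $E[Y]-e^{\mu}$ with the second-order estimate $\tfrac{1}{2}e^{\mu}\sigma^2$. The function names, the grid size `n`, and the truncation at `width` standard deviations are my own illustrative choices, not part of the problem.

```python
import math


def expected_exp(mu, sigma, n=20001, width=12.0):
    """Approximate E[e^X] for X ~ N(mu, sigma^2) by the trapezoidal rule.

    Assumption: the integral is truncated to mu +/- width*sigma, which is
    harmless for small sigma because the Gaussian tails are negligible there.
    """
    lo = mu - width * sigma
    hi = mu + width * sigma
    h = (hi - lo) / (n - 1)
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    total = 0.0
    for i in range(n):
        x = lo + i * h
        pdf = norm * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))
        weight = 0.5 if i in (0, n - 1) else 1.0  # trapezoid end-point weights
        total += weight * math.exp(x) * pdf
    return total * h


def second_order_correction(mu, sigma):
    """Taylor estimate of E[e^X] - e^mu: (1/2) * e^mu * sigma^2."""
    return 0.5 * math.exp(mu) * sigma ** 2


mu = 1.0
for sigma in (0.2, 0.1, 0.05):
    numeric = expected_exp(mu, sigma) - math.exp(mu)
    estimate = second_order_correction(mu, sigma)
    # The ratio should move towards 1 as sigma shrinks.
    print(sigma, numeric, estimate, numeric / estimate)
```

The ratio in the last column should drift towards $1$ as $\sigma$ decreases, which is what the $o(\sigma^2)$ remainder predicts.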
In fact, we can find $E[e^X]$ exactly: $$ E[e^X] = \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{\infty} e^x e^{-(x-\mu)^2/(2\sigma^2)} \, dx. $$ We complete the square in the combined exponent $x - (x-\mu)^2/(2\sigma^2)$: $$ -\frac{1}{2\sigma^2} \left(x^2-2\mu x+\mu^2 -2\sigma^2 x \right) = -\frac{1}{2\sigma^2} \left((x-\mu-\sigma^2)^2+\mu^2 -(\mu+\sigma^2)^2 \right). $$ Rearranging, $$ E[e^X] = \frac{e^{((\mu+\sigma^2)^2-\mu^2)/(2\sigma^2)}}{\sigma\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-(x-\mu-\sigma^2)^2/(2\sigma^2)} \, dx. $$ The integrand, together with the factor $1/(\sigma\sqrt{2\pi})$, is just the density of a normal distribution with mean $\mu+\sigma^2$ and variance $\sigma^2$, so it integrates to $1$, and we find $$ E[e^X] = e^{((\mu+\sigma^2)^2-\mu^2)/(2\sigma^2)} = e^{\mu}e^{\sigma^2/2}. $$ Since $e^{\sigma^2/2} = 1 + \tfrac{1}{2}\sigma^2 + O(\sigma^4)$, this agrees with what we found to $O(\sigma^2)$!
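As a quick sanity check on how good the second-order approximation is, the sketch below compares the closed form $e^{\mu+\sigma^2/2}$ with $e^{\mu}(1+\tfrac{1}{2}\sigma^2)$. Expanding $e^{\sigma^2/2}$ one term further suggests the gap should behave like $\tfrac{1}{8}e^{\mu}\sigma^4$, so the gap divided by $\sigma^4$ should settle near $e^{\mu}/8$. The helper names are hypothetical and only for illustration.

```python
import math


def exact_mean(mu, sigma):
    """Closed form E[e^X] = exp(mu + sigma^2 / 2) for X ~ N(mu, sigma^2)."""
    return math.exp(mu + sigma ** 2 / 2.0)


def second_order_mean(mu, sigma):
    """Second-order Taylor approximation e^mu * (1 + sigma^2 / 2)."""
    return math.exp(mu) * (1.0 + sigma ** 2 / 2.0)


def gap_over_sigma4(mu, sigma):
    """(exact - approximation) / sigma^4; expected to be close to e^mu / 8."""
    return (exact_mean(mu, sigma) - second_order_mean(mu, sigma)) / sigma ** 4


mu = 0.0
for sigma in (0.4, 0.2, 0.1):
    # Compare the scaled gap with the predicted leading coefficient e^mu / 8.
    print(sigma, gap_over_sigma4(mu, sigma), math.exp(mu) / 8.0)
```

The gap is always positive, consistent with Jensen's inequality and with the fact that every omitted term in the series for $e^{\sigma^2/2}$ is positive.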