Simple Monte Carlo integration about b-a



I don't understand why we need to multiply by (b-a). Here b-a = 4-2 = 2. We already generate the random x values from 2 to 4, so the interval (rather than (0,1)) has already been taken into account. Why do we multiply by (b-a) again?


Best answer

They approach this integral by estimating the average value of $e^{-x}$ over $[2,4]$ (i.e., the average y value) and multiplying it by the length of the x interval to get the estimate of the integral.

More precisely:

$$ \bar{f}_{[a,b]} = \frac{\int_a^b f(x)\,dx}{b-a} $$

The code mean(exp(-x)) estimates $\bar{f}_{[a,b]}$, so you need to multiply by $b-a$, the “base” of the rectangle you are making, to recover the integral.
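
As a worked check with the numbers from the question ($f(x)=e^{-x}$, $a=2$, $b=4$), the exact average height and the integral it implies are:

$$ \bar{f}_{[2,4]} = \frac{e^{-2}-e^{-4}}{4-2} \approx \frac{0.1353353 - 0.0183156}{2} \approx 0.0585098, \qquad \int_2^4 e^{-x}\,dx = (4-2)\,\bar{f}_{[2,4]} \approx 0.1170196. $$

So mean(exp(-x)) on its own should come out near 0.0585, which is half the value of the integral.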

Another answer

Maybe comparing Monte Carlo integration with Riemann approximation will help you understand the former.

Suppose we want to approximate the integral $J = \int_a^b e^{-x}\,dx$ by the total area of $n=1000$ rectangles, each of width $w = (b-a)/n$ and height $h = e^{-m},$ where $m$ is the midpoint of the base of the rectangle. In R,

a = 2;  b = 4
n = 1000
w = (b-a)/n
m = seq(a+w/2, b-w/2, length=n)
h = exp(-m)
sum(w*h)
[1] 0.1170196

This is a deterministic computation, so each run will give the same answer.

You are integrating the density function of the distribution $\mathsf{Exp}(\mathrm{rate\,}=\lambda = 1).$ In R, the CDF of this distribution is pexp (with default $\lambda=1),$ so the exact value of the integral is available for comparison. The Riemann approximation with 1000 evenly spaced grid points as centers of rectangles has given us at least four-place accuracy; in fact it agrees with the exact value to every digit displayed.

diff(pexp(c(2,4)))
[1] 0.1170196
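
To see how quickly the deterministic midpoint approximation converges, one can compare it with the exact value for several values of n. This is only a sketch: midpoint_rule is a hypothetical helper written for this illustration (it is not a base R function), and the choice of n = 10, 100, 1000 is arbitrary.

```r
# Illustrative helper (not from the original answer): midpoint Riemann sum of f on [a, b]
midpoint_rule <- function(f, a, b, n) {
  w <- (b - a) / n                              # common width of the rectangles
  m <- seq(a + w/2, b - w/2, length.out = n)    # midpoints of the n bases
  sum(w * f(m))                                 # total area of the rectangles
}

exact <- diff(pexp(c(2, 4)))                    # exact integral of exp(-x) over [2, 4]
ns    <- c(10, 100, 1000)
errs  <- sapply(ns, function(n)
  abs(midpoint_rule(function(x) exp(-x), 2, 4, n) - exact))
errs                                            # absolute errors for each n
```

The errors shrink by roughly a factor of 100 each time n grows tenfold, which is the usual behavior of the midpoint rule for a smooth integrand.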

A basic Monte Carlo integration replaces the evenly spaced midpoints above with randomly chosen points $m$ distributed as $\mathsf{Unif}(a,b).$ For this one-dimensional integration, randomly chosen points are not quite as efficient as a precisely evenly spaced grid, so I will use $n = 10\,000$ random points (uniformly distributed).

set.seed(2021)
a = 2;  b = 4
n = 10000
w = (b-a)/n
m = runif(n, a, b)   # n points, uniform on (a, b)
h = exp(-m)
sum(w*h)             # close to the exact value 0.1170196

This is a random procedure, so (for different seeds, or no set seed) the result will be slightly different on each run.

Each random point carries the weight $w = (b-a)/n,$ the average width of a rectangle. If I had used mean instead of sum, as you did, that would take care of the $1/n,$ but the factor $b-a,$ the length of the interval over which the integration takes place, is still needed.
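
The point about sum versus mean can be checked directly. The following is a minimal sketch for the same integrand and interval; the names est_sum, est_mean and se are illustrative and not part of the original answer, and the standard error formula assumes independent uniform draws.

```r
set.seed(2021)
a <- 2; b <- 4
n <- 10000
m <- runif(n, a, b)                 # n random points, uniform on (a, b)
h <- exp(-m)                        # heights of the function at those points

est_sum  <- sum((b - a) / n * h)    # sum of rectangle areas, each of width (b-a)/n
est_mean <- (b - a) * mean(h)       # average height times the interval length
se       <- (b - a) * sd(h) / sqrt(n)   # approximate standard error of the estimate

c(est_sum, est_mean, se)
```

The two estimates are the same number written two ways; dropping the factor (b - a) from est_mean would leave only the average height, not the integral.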


Note: Sampling method. For Monte Carlo integration in one dimension, a 'sampling method' is sometimes used (often more for its ease of programming than for its efficiency). If we use the R function rexp to sample a million observations from $\mathsf{Exp}(\lambda=1),$ then we can ask what proportion of them lie between $2$ and $4.$

set.seed(120)
x = rexp(10^6)
mean((x>2)&(x<4))
[1] 0.117494

The vector (x>2)&(x<4) is a logical vector with a million elements TRUE or FALSE; the mean of a logical vector is its proportion of TRUEs.
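
Why a proportion estimates the integral can be written out as a short derivation. Here $\mathbf{1}\{\cdot\}$ denotes an indicator function and $\hat{p}$ the sample proportion; this notation is introduced only for this sketch.

$$ P(2 < X < 4) = \int_2^4 e^{-x}\,dx = E\big[\mathbf{1}\{2 < X < 4\}\big], \qquad \hat{p} = \frac{1}{n}\sum_{i=1}^n \mathbf{1}\{2 < x_i < 4\}, \qquad \mathrm{SE}(\hat{p}) = \sqrt{\frac{p(1-p)}{n}} \approx \sqrt{\frac{0.117 \times 0.883}{10^6}} \approx 0.0003. $$

No factor $b-a$ appears here because the points are drawn from the density being integrated rather than uniformly on $[2,4].$ The simulated value 0.117494 differs from the exact 0.1170196 by about 0.0005, which is within two standard errors.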

This method is especially convenient when it would be some trouble to find the density function of the distribution of interest. Suppose time to completion of a 2-phase process is the sum of $Z \sim \mathsf{Norm}(\mu=30,\sigma = 5)$ and $X \sim \mathsf{Exp}(\lambda = 0.05).$ Then the waiting time to completion is $W = Z+X.$ The average waiting time is $E(W) = E(Z)+E(X) = 30+20 = 50,$ and simulation gives $P(W > 60) \approx 0.22935 \pm 0.00084.$

set.seed(109)
z = rnorm(10^6, 30,5)
x = rexp(10^6, .05)
w = z + x
mean(w > 60)
[1] 0.229349
2*sd(w > 60)/1000
[1] 0.0008408287
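
The margin of error computed above appears to be two estimated standard errors of a proportion (roughly a 95% interval, assuming the usual normal approximation). With $\hat{p} = 0.229349$ and $n = 10^6$:

$$ 2\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 2\sqrt{\frac{0.229349 \times 0.770651}{10^6}} \approx 2 \times 0.00042041 \approx 0.00084. $$

In R, sd(w > 60) treats TRUE and FALSE as 1 and 0, so it is essentially $\sqrt{\hat{p}(1-\hat{p})},$ and dividing by $1000 = \sqrt{10^6}$ gives the standard error.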


hist(w, prob=TRUE, breaks=60, col="skyblue2",
     main="Time to Completion")
abline(v = 60, col="red", lwd=2)

[Figure: histogram of w, titled "Time to Completion", with a vertical red line at 60]


Note: Higher dimensions. The accuracy of a basic Monte Carlo integration depends in part on the "wiggliness" of the function $f(x)$ being integrated. However, generally speaking, Monte Carlo integrations in two (or higher) dimensions with $n$ randomly chosen points are about as good as (or better than) a Riemann approximation with a grid of $n$ evenly spaced points.

Example: The integral of the bivariate standard normal density (correlation $0)$ over the part of the unit disk lying in the first quadrant can be shown to be $J \approx 0.0984.$

pchisq(1,2)/4
[1] 0.09836734

We show Monte Carlo integration with $n$ points drawn uniformly in the unit square, of which about the fraction $\pi/4$ fall inside the quarter circle and are kept. Notice we average the random heights and multiply by the area of the quarter circle.

set.seed(1234)  # 1234
n = 100;  u1 = runif(n);  u2 = runif(n)
h.sq = dnorm(u1)*dnorm(u2)
h.qd = h.sq[u1^2 + u2^2 < 1]
(pi/4)*mean(h.qd)
[1] 0.09813584
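
For a rough idea of how precise such an estimate is, the same computation can be repeated with more points and an approximate standard error attached. This is a sketch under assumptions: the sample size 10^5 and the names inside, est, se and exact are choices made for this illustration, and the standard error treats the retained heights as independent draws.

```r
set.seed(1234)
n  <- 10^5
u1 <- runif(n);  u2 <- runif(n)            # points uniform in the unit square
inside <- u1^2 + u2^2 < 1                  # TRUE for points in the quarter circle
h  <- dnorm(u1[inside]) * dnorm(u2[inside])    # density heights at retained points

est   <- (pi/4) * mean(h)                  # average height times area of quarter circle
se    <- (pi/4) * sd(h) / sqrt(length(h))  # approximate standard error
exact <- pchisq(1, 2) / 4                  # exact value for comparison

c(estimate = est, std.error = se, exact = exact)
```

About $\pi/4$ of the points should be retained, and the estimate should fall within a few standard errors of the exact value.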