Is it possible to calculate the expected value for a continuous variable using the sum of a discrete distribution?

I would like to know if it is possible to calculate the expected value for a continuous random variable as:

$ \mathop{\mathbb{E}}(x)=\int x f(x)\,dx=\frac{1}{N}\sum_{i=1}^{N}X_{i} $

If that is true, how can we convert the integral to the sum? Also, is the sum part $\frac{1}{N}\sum_{i=1}^{N}X_{i}$ used to calculate the discrete uniform distribution?

I think you are asking if it is possible to estimate the mean of a continuous distribution by taking the sample mean of a large sample (of size $N$) from that distribution. If so, the answer is Yes, but be aware that the result is an estimate and that the error of the estimate depends on the sample size $N$ (typically, smaller error for larger $N$). Here are some examples in which we sample at random from a few known distributions. (Sampling done using R statistical software.)

1) $N=50$ observations from the distribution $\mathsf{Norm}(\mu = 75,\, \sigma=3).$ The R function rnorm samples at random from a specified normal population. You will get a somewhat different estimate each time you take a sample and find its mean. In about 95% of runs the estimate will be within the margin of simulation error, $2s/\sqrt{N},$ of the population mean, where $s$ is the sample standard deviation.

x = rnorm(50, 75, 3);  mean(x);  2*sd(x)/sqrt(50)
## 74.84001     # approx. E(X) = 75
## 0.7656806    # 95% margin of simulation error

2) $N = 5000$ observations from $\mathsf{Norm}(\mu = 75,\, \sigma=3).$

x = rnorm(5000, 75, 3);  mean(x);  2*sd(x)/sqrt(5000)
## 75.00454
## 0.08390667

3) $N = 1000$ observations from $\mathsf{Unif}(a = 50, b = 100).$ The mean of this population distribution is $\mu = 75.$

y = runif(1000, 50, 100);  mean(y);  2*sd(y)/sqrt(1000)
## 75.54673
## 0.900703

4) $N = 10^6$ observations from an exponential distribution with rate 1/5. This distribution has mean $\mu = 5.$

w = rexp(10^6, 1/5);  mean(w);  2*sd(w)/sqrt(10^6)
[1] 4.997817
[1] 0.00998948

5) $N = 10{,}000$ observations from the beta distribution with shape parameters $\alpha = 1.2$ and $\beta = 2.8.$ You can look in your text or read about the beta distribution on Wikipedia, if you have not encountered it yet. Its mean is $$\mu = \int_0^1 x\,\frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\,x^{\alpha-1}(1-x)^{\beta-1}\,dx = \frac{\alpha}{\alpha+\beta} = 0.3.$$

v = rbeta(10000, 1.2, 2.8);  mean(v);  2*sd(v)/sqrt(10000)
[1] 0.3017991
[1] 0.004096549
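
The "95%" attached to the margins in the examples above can itself be checked by simulation. The following is a minimal sketch, not part of the original runs: it repeats the experiment of example 1 many times and counts how often the sample mean lands within $2s/\sqrt{N}$ of the true mean. The seed and the number of repetitions (`runs`) are arbitrary choices made here for illustration.

```r
set.seed(2024)                  # arbitrary seed, only for reproducibility
mu <- 75; sigma <- 3; N <- 50   # same population and sample size as example 1
runs <- 10000                   # number of repeated samples (assumed value)

# For each run: draw a fresh sample and record whether its mean falls
# within 2*sd(x)/sqrt(N) of the true population mean mu.
hit <- replicate(runs, {
  x <- rnorm(N, mu, sigma)
  abs(mean(x) - mu) <= 2 * sd(x) / sqrt(N)
})

coverage <- mean(hit)           # fraction of runs inside the margin
coverage
```

The resulting fraction should be close to 0.95, though it will vary a little with the seed and with `runs`.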

Notes: (a) The Law of Large Numbers (LLN) guarantees that the mean of a sample converges 'in probability' to its population mean as $N \rightarrow \infty.$ By definition, that is $\lim_{N \rightarrow \infty}P(|\bar X - \mu| < \epsilon) = 1,$ for any $\epsilon > 0.$
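
The convergence described by the LLN can be watched directly by tracking the running mean of one long sample. This is only an illustrative sketch using the exponential population of example 4 (rate 1/5, mean 5); the seed, the sample size, and the checkpoints printed are arbitrary choices.

```r
set.seed(1)                          # arbitrary seed, only for reproducibility
w <- rexp(10^5, rate = 1/5)          # exponential population with mean 5

# running_mean[k] is the mean of the first k observations.
running_mean <- cumsum(w) / seq_along(w)

# Inspect the running mean at a few increasing sample sizes.
checkpoints <- c(10, 100, 1000, 10^4, 10^5)
data.frame(N = checkpoints, mean = running_mean[checkpoints])
```

The early entries can be noticeably off, while the later ones should settle near 5.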

(b) The margins of simulation error depend on the dispersion of the population and the sample size. The formula I have used in the examples above, $2s/\sqrt{N}$ with $s$ the sample standard deviation, also works for non-normal distributions with finite variance when $N$ is large, because of the Central Limit Theorem (CLT).
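
For the populations in examples 2 to 5 the standard deviation $\sigma$ is known, so the simulated margins can be compared with the theoretical value $2\sigma/\sqrt{N}.$ The sketch below does that arithmetic; the helper `margin` and the variable names are introduced here for illustration only. It uses the standard formulas $\sigma = (b-a)/\sqrt{12}$ for the uniform, $\sigma = 1/\text{rate}$ for the exponential, and $\sigma^2 = \alpha\beta/[(\alpha+\beta)^2(\alpha+\beta+1)]$ for the beta distribution.

```r
# Theoretical 95% margin of simulation error for known sigma and sample size N.
margin <- function(sigma, N) 2 * sigma / sqrt(N)

a <- 1.2; b <- 2.8                                      # beta shape parameters
sigma_beta <- sqrt(a * b / ((a + b)^2 * (a + b + 1)))   # sd of Beta(a, b)

theory <- c(
  normal  = margin(3, 5000),                        # Norm(75, 3), N = 5000
  uniform = margin((100 - 50) / sqrt(12), 1000),    # Unif(50, 100), N = 1000
  expo    = margin(5, 10^6),                        # rate 1/5, so sd = 5
  beta    = margin(sigma_beta, 10000)               # Beta(1.2, 2.8), N = 10000
)
round(theory, 5)
```

These come out at roughly 0.085, 0.91, 0.01 and 0.0041, close to the margins 0.0839, 0.9007, 0.00999 and 0.0041 computed from the samples above.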

(c) As in @Geronimo's Comment (+1), this is one form of 'Monte Carlo Integration.' It is a method often used in applications, but you don't literally "convert the integral into a sum." As remarked above, you use the LLN to get an approximate answer and the CLT to judge how accurate that answer is likely to be. (Of course, it's only a 95% confidence interval, so on some occasions the estimate is farther from the target than hoped for.)
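
To make the contrast between the integral and the sample mean concrete, here is a minimal sketch for the beta distribution of example 5. It evaluates $\int_0^1 x f(x)\,dx$ deterministically with R's `integrate` and, separately, estimates the same quantity by averaging a random sample. The seed and variable names are arbitrary choices made for this illustration.

```r
set.seed(123)                        # arbitrary seed, only for reproducibility
shape1 <- 1.2; shape2 <- 2.8         # shape parameters from example 5

# Deterministic route: numerically integrate x * f(x) over (0, 1).
quad <- integrate(function(x) x * dbeta(x, shape1, shape2),
                  lower = 0, upper = 1)

# Monte Carlo route: average a random sample from the same distribution
# and attach the 95% margin of simulation error.
v <- rbeta(10000, shape1, shape2)
mc_est <- mean(v)
mc_moe <- 2 * sd(v) / sqrt(length(v))

exact <- shape1 / (shape1 + shape2)  # closed-form mean, 0.3
c(exact = exact, quadrature = quad$value,
  monte_carlo = mc_est, margin = mc_moe)
```

The quadrature value should agree with 0.3 to several decimal places, while the Monte Carlo estimate is a random quantity that should usually fall within its margin of 0.3.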