Show that the average of n observations is equal to the expected value


Show that the average of $n$ observations equals the expected value under the density function whose value at index $k$ is the number of observations equal to $k$ divided by the total number of observations, that is, $$\frac{y_1 + \cdots + y_n}{n} = \sum_{k=1}^{n} k\,p_k$$ where $p_k=\#\{i : y_i = k\}/n$.

Any hints? I really don't know how to show the statement above.

3

There are 3 best solutions below

1
On BEST ANSWER

Suppose the observations take values in the set $\left\{ k_{1},\dots,k_{m}\right\} $ (where the $k_{i}$ are distinct) and that for $i=1,\dots,m$ exactly $n_{i}$ of the observations take the value $k_{i}$.

Then: $$\tag{1}n_1+\cdots+n_m=n$$

and:

$$\tag{2}y_{1}+\cdots+y_{n}=k_{1}n_{1}+\cdots+k_{m}n_{m}$$

Note that for $i=1,\dots,m$ here: $$\tag{3}p_{k_{i}}=\frac{n_{i}}{n}$$

Dividing both sides of $(2)$ by $n$ and applying $(3)$ gives: $$\tag{4}\frac{y_{1}+\cdots+y_{n}}{n}=k_{1}p_{k_{1}}+\cdots+k_{m}p_{k_{m}}$$

The right-hand side can also be written as $\sum_k kp_k$, where $p_k=0$ if no observation takes the value $k$.
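As a quick numerical sanity check of $(4)$, here is a minimal Python sketch. The function name `empirical_mean_two_ways` and the sample data are made up for illustration; exact fractions are used so the comparison is not affected by floating-point rounding.

```python
from collections import Counter
from fractions import Fraction


def empirical_mean_two_ways(ys):
    """Return (direct average, sum over distinct values k of k * p_k)."""
    n = len(ys)
    # Left-hand side of (4): (y_1 + ... + y_n) / n
    direct = Fraction(sum(ys), n)
    # counts[k] plays the role of n_i, the number of observations equal to k
    counts = Counter(ys)
    # Right-hand side of (4): each distinct value k weighted by p_k = n_i / n
    weighted = sum(Fraction(k) * Fraction(c, n) for k, c in counts.items())
    return direct, weighted


# Hypothetical integer observations, chosen only for illustration
direct, weighted = empirical_mean_two_ways([1, 3, 3, 2, 5, 3, 1])
print(direct, weighted)
```

Both values agree for any list of integers, because grouping equal observations only reorders the terms of the sum.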

1
On

Are the $y_i$'s in the range $\{1,\dots , n\}$? I think this is needed.

Assuming they are: rewrite $p_k$ in terms of the $y_i$. It's the number of $y_i$'s that equal $k$, divided by $n$. How can you write this differently?
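One possible way to carry out this hint is sketched below. The indicator notation $\mathbf{1}[\cdot]$ is introduced here and does not appear in the question, and the sketch assumes every $y_i$ lies in $\{1,\dots,n\}$:

$$p_k=\frac{1}{n}\sum_{i=1}^{n}\mathbf{1}[y_i=k]$$

$$\sum_{k=1}^{n}k\,p_k=\frac{1}{n}\sum_{i=1}^{n}\sum_{k=1}^{n}k\,\mathbf{1}[y_i=k]=\frac{1}{n}\sum_{i=1}^{n}y_i$$

The last step holds because, for each fixed $i$, the inner sum has exactly one nonzero term, namely the one with $k=y_i$.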

1
On

Strictly speaking, the average is not precisely equal to the expected value (for continuous random variables it almost never is); rather, the average tends to the expected value as the number of observations grows.
Just look at how the expected value is defined. To keep the reasoning simple, the following is stated for a discrete random variable, but the same idea carries over to continuous variables.

The expected value (E.V.) of a random variable $x$ is defined as $\sum_{i=1}^{n}{p_ix_i}$, where the $x_i$ are all possible values of $x$ and $p_i$ is the probability of getting $x_i$.

In other words, the expected value is the average you get as the number of observations tends to infinity, so that the empirical distribution becomes more and more similar to the theoretical one. This is the law of large numbers, and you can read more about it on Wikipedia. Here is a very simple example: consider tossing a coin. For a fair coin the probabilities of heads and tails are equal, but this does not mean that you will get precisely $50$ heads and $50$ tails after $100$ flips. However, the more flips you make, the closer the ratio of heads to tails tends to be to $1$.

A related famous theorem is the central limit theorem (CLT), which says that the distribution of the average of a large number of independent observations of a random variable with finite variance tends to a normal distribution (with parameters depending on the variable), so it looks like a bell whose peak is at the expected value of your random variable. The bell shape describes the average, not the individual observations, and the more observations you take, the more tightly that bell concentrates around the expected value.
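The coin-tossing example can be tried out with a small simulation. This is only an illustrative sketch: the function name `heads_fraction`, the seed, and the flip counts are arbitrary choices, and a pseudo-random generator stands in for a fair coin.

```python
import random


def heads_fraction(num_flips, seed=0):
    """Simulate num_flips fair coin tosses and return the fraction of heads."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    # Each toss is heads with probability 0.5; True counts as 1 in the sum
    heads = sum(rng.random() < 0.5 for _ in range(num_flips))
    return heads / num_flips


# The fraction of heads is the average of 0/1 observations; by the law of
# large numbers it tends to the expected value 0.5 as the number of flips grows.
for n in (100, 10_000, 100_000):
    print(n, heads_fraction(n))
```

For small numbers of flips the fraction can be visibly off from $0.5$, while for large numbers it typically stays close, which is the behaviour the law of large numbers describes.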