I need help with understanding expected values


Consider a discrete random variable $X$ that takes on the values $x_1,...,x_n$ and for every $x_i$ there's a probability $p(x_i)=p_i$. I am attempting to understand why the expectation $E[X]$ is defined the way it is.

$$E[X]=x_1p_1+...+x_np_n$$

I cannot figure out why it is defined this way, and I have already checked my book and Wikipedia. I don't understand how this "weighted average" idea applies here. $E[X]$ is supposed to be the average value $X$ takes if some experiment is performed many times, correct? Since $p_1+...+p_n=1$, $E[X]$ can be expressed as

$$E[X]=\frac{x_1p_1+...+x_np_n}{p_1+...+p_n}$$ but that still does not tell me anything. I would appreciate it if someone could help me out here.


There are 2 best solutions below

1
On BEST ANSWER

I will try to give you some more insight: suppose you have a random variable $X$ which takes two values (say 4 and 8) with a chance of 50% each. $P(X=4) = P(X=8) =.5 = p_1 = p_2$.

You can imagine that $X$ is linked to a fair coin, where heads corresponds to 4 and tails corresponds to 8.

The expected value is a number that represents the average outcome of the random variable $X$. Suppose you have many observations of $X$ (say you toss the coin 1000 times and record each outcome). Since both outcomes are equally likely, the arithmetic mean (or average) of these 1000 experiments, computed by grouping the recorded outcomes by value, will be close to 6:

$$ Ave_{1000} = \frac{\#occurrences~ of~ 4}{1000}\cdot 4 +\frac{\#occurrences~ of~ 8}{1000}\cdot 8 \approx \frac{500}{1000} \cdot 4 + \frac{500}{1000}\cdot 8 =6 $$

The expected value is obtained by letting the number of repetitions go to infinity: the relative frequency of fours then converges to .5 (since probabilities can be defined as the limit of relative frequencies), and so does the relative frequency of eights. Hence,

$$ Ave_{\infty} := E[X] = \frac 12 \cdot 4 + \frac 12 \cdot 8 = p_1 \cdot 4 + p_2 \cdot 8 = 6. $$

So the expected value is the arithmetic mean of an experiment repeated infinitely often.

In general, if your experiment $X$ has $n$ different outcomes $(x_1, \dots, x_n)$ occurring with probabilities $(p_1, \dots, p_n)$, then $p_i$ will be close to the relative frequency of outcome $x_i$ when the number of repetitions is large. With, say, 1000 repetitions, the arithmetic mean (or average) is given by

$$ Ave_{1000} = \sum_{i=1}^n \frac{\#occurrences~ of~ x_i}{1000} \cdot x_i \approx \sum_{i=1}^n p_i \cdot x_i. $$

Again, as the number of repetitions $N$ goes to infinity, the relative frequencies converge to the probabilities, and $Ave_N$ converges to a number known as the expected value.
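If it helps to see this numerically, here is a small simulation sketch of the coin example. The function names (`expected_value`, `simulate`, `grouped_mean`) and the fixed seed are my own choices for illustration, not anything standard. It groups the simulated outcomes by value, exactly as in the formula for $Ave_{1000}$, and compares the result with $\sum_i p_i x_i$.

```python
import random
from collections import Counter


def expected_value(values, probs):
    """Exact expectation: the sum of x_i * p_i."""
    return sum(x * p for x, p in zip(values, probs))


def simulate(values, probs, n, seed=0):
    """Draw n independent outcomes of X (the seed is fixed only for reproducibility)."""
    rng = random.Random(seed)
    return rng.choices(values, weights=probs, k=n)


def grouped_mean(draws):
    """Arithmetic mean written as sum over values of (relative frequency) * value."""
    n = len(draws)
    return sum(count / n * x for x, count in Counter(draws).items())


if __name__ == "__main__":
    values, probs = [4, 8], [0.5, 0.5]
    print("E[X] =", expected_value(values, probs))
    for n in (10, 1000, 100000):
        print(n, "repetitions: average =", grouped_mean(simulate(values, probs, n)))
```

With more repetitions the printed averages should settle near 6, although any single run with few repetitions can be noticeably off.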

I hope this was somehow helpful.

0
On

Let's take a simple example. Imagine $x_1,x_2,x_3,x_4$ as the vertices of a square ABCD, and suppose you pick one vertex at random again and again. If you draw vertices many times, the average of all those draws will be close to the center (or barycenter) of the square, since every vertex is picked with the same probability.

The barycenter G of ABCD satisfies: $$0=\vec{GA}+\vec{GB}+\vec{GC}+\vec{GD}=\frac 1 4\vec{GA}+\frac 1 4\vec{GB}+\frac 1 4\vec{GC}+\frac 1 4\vec{GD}$$ which can be rewritten as:

$$\vec{OG}= \frac 1 4\vec{OA}+\frac 1 4\vec{OB}+\frac 1 4\vec{OC}+\frac 1 4\vec{OD}$$
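To spell out the rewriting step: for any origin $O$, each vector starting at $G$ splits as $\vec{GA}=\vec{OA}-\vec{OG}$ (and likewise for $B$, $C$, $D$). Substituting this into the first identity gives

$$0=\frac 1 4(\vec{OA}-\vec{OG})+\frac 1 4(\vec{OB}-\vec{OG})+\frac 1 4(\vec{OC}-\vec{OG})+\frac 1 4(\vec{OD}-\vec{OG})=\frac 1 4\vec{OA}+\frac 1 4\vec{OB}+\frac 1 4\vec{OC}+\frac 1 4\vec{OD}-\vec{OG}$$

and moving $\vec{OG}$ to the other side yields the formula for $\vec{OG}$.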

Picking a vertex is the same as picking a vector. Translated back into the language of probability, this says that the expectation ($\vec{OG}$) is the sum, over all vertices, of the probability of picking that vertex ($\frac 1 4$ here) multiplied by the corresponding vector ($\vec{OA}$, for example).

Now suppose that, instead of picking the vertices with equal probabilities, we weight them differently (the weights may still happen to be equal). Let $X$ be the random variable associated with the draw of a vertex.

Then $P(X=x_i)=p_i$ for $i=1,2,3,4$ and $p_1+p_2+p_3+p_4=1$.

The weighted barycenter satisfies:

$$0=p_1\vec{GA}+p_2\vec{GB}+p_3\vec{GC}+p_4\vec{GD}$$

which again can be rewritten as:

$$(p_1+p_2+p_3+p_4)\vec{OG}=p_1\vec{OA}+p_2\vec{OB}+p_3\vec{OC}+p_4\vec{OD}$$

Then we finally get: $$\vec{OG}=\dfrac{p_1\vec{OA}+p_2\vec{OB}+p_3\vec{OC}+p_4\vec{OD}}{p_1+p_2+p_3+p_4}=p_1\vec{OA}+p_2\vec{OB}+p_3\vec{OC}+p_4\vec{OD}$$ where the last equality uses $p_1+p_2+p_3+p_4=1$.

By the same reasoning as before, you can interpret these quantities as an expectation and probabilities: $\vec{OG}$ plays the role of $E[X]$, and the weights $p_i$ play the role of the probabilities.
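As a concrete check, here is a minimal sketch that computes a weighted barycenter for a unit square. The coordinates chosen for A, B, C, D, the weights, and the helper names `weighted_barycenter` and `residual` are all assumptions made for this illustration. It also verifies that $p_1\vec{GA}+p_2\vec{GB}+p_3\vec{GC}+p_4\vec{GD}$ comes out as (numerically) zero.

```python
def weighted_barycenter(points, weights):
    """Return G with OG = (sum of p_i * OA_i) / (sum of p_i), for 2D points."""
    total = sum(weights)
    gx = sum(w * x for w, (x, _) in zip(weights, points)) / total
    gy = sum(w * y for w, (_, y) in zip(weights, points)) / total
    return (gx, gy)


def residual(points, weights, g):
    """Return the vector sum of p_i * GA_i, which should vanish at the barycenter."""
    rx = sum(w * (x - g[0]) for w, (x, _) in zip(weights, points))
    ry = sum(w * (y - g[1]) for w, (_, y) in zip(weights, points))
    return (rx, ry)


# Assumed coordinates for the square ABCD (a unit square with A at the origin O).
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]

if __name__ == "__main__":
    for weights in ([0.25, 0.25, 0.25, 0.25], [0.1, 0.2, 0.3, 0.4]):
        g = weighted_barycenter(square, weights)
        print("weights", weights, "-> G =", g, "residual =", residual(square, weights, g))
```

With equal weights, G is the center of the square. With unequal weights, it shifts toward the more heavily weighted vertices, just as an expectation shifts toward the more probable values.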