Intuition for weighted average. Why $\frac{w_1}{w_1 + w_2}x_1 + \frac{w_2}{w_1 + w_2}x_2 = \frac{\sum_i w_ix_i}{\sum_i w_i}$?


I know $\dfrac{w_1}{w_1 + w_2}x_1 + \dfrac{w_2}{w_1 + w_2}x_2 = \dfrac{\sum_i w_ix_i}{\sum_i w_i}$, because $\sum_i w_i$ is the common denominator. I'm not asking about this algebra. It's intuitive that $\dfrac{w_i}{w_1 + w_2}$ weighs $x_i$.

Intuitively, why is $\dfrac{\sum w_ix_i}{\sum w_i}$ a weighted average? You are summing $w_ix_i$ and $w_i$ separately. Thus you lose information, because the weight for $x_i$ doesn't appear. When you compute $\sum w_ix_i$ and $\sum w_i$, these end up as totals. They say nothing about the individual weights! And you can't recover the weights from just these sums!

Can a picture explain this?



There are 4 best solutions below

1
On

Suppose $x_1, \ldots, x_5$ are your grades (as percentages out of 100) for your $5$ homeworks in a class, $x_6$ is your grade on the midterm exam, and $x_7$ is your grade on the final exam.

In an unweighted average, each homework and exam is worth the same amount, so the unweighted average is $\frac{x_1 + x_2 + \cdots + x_7}{7}$. This is the weighted average formula with $w_1 = w_2 = \cdots = w_7 = 1$.

However, maybe the exams are worth a lot more than each homework. Maybe the midterm is worth $3$ times as much as a homework, and the final is worth $5$ times as much as a homework. Then a weighted average with $w_1 = \cdots = w_5 = 1$, $w_6=3$, and $w_7 = 5$ can account for this. $$\frac{x_1 + x_2 + x_3 + x_4 + x_5 + 3 x_6 + 5 x_7}{1 + 1 + 1 + 1 + 1 + 3 + 5}$$

You can see that getting a score of $90\%$ on a midterm is like getting $90\%$ on three separate homeworks instead.
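The grade example can be checked with a short Python sketch. The scores below are made up for illustration, and `weighted_average` is a hypothetical helper, not a library function. It compares the weighted formula against a plain average in which the midterm is listed 3 times and the final 5 times.

```python
def weighted_average(values, weights):
    """Return sum(w_i * x_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight


# Hypothetical scores: five homeworks, then the midterm, then the final.
scores = [80, 85, 90, 75, 95, 90, 70]
weights = [1, 1, 1, 1, 1, 3, 5]

weighted = weighted_average(scores, weights)

# The same data with the midterm repeated 3 times and the final 5 times,
# averaged with no weights at all.
expanded = scores[:5] + [scores[5]] * 3 + [scores[6]] * 5
unweighted_of_expanded = sum(expanded) / len(expanded)

print(weighted, unweighted_of_expanded)
```

Both numbers agree, which is the sense in which a weight of $3$ means "count this value three times".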

2
On

Derivation

Firstly, we need to understand the meaning of the weight terms, i.e., $w_i$. They are meant to represent the probability/influence/frequency of the value ($x_i$) on the final outcome (whether it be the nearest point, the chance of winning, or a pixel color).

i.e., $x = f(w_1, w_2,w_3, ... , x_1, x_2, x_3, ...)$

If all $x_i = K$ for some constant $K$, then we should have $x = K$.

Assume also that $f(w_1, w_2,w_3, ... , x_1, x_2, x_3, ...)$ is linear with respect to the $x_i$,

i.e., $x = f_1(w_1, w_2, w_3,...)x_1 + f_2(w_1, w_2, w_3,...)x_2 + ...$

Setting every $x_i = K$ (with $K \ne 0$) and dividing by $K$ then gives

$1 = f_1(w_1, w_2, w_3,...) + f_2(w_1, w_2, w_3,...) + ...$

If $w_i=0$ for all $i\ne j$, then $x = x_j$

$\implies f_j(0,0,w_i,..) = \left\{ \begin{array}{ll} 1 & \mbox{if $i = j$};\\ 0 & \mbox{if $i \ne j$}.\end{array} \right. $

There are many solutions that satisfy these equations.

A simple family of solutions is

\begin{equation} f_j(w_1, w_2, w_3,...) = \frac{w_j^n}{\sum_i w_i^n} \end{equation}

In most cases, $n$ is taken as $1$, which gives the usual coefficients $\frac{w_j}{\sum_i w_i}$.
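As a rough numerical check of the two conditions in this derivation, here is a small Python sketch. The function names `coefficients` and `combine` and the sample numbers are assumptions made for illustration only.

```python
def coefficients(weights, n=1):
    """Return f_j = w_j**n / sum_i(w_i**n) for each j."""
    powered = [w ** n for w in weights]
    total = sum(powered)
    return [p / total for p in powered]


def combine(values, weights, n=1):
    """Return x = sum_j f_j * x_j using the coefficients above."""
    return sum(f * x for f, x in zip(coefficients(weights, n), values))


# Condition 1: the coefficients sum to 1, so constant inputs give that constant.
coefficient_sum = sum(coefficients([2, 3, 5], n=2))
constant_result = combine([7, 7, 7], [2, 3, 5])

# Condition 2: if only one weight is nonzero, the result is that value.
single_result = combine([4, 9, 1], [0, 3, 0])

print(coefficient_sum, constant_result, single_result)
```

Up to floating-point rounding, the first two results are $1$ and $7$, and the third is $9$, the only value with a nonzero weight.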


Probability

Another name for an average is the expected value, i.e., the long-run average outcome once each known value is weighted by its probability/influence/frequency.

For example, tossing a fair coin results in either heads or tails.

$\implies P(heads) = \frac{\text{total number of heads}}{\text{total number of trials}}$

which is estimated by performing a large number of trials.

Suppose that if you toss a coin and it lands heads you win $\$3$, and if it lands tails you lose $\$1$.

Then the expected money that you will win for one toss is

$ P(heads)\times\$3 + P(tails)\times-\$1 = \$1 $

$ \frac{\text{total number of heads}\times\$3 + \text{total number of tails}\times-\$1}{\text{total number of trials}} = \$1 $

This is the same as $\frac{w(heads)\times \$3 + w(tails)\times -\$1}{w(heads) + w(tails)}$, where the weights $w(heads)$ and $w(tails)$ are the numbers of heads and tails.
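To see why the count-based form and the probability-based form agree, one can split the fraction term by term. Here $n_H$ and $n_T$ are shorthand introduced for this sketch (the numbers of heads and tails observed), not notation from the answer itself:

$$\frac{n_H \times \$3 + n_T \times (-\$1)}{n_H + n_T} = \frac{n_H}{n_H + n_T}\times \$3 + \frac{n_T}{n_H + n_T}\times(-\$1) = P(heads)\times \$3 + P(tails)\times(-\$1)$$

So each count divided by the total is exactly the weight attached to its value, which is where the individual weights sit inside $\frac{\sum_i w_ix_i}{\sum_i w_i}$.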


Intuition:

Intuitively, why's $\dfrac{\sum w_ix_i}{\sum w_i}$ Weighted Average? You are summing $w_ix_i$ and $w_i$ separately. Thus you lost information because the weight for $x_i$ doesn't appear. When you sum $\sum w_ix_i$ and $\sum w_i$, these end as totals. They inform nothing about weights! And you can't recover the weights for just these sums!

Firstly, we have to understand that we don't need to recover the weights. Weights are not meant to influence the final result directly.

They are only meant to "represent" the "relative influence" of each individual value ($x_i$). As long as they do this (represent influence), we don't need them (or rather, we shouldn't need them) to influence the final result directly.

For example, when you write 2*x + 3*y, the final result is influenced by x and y in the ratio 2:3. Then 4*x + 6*y should represent the same influence, since 4:6 = 2:3 (remember, relative influence).

So how do we generalize it? By normalizing by the total influence, i.e., $\sum_i w_i$

$$\frac{2*x+3*y}{2 + 3} = \frac{4*x+6*y}{4 + 6} = z$$
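A minimal Python sketch of this scale invariance, assuming arbitrary sample values for x and y (the numbers and the helper name `weighted_average` are illustrative only):

```python
def weighted_average(values, weights):
    """Return sum(w_i * x_i) / sum(w_i)."""
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)


x, y = 10.0, 20.0  # hypothetical values

z_small = weighted_average([x, y], [2, 3])  # weights in ratio 2:3
z_large = weighted_average([x, y], [4, 6])  # same ratio, doubled

print(z_small, z_large)  # both print 16.0
```

Only the ratio of the weights matters, because doubling every weight doubles both the numerator and the denominator.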

This normalization is very similar to the basics of probability. For example, suppose we toss a fair coin, winning \$5 for heads and losing \$3 for tails.

Then the expected total winnings over 4 tosses (2 heads and 2 tails on average) are

$2 \text{ tosses} \times \$5 + 2 \text{ tosses}\times -\$3 = \$4$

and the average amount won per toss is

$\$4 / 4 \text{ tosses} = (2 \text{ tosses} \times \$5 + 2 \text{ tosses}\times -\$3) / 4 \text{ tosses} = \$1$


I wrote this with only the purpose of understanding the intuition of the weighted average. If there are any errors in analysis or definition, please mention it.

2
On

One example where you would need a weighted average comes from probability: expected value is a (not very thinly disguised) weighted average. Take the example of a raffle where 1,000 tickets are sold for 5 dollars each, with 1 prize worth 500 dollars, 1 prize worth 200 dollars, 5 prizes worth 100 dollars, and 10 prizes worth 50 dollars. The expected value of playing this lottery should just be the "average" amount you win when you play, correct?

A very naive player might say, "Well, there are five options: either I win 500, I win 200, I win 100, I win 50, or I lose 5. So when I average those out, I get $$\frac{500 + 200 + 100 + 50 - 5}{5} = 169 \text{ dollars every time I play!}$$"

This player is a rafflemaker's dream, clearly. Their failure to weight these outcomes by their actual frequency is why their answer is so far off. Let's instead count each of these outcomes as often as it actually happens, and also account for the fact that in every case, a player loses the original 5 dollars they bought their ticket with. In every 1000 plays, on average:

  • you expect to net 495 dollars once (500 dollars, minus your ticket cost)
  • you expect to net 195 dollars once
  • you expect to net 95 dollars five times
  • you expect to net 45 dollars ten times
  • the other $1000 - 1 - 1 - 5 - 10 = 983$ times, you lose your initial stake of 5 dollars

So our more accurate average has 1000 terms in its numerator and a denominator of 1000. But in that numerator, a lot of terms are repeated (for instance, $-5$ appears 983 times!) and so it's easier just to weight each possible outcome by multiplying it by its frequency, like so: $$\frac{495 * 1 + 195*1 + 95*5 + 45*10 - 5*983}{1000} = -3.3 \text{ dollars.}$$

Once we realistically account for the frequency of each option, we see that we lose about 3 dollars and 30 cents on any given play. So the weighted average turns out to be the most natural and accurate representation of our expected winnings.
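The raffle arithmetic can be reproduced with a short Python sketch. The variable names are illustrative; the prize amounts and ticket counts are the ones given in the example.

```python
# (net winnings in dollars, number of tickets with that outcome)
outcomes = [(495, 1), (195, 1), (95, 5), (45, 10), (-5, 983)]

total_tickets = sum(count for _, count in outcomes)

# Weighted average: each net outcome counted as often as it occurs.
expected_net = sum(net * count for net, count in outcomes) / total_tickets

# The naive player's unweighted average over the five options they listed.
naive_options = [500, 200, 100, 50, -5]
naive_average = sum(naive_options) / len(naive_options)

print(total_tickets, expected_net, naive_average)
```

The frequencies act as the weights $w_i$ and their total, 1000, is $\sum_i w_i$.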

2
On

Here is an example from statistics.

The table shows the sales of sugar (in kilograms) during $10$ days: $$\begin{array}{c|c|c} \text{Sales of sugar (in kg)}, x & \text{Number of days}, f & \text{Percentage of days}, P(x)\\ \hline 0&1&0.1\\ 1&3&0.3\\ 2&4&0.4\\ 3&2&0.2\\ \hline &10&1 \end{array}$$ On $3$ days (or during $30\%$ of the $10$-day period) $1$ kilo of sugar was sold each day. Now we need to find the average sales during the $10$-day period.

Method 1. Convert the table data to raw data. Let's assume the following sales took place each day: $$3,0,2,2,2,1,3,1,1,2$$ So, the average sale is: $$\frac{\sum x}{n}=\frac{3+0+2+2+2+1+3+1+1+2}{10}=1.7$$

Method 2. Let's simplify the above expression by grouping equal values: $$\begin{aligned}\frac{\sum x}{n}&=\frac{0+1+1+1+2+2+2+2+3+3}{10}\\ &=\frac{0\cdot 1+1\cdot 3+2\cdot 4+3\cdot 2}{10}\\ &=0\cdot \frac{1}{10}+1\cdot \frac{3}{10}+2\cdot \frac4{10}+3\cdot \frac{2}{10}\\ &=0\cdot 0.1+1\cdot 0.3+2\cdot 0.4+3\cdot 0.2=1.7\end{aligned}$$

So, the sales figures ($x$) are the elements being averaged and the percentages of days ($P(x)$) are the weights. The larger the share of days on which a particular sales figure occurs, the greater its impact on the average sales figure.
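Both methods can be compared in a few lines of Python. This is only a sketch: the daily figures are one ordering consistent with the frequency table, and the variable names are made up for illustration.

```python
from collections import Counter

# One possible set of daily sales (kg) matching the table's frequencies.
daily_sales = [3, 0, 2, 2, 2, 1, 3, 1, 1, 2]
n = len(daily_sales)

# Method 1: plain average of the raw data.
method1 = sum(daily_sales) / n

# Method 2: group equal values, then weight each value by its share of days.
frequencies = Counter(daily_sales)  # maps sales value -> number of days
method2 = sum(x * (f / n) for x, f in frequencies.items())

print(method1, method2)
```

Up to floating-point rounding, both results equal 1.7, since grouping repeated values does not change the sum.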