Average weighted by inverse distance to median equal to median?

465 Views Asked by At

Problem Statement

I have a set of $N$ ordered elements such that $x = \{x_1, x_2, ..., x_q, x_p, ..., x_N\}$ where $x_q \le x_m \le x_p$ and $x_m$ is the median of the set $x$. I define a particular inverse weighted mean as

$$\langle x\rangle_I \equiv \frac{\sum_{i=0}^N \frac{x_i}{|x_i-x_m|}}{\sum_{i=0}^N \frac{1}{|x_i-x_m|}}$$

Essentially this is just a weighted mean where the weight is the inverse distance each point is from the median.

Prove that $\langle x \rangle_I = x_m$

Attempted Solution

I ran across this problem because I found someone who had done such a weighted average and realized they were only getting back the median of their set. I then quickly showed with a few scripts that every example I could generate showed the above statement to be true. So I endeavored to rigorously prove it.

Case N is odd

My first approach was to consider the case when $N$ was odd. This case, I think is trivial since one of the terms, say $x_j$, is actually the median, in which case you get that this average becomes

$$\langle x\rangle_I = \frac{\lim_{x_j\to x_m} \frac{x_j}{x_j-x_m} + \epsilon}{\lim_{x_j\to x_m} \frac{1}{x_j-x_m} + \epsilon}$$

where $\epsilon$ is the sum of all the remaining (insignificant) terms and thus they can be ignored. Then you have

$$\langle x\rangle_I = \frac{\lim_{x_j\to x_m} \frac{x_j}{x_j-x_m}}{\lim_{x_j\to x_m} \frac{1}{x_j-x_m}} = \lim_{x_j\to x_m}\frac{\frac{x_j}{x_j-x_m}}{\frac{1}{x_j-x_m}} = \lim_{x_j\to x_m} x_j = x_m$$

General case

Here is where I run into some difficulties. I don't think I can rely on the same tricks as I did when $N$ was exclusively odd. I can do the following chain though.

$$\langle x\rangle_I = \frac{\sum_{i=0}^N \frac{x_i}{|x_i-x_m|}}{\sum_{i=0}^N \frac{1}{|x_i-x_m|}} = x_m \frac{\sum_{i=0}^N \frac{x_i/x_m}{|x_i-x_m|}}{\sum_{i=0}^N \frac{1}{|x_i-x_m|}}$$

If the statement $\langle x\rangle_I = x_m$ is true, then from the last step the fraction must be unity and then following must be true.

$$\sum_{i=0}^N \frac{x_i/x_m}{|x_i-x_m|} = \sum_{i=0}^N \frac{1}{|x_i-x_m|}$$

It is at this point where I'm not sure where to go. I've written out this for a few specific cases ($N = 2, 3, ...$ etc.) and it was true for those cases. What's more generating random data sets and running a few quick scripts shows that these two terms are equal. I just want the rigorous proof now. I did think to remove the denominators by getting a common denominator in every term. I can do that by the following.

$$\sum_{i=0}^N \Big(\frac{x_i/x_m}{|x_i-x_m|} \prod_{j\ne i}\frac{|x_j-x_m|}{|x_j-x_m|}\Big) = \sum_{i=0}^N \Big(\frac{1}{|x_i-x_m|} \prod_{j\ne i}\frac{|x_j-x_m|}{|x_j-x_m|}\Big)$$

$$\sum_{i=0}^N \frac{x_i/x_m \prod_{j\ne i}|x_j-x_m|}{\prod_{j=0}^N|x_j-x_m|} = \sum_{i=0}^N \frac{\prod_{j\ne i}|x_j-x_m|}{\prod_{j=0}^N|x_j-x_m|}$$

Now I cancel the denominators and I'm left with

$$\sum_{i=0}^N \frac{x_i}{x_m} \prod_{j\ne i}|x_j-x_m| = \sum_{i=0}^N \prod_{j\ne i}|x_j-x_m|$$

I don't know if this statement is easier to prove that the one above. In any case, I've reached a point where I'm not sure on the best course of action. Can anyone point me towards a proof?

Point to note

I'm fairly certain the problem statement is only true if $x_m$ is the median of the set. If it is any other number, it is easy to show the statement is not true because you can readily find a counter-example. That implies to me that one must use the fact that $x_m$ is the median in the proof.

1

There are 1 best solutions below

0
On BEST ANSWER

After working at this for a few days, I think I got a good proof. To start, the proof for the case of $N$ being odd still holds. Rather than trying to prove the general case, I found a proof for the case of $N$ being even. I will start with the result at the end of the question.

$$\sum_{i=1}^N \frac{x_i}{x_m} \prod_{j\ne i}|x_j-x_m| = \sum_{i=1}^N \prod_{j\ne i}|x_j-x_m|$$

From here, to help see the next step, I will write out the sum and product terms in a general way.

$$\begin{multline} \begin{split} &\frac{x_1}{x_m}(x_m-x_2)...(x_m-x_q)(x_p-x_m)...(x_N-x_m)+\\ &\frac{x_2}{x_m}(x_m-x_1)(x_m-x_3)...(x_m-x_q)(x_p-x_m)...(x_N-x_m)+...\\ &\frac{x_q}{x_m}(x_m-x_1)...(x_m-x_{q-1})(x_p-x_m)...(x_N-x_m)+\\ &\frac{x_p}{x_m}(x_m-x_1)...(x_m-x_q)(x_{p+1}-x_m)...(x_N-x_m)+...\\ &\frac{x_N}{x_m}(x_m-x_1)...(x_m-x_q)(x_p-x_m)...(x_{N-1}-x_m)\\ &=\\ &(x_m-x_2)...(x_m-x_q)(x_p-x_m)...(x_N-x_m)+\\ &(x_m-x_1)(x_m-x_3)...(x_m-x_q)(x_p-x_m)...(x_N-x_m)+...\\ &(x_m-x_1)...(x_m-x_{q-1})(x_p-x_m)...(x_N-x_m)+\\ &(x_m-x_1)...(x_m-x_q)(x_{p+1}-x_m)...(x_N-x_m)+...\\ &(x_m-x_1)...(x_m-x_q)(x_p-x_m)...(x_{N-1}-x_m) \end{split} \end{multline}$$

Now to examine the left hand side of this equation, careful inspection shows some terms that can conveniently cancel. To see this, use the fact that $N$ is even and each term in the sum pairs up with another term. The first term (the $x_1/x_m$ term) pairs with the last term (the $x_N/x_m$ term), the second term ($x_2/x_m$) pairs with the second to last ($x_{N-1}/x_m$), etc. Note that for the first pair, you can distribute $x_1/x_m$ into the last term in the product and $x_N/x_m$ can distribute into the first term in the project. If you look at just this first pair, you get the result

$$\begin{multline} \begin{split} &\frac{x_1x_N}{x_m}(x_m-x_2)...(x_m-x_q)(x_p-x_m)...(x_{N-1}-x_m)+\\ &-x_1(x_m-x_2)...(x_m-x_q)(x_p-x_m)...(x_{N-1}-x_m)+...\\ &-\frac{x_1x_N}{x_m}(x_m-x_2)...(x_m-x_q)(x_p-x_m)...(x_{N-1}-x_m)+\\ &x_N(x_m-x_2)...(x_m-x_q)(x_p-x_m)...(x_{N-1}-x_m)\\ &=\\ &... \end{split} \end{multline}$$

Note how two of the terms cancel here. In each pair of terms, there will be a set of terms which cancels. After canceling the relevant terms, you get the following result.

$$\begin{multline} \begin{split} &-x_1(x_m-x_2)...(x_m-x_q)(x_p-x_m)...(x_{N-1}-x_m)+...\\ &x_N(x_m-x_2)...(x_m-x_q)(x_p-x_m)...(x_{N-1}-x_m)\\ &=\\ &... \end{split} \end{multline}$$

With this result, you can write the terms in a compact form as

$$\sum_{i=1}^N (-1)^f x_i \prod_{\substack{j\ne i\\ j\ne N+1-i}}|x_j-x_m|=...$$

where $f = 1$ if $i \le q$ and $f = 0$ if $i \ge p$.

From this point, if one applies the same process to the right hand side of the equation above listing out all the terms, you'll find you get the exact same result. Pairing and cancelling the proper terms yields the following equation

$$\sum_{i=1}^N (-1)^f x_i \prod_{\substack{j\ne i\\ j\ne N+1-i}}|x_j-x_m|=\sum_{i=1}^N (-1)^f x_i \prod_{\substack{j\ne i\\ j\ne N+1-i}}|x_j-x_m|$$

This itself is the final result. Because it was shown that both sides were equal, then we know from the question, that $\langle x \rangle_I = x_m$ is true in the case when $N$ is even. Since it is also true for when $N$ is odd, this theorem is proven completely. I know some of this proof is incomplete, but going through it will prove to you that it works.

Point to note

As stated in the problem, the solution should require that $x_m$ be the median, otherwise the mathematical statement is not true. This proof does indeed require that $x_m$ be the median because it relies on the fact that there are the same number of points in the set $x$ which are less than $x_m$ as there are which are greater than $x_m$. This necessarily implies that $x_m$ is the median.