Estimating the mean with least median of squares target

437 Views Asked by At

I have a set of real numbers $x_1,\ldots,x_n$. I would like estimate the mean of the set, $y$, such that the median of squares will be minimal : $\operatorname{median}(x_i-y)^2$

Is there a closed form formula for this problem or a series of steps to finding $y$? How do I prove it provides the minimum for this function?

To stress out - I'm not estimating really - I need an accurate answer that minimizes the median of the square distances.

Thanks

2

There are 2 best solutions below

0
On

I'd be willing to bet that there is no closed form solution to this problem since it involves sorting. I don't think there is an algorithm involving some fixed number of steps either. However, it would usually be possible to do this using numerical methods.

Here's an example using the Python scipy software.

sample = [ 3, 5, 9, 10, 13, 17, 19, 22 ]

from scipy.optimize import minimize_scalar

def f ( x ) :

global sample

squ_dist = [ pow ( s - x, 2 ) for s in sample ]

if len ( sample ) % 2 == 0 :

return (squ_dist[len(squ_dist)/2]+squ_dist[len(squ_dist)/2+1])*0.5

else :

return squ_dist[len(squ_dist)/2]

res = minimize_scalar ( f )

res.x

14.999999999999993

... meaning that 15 would be the estimate for the mean in this sample. The software is using Brent's method of optimisation to get a minimum.

0
On

Consider for simplicity that $n$ is odd, and let $m=(n+1)/2$; assume also that the values are all distict and sorted. Now, the values $(x_i -y)^2$ have the same ordering as $|x_i -y|$, and problem is equivalent to minimize the median of $|x_i -y|$

Suppose that we pick a tentative $y$. Let $A(y) \subset \{1,2 \cdots n\}$ be the set of indexes of the $m$ nearest neighbours of $y$; let $j \in A(y)$ be the index of the farthest element, so that $z =|x_j - y|$ is our target value to minimize. In general, we should be able to improve our initial $y$ moving it towards $x_j$, as long as $A(y)$ does not change, nor $j$. $A(y)$ will change when there is a tie, between the farthest and the next-to-farthest element. The minimum must occur at one such point. Then, our search for the minimum must check only $(n+1)/2$ alternatives. Assuming the list is ordered, then:

$$y=\min \frac{x_i+x_{i+m-1}}{2}$$

Two examples

enter image description here

enter image description here

What's is left is to attack the even $n$ case, and the case of repeated values.