Why geometric median cannot be solved analytically

806 Views Asked by At

$\DeclareMathOperator*{\argmin}{argmin}$ For a given set of $m$ points $x_1,...,x_m$ with each $x_i\in \mathbb{R}^n$, the geometric median (or the weber point) is defined as $$\argmin\limits_{y \in \mathbb{R^n}} \sum_{i=1}^{m} \|x_i-y \|_2 $$ I am asking for an explanation for why this optimization problem cannot be solved analytically. Also, I am asking for a explanation about how to solve it numerically.

1

There are 1 best solutions below

2
On BEST ANSWER

Asking the question "why this optimization problem cannot be solved analytically" seems to me problematic. For one thing, I'm unaware if anyone has even attempted to prove that a problem of this form cannot be solved analytically. [EDIT: I am now! See Semiclassical's link in the comment below; it does happen to be proven in this particular case that no analytic solution exists for $m\geq 5$.] But that seems like a rather odd thing to attempt to prove anyway. Rather, it ought to satisfy you simply that we don't know how to solve it analytically. If you or someone else discovers a way, great! Publish it! But short of that, asking "why" seems akin to confusing an absence of evidence with evidence of absence.

Furthermore, it really should not be surprising that any given optimization model fails to admit an analytic solution. Problems that do are the exception; and if you find one, you had better not want to change it in any way lest you break that precious property. For example, we can solve a pure least squares problem by solving a single linear system in $O(n^3)$ operations. But the minute we add nearly any complicating factor---for example, selecting a different norm than the $\ell_2$ norm, adding an $\ell_1$ regularizer, bounds on the variables, etc.---we can no longer rely on our analytic approach.

As for the problem itself, it is a rather simple form of a second-order-cone program. A variety of solvers out there can handle such a problem reliably. If you're willing to use MATLAB, my software CVX make specifying the problem simple. If we assume that the vectors $x_i$ are stacked into an $n\times m$ matrix X, the model is simply

cvx_begin
    variable y(n)
    minimize(sum(norms(X-y(:,ones(1,m))))
cvx_end