Extension of median minimizing the sum of absolute deviations (the $L_1$ norm)

403 Views Asked by At

This is an extension of the question asked in The Median Minimizes the Sum of Absolute Deviations (The $ {L}_{1} $ Norm) . Except with the extra constraint that $x \in S$.

The solutions provided there doesn't seem to work when the cardinality of the set is even (if it is odd, the solution is the median again). I'm particularly interested in the optimization approach (the one in Royi's answer).

With the above constraint, we now have a constraint optimization problem rather than an unconstrained problem:

$$ \arg \min_x \sum_{i=1}^n |s_i - x| \\ \text{subject to: } x \in S $$

I'm pretty certain that the solution to this problem, when the size of the set is even, is that $x$ can either be the $floor(n/2)$-th or $ceiling(n/2)$-th element of the SORTED set, where $n$ is the size of the size. For example, say $S = \{1,2,3,4,5,6,7,8\}$, so $n=8$. I believe $x$ can be 4 or 5 in this case.

Using Royi's approach, I am not sure how to account for the constraint. Would lagrange multipliers work for a constraint like $x \in S$?

1

There are 1 best solutions below

2
On

The item which minimizes the function is called the Medoid with respect to the distance by the absolute deviation in your case.

For scalars (Namely the case $ {s}_{i} \in \mathbb{R} $) the solution can be found in O(n) by the Quick Select Algorithm.

In order to achieve a formal solution I'd go these steps:

  1. The Objective Function Is Convex
    The objective function is a convex function hence there is a unique minimum value (Not argument).
  2. The Objective Function Is Symmetric
    The objective function is symmetric around its minimizer(s).
    Namely being away from the minimizer by s > t means the value for t is better.

Integrating (1) and (2) means that we can use the solution with no constraints (Median) and the find the closest one to it. On a sorted list it becomes obvious.