Minimizing a function related to "The Median Minimizing the Sum of Absolute Deviations"

230 Views Asked by At

The function $f:\mathbb{R}^n\rightarrow\mathbb{R}$ to minimize has the following form:

$$f(x)=\displaystyle\sum_{i=1}^n\sum_{j=1}^n|s_{ij}-x_ix_j|$$

where the $s_{ij}$'s are given real numbers between $0$ and $M>0$.

In order to find the least value of $f$, I want to apply a grid search by discretizing every variable $x_i$.

My question is the following: in order to restrict my grid search, is it possible to find a lower and an upper bound on $x_i^{\ast}$ for an optimal solution $x^{\ast}$?

I suspect that we have $0\leq x_i^{\ast} \leq M$ or maybe even better $0\leq x_i^{\ast} \leq \sqrt{M}$ but I am not able to prove it.

It is clear that for the well-studied one-dimensional function $f(x)=\displaystyle\sum_{i=1}^n|s_{i}-x|$, we have $\min_i(s_i)\leq x^{\ast} \leq \max_i(s_i)$ since $x^{\ast}$ is the median of the $s_i$'s.

Thank you very much!

1

There are 1 best solutions below

0
On

Partial Answer: Note that $\dfrac{d}{dx}|x|=\operatorname{sgn}(x),$ the signum function: $$\operatorname{sgn}(x)=\begin{cases}-1,&x<0\\0, &x=0\\ +1,&x>0\end{cases}. $$ We do not need to worry overmuch about the lack of differentiability at the origin.

Let $1\le k\le n.$ We can rewrite $f$ carefully as \begin{align*} f(x)&=\sum_{i\not=k}\left[\sum_{j\not=k}|s_{ij}-x_ix_j|+|s_{ik}-x_ix_k|\right] +\sum_{j\not=k}|s_{kj}-x_kx_j|+|s_{kk}-x_k^2|. \end{align*} We take the partial derivative $\partial f/\partial x_k$ and simplify: \begin{align*} \frac{\partial f}{\partial x_k}&= \sum_{i\not=k}\operatorname{sgn}(s_{ik}-x_ix_k)(-x_i)+\sum_{j\not=k}\operatorname{sgn}(s_{kj}-x_kx_j)(-x_j)+\operatorname{sgn}(s_{kk}-x_k^2)(-2x_k). \end{align*} We set this equal to zero: $$\sum_{i\not=k}\operatorname{sgn}(s_{ik}-x_ix_k)x_i+\sum_{j\not=k}\operatorname{sgn}(s_{kj}-x_kx_j)x_j+2\operatorname{sgn}(s_{kk}-x_k^2)x_k=0. $$ We can simplify this a bit by noting that the last term can be added to the first two sums, counting the $s_{kk}$ term twice, as needed: $$\sum_{i=1}^n\operatorname{sgn}(s_{ik}-x_ix_k)x_i+\sum_{j=1}^n\operatorname{sgn}(s_{kj}-x_kx_j)x_j=0. $$ Now, there are $n$ of these equations, so you might be able to reason from that. (Note that this same sort of derivation shows the minimum of the one-dimensional function you quoted to be at the median.) On the other hand, I'm not sure we're guaranteed that all the equations will hold at the minimum.

Hopefully, this will give you an idea, but I'm not sure where to go from here, or even if this is a fruitful line of inquiry.