How to show $f(z) = \sum_{j=1}^m \min_{\ell = 1, \cdots, q}\| x^j - z^\ell \|$ is NOT convex on $\mathbb{R}^n \times \cdots \times \mathbb{R}^n$

121 Views Asked by At

Let $n, m \in \mathbb{N}$, $q \geq 2$, and $x^j \in \mathbb{R}^n$ for $j=1, \cdots, m$.

How can we show that the following function $f$ is not convex? $$ f : \mathbb{R^{nq}} = \mathbb{R}^n \times \cdots \times \mathbb{R}^n \to \mathbb{R} \\ f(z) = f(z^1, \cdots, z^q) := \sum_{j=1}^m \min_{\ell = 1, \cdots, q}\| x^j - z^\ell \| = \left\| \begin{pmatrix} \min_{\ell = 1, \cdots, q} \| x^1 - z^\ell \| \\ \vdots \\ \min_{\ell = 1, \cdots, q} \| x^m - z^\ell \| \end{pmatrix} \right\|_1 $$


My work/thoughts so far:

  • The problem is the $\min$ functions, since $f$ would be convex on $\mathbb{R}^{nq}$ if the the $\ell(j)$ were already known (since we'd just have a sum of convex functions).

  • I've shown that $f$ is not complex under certain simplifying circumstances. Namely, if $x^j = 0$ for $j= 1, \cdots, m$ and we let $n=1$, $q = 2$. Then we can take $z := (6,8)^T$, $w := (4,2)^T$, and $\lambda := 1/2$, which yields $$ f(\lambda w + (1- \lambda) z) = 5m > 4m = \lambda f(w) + (1-\lambda) f(z). $$ That is, I've shown (Assumptions $\nRightarrow$ $f$ convex). But that's not the same as showing (Assumptions $\Rightarrow$ $f$ NOT convex), which is the goal.

  • Also, using the reverse triangle inequality, I can show that for all $\lambda \in (0,1)$, $z = (z^1, \cdots, z^q),w= (w^1, \cdots, w^q) \in \mathbb{R}^{nq}$ $$ f(\lambda z + (1-\lambda)w) \geq \lambda f(z) + (\lambda - 1) f(w) $$ But, that's also not what I want.

Provenance of exercise:

  • This problem was left as an exercise to the reader in an optimization text I'm reading (Grundzüge der Globalen Optimierung by Stein).
  • It's part of an ongoing example about finding $q$ cluster centers (the $z^j$, $j=1, \cdots, q$) among among a data cloud (the $x^j$, $j=1, \cdots, m$). Here $f$ is reasonable choice for the target function that we'd like to minimize.
2

There are 2 best solutions below

1
On BEST ANSWER

Let $Z\in\mathbb R^n$ be a point such that $\|x^j-Z\|,\|x^j-2Z\|>\|x^j\|$ for each $j,$ for example $Z=(1+2\max_j\|x^j\|,0,0,\dots,0).$ Then

$$f(0,2Z,2Z,\dots,2Z)=\sum_{j=1}^m\|x^j\|$$ and $$f(2Z,0,0,\dots,0)=\sum_{j=1}^m\|x^j\|$$ but $$f(Z,Z,Z,\dots,Z)>\sum_{j=1}^m\|x^j\|$$ which contradicts midpoint convexity of $f.$

I came to this argument by first considering the case where all the $x^j$ are equal, then generalizing.

0
On

Yesterday I again lost Internet connection, so I wrote my answer offline and didn’t see Dap’s answer. It is a bit similar but more simple so I vote to award him by the bounty.

Your example can be generalized for the general case. Namely, take $M=1+\max_{j=1,\dots,m}\|x^j\| $ and put $z:=2M(6,8,\dots,8)^T$, $w:=2M(4,2,\dots,2)^T$, and $\lambda=1/2$. Then for each $j=1,\dots,m$ we have

$$\min_{\ell=1,\dots, q}\|x^j-z^\ell\|\le \min_{\ell=1,\dots, q}\|x^j\|+\|z^\ell\|= \|x^j\|+\min_{\ell=1,\dots, q}\|z^\ell\|\le M-1+12M<13M.$$

So $f(z)<13Mm$. Similarly $f(w)<5Mm$. So $\lambda f(w)+(1−\lambda)f(z)<9Mm$. On the other hand, if $v=\lambda w+(1−\lambda)z=2M(5,5,\dots,5)$ then for each $j=1,\dots,m$ we have

$$\min_{\ell=1,\dots, q}\|x^j-v^\ell\|\ge \min_{\ell=1,\dots, q}\|v^\ell\|-\|x^j\|= -\|x^j\|+\min_{\ell=1,\dots, q}\|v^\ell\|\ge 1-M+10M>9M.$$

So $$f(\lambda w+(1−\lambda))=f(v)>9Mm>\lambda f(w)+(1−\lambda)f(z).$$