Optimization: max to softmax for convexity?


Assume we have the following optimization problem: for a family of $m$ vectors $\{x_i\}\subset \mathbb{R}^n$, a family of $l$ vectors $\{c_i\}\subset \mathbb{R}^n$ with $l\ll m$, and a family of $l$ variables $\{ a_i\} \subset \mathbb{R}$, consider \begin{align} \arg \max_{i \in \{1,\ldots,l\}} \quad -a_i(x-c_i)^T(x-c_i) \qquad(1) \end{align} or equivalently \begin{align} \arg \min_{i\in\{1,\ldots,l\}} \quad a_i(x-c_i)^T(x-c_i) \qquad(2) \end{align} and assume for the moment that all $a_i=1$. This can be interpreted as: pick the $i$ which minimizes the distance of the vector $x$ from the known vector $c_i$. This is easy enough.
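For concreteness, problem (2) is just a (weighted) nearest-center rule; here is a minimal NumPy sketch (the function name and array layout are my own choices, not part of the problem statement):

```python
import numpy as np

def nearest_center(x, centers, a):
    """Solve (2): return the index i minimizing a_i * (x - c_i)^T (x - c_i).

    x       : (n,) query vector
    centers : (l, n) array whose rows are the c_i
    a       : (l,) weights
    """
    diffs = centers - x                             # rows are c_i - x; same norm as x - c_i
    sq_dists = np.einsum('ij,ij->i', diffs, diffs)  # squared distances ||x - c_i||^2
    return int(np.argmin(a * sq_dists))

# With all a_i = 1 this reduces to picking the nearest center:
x = np.array([0.9, 1.1])
centers = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
print(nearest_center(x, centers, np.ones(3)))  # -> 1 (closest to [1, 1])
```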

If we view the above as a Gaussian mixture, then one can take the weights $a_i$ to belong to the interval $[0,1]$ with $$a_1+\ldots+a_l=1.$$

The question I have is how to recast (1) or (2) as a constrained optimization problem properly. One idea I read about online is that one can consider softmax (more precisely, log-sum-exp) as a smooth approximation to max. For a vector $y = (y_1,\ldots,y_l)$, $$ {\rm softmax}(y) = \log \sum_{i=1}^l \exp(y_i). $$ Setting $y_i \equiv -a_i(x-c_i)^T(x-c_i)$, does \begin{align} {\rm softmax}(y) &= \log\left[ \exp(y_1)+\exp(y_2)+\ldots + \exp(y_l)\right] \\ &= \log\left[ e^{-a_1(x-c_1)^T(x-c_1)}+\ldots + e^{-a_l(x-c_l)^T(x-c_l)}\right] \end{align} correspond to a smooth version of (1)? And if so, can it be used to solve the optimization problem with the constraint $\sum_{i=1}^l a_i=1$, and how? How would I solve this problem (in principle)? If not, where is the fallacy in my question?
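To see in what sense log-sum-exp smooths the max, here is a small numerical sketch (the temperature parameter $t$ is my addition; it controls how tight the approximation is, via the standard bound $\max_i y_i \le \frac{1}{t}\log\sum_i e^{t y_i} \le \max_i y_i + \frac{\log l}{t}$):

```python
import numpy as np

def smooth_max(y, t=1.0):
    """Log-sum-exp with temperature t: (1/t) * log(sum_i exp(t * y_i)).

    For any t > 0 this is a smooth upper bound on max(y), and it
    tends to max(y) as t -> infinity.
    """
    z = t * np.asarray(y, dtype=float)
    m = z.max()  # subtract the max before exponentiating, for numerical stability
    return (m + np.log(np.exp(z - m).sum())) / t

y = np.array([-2.0, -0.5, -3.0])   # e.g. the values -a_i (x - c_i)^T (x - c_i)
print(max(y))                      # -0.5
print(smooth_max(y, t=1.0))        # slightly above -0.5
print(smooth_max(y, t=100.0))      # very close to -0.5
```

Note that this smooths the *value* $\max_i y_i$, not the *index* $\arg\max_i y_i$; smoothing the argmax instead leads to the softmax weights $e^{y_i}/\sum_j e^{y_j}$, which is exactly the posterior responsibility in the Gaussian-mixture view.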