Optimization problem where the objective function not differentiable

190 Views Asked by At

$$\min_{x\in \mathbb{R}^n}\max_{i=1 \dots m}\{\langle\nabla f_i({\overline{x}}),x-\overline{x}\rangle\}+\sigma\|x-\overline{x}\|^2 \tag 1,$$

where $F : \mathbb{R}^n \to \mathbb{R}^m$, $F(x)=(f_{1}(x), \dots,f_{m}(x))$. I tried to solve this problem, but could not finish it. Could someone please help me?


This is part of my solution:

Solve problem (1) is equivalent to solve \begin{align} \min\limits_{x,z}&\quad z&\hspace{4cm} (2)& \\ \text{s.t}& \quad \langle \nabla f_i({\overline{x}}),x-\overline{x}\rangle+\sigma\|x-\overline{x}\|^2 \leq z&& \end{align} the Langragian dual problem associated with $(2)$ is:

$$\max_{\mu\geq 0}q(\mu)\tag 3,$$ with $$q(\mu)=\inf L(x,z,\mu)$$ and $$L(x,z,\mu)= z+\sum_{i=1}^m\mu_i(\langle \nabla f_i({\overline{x}}),x-\overline{x}\rangle+\sigma\|x-\overline{x}\|^2-z)$$. the problem (3) is equivalent to solve $$\max_{\mu\geq 0} \min_{x,z}\left(1-\sum\limits_{i=1}^m \mu_i\right)z + \sum \limits_{i=1}^m \mu_i(\langle\nabla f_i(\overline{x}),x-\overline{x}\rangle + \sigma\|x-\overline{x}\|^2), $$ $$(x^\ast,z^\ast)=\operatorname*{arg min}_{x,z} \left(1-\sum\limits_{i=1}^m \mu_i\right)z+\sum_{i=1}^m \mu_i(\langle \nabla f_i(\overline{x}),x-\overline{x}\rangle + \sigma\|x-\overline{x}\|^2),$$ $$\nabla_{x,z} L(x^\ast,z^\ast,\mu^\ast)=0$$ this is $$\begin{pmatrix} \sum\limits_{i=1}^m\mu_i^\ast\nabla f_i(\overline{x}) + \sum\limits_{i=1}^m 2\mu_i^\ast \sigma(x^\ast-\overline{x}) \\ 1-\sum\limits_{i=1}^m\mu_i^\ast \\ \end{pmatrix}=\begin{pmatrix} 0\\ 0 \end{pmatrix}$$ This problem has a solution when: $$\sum\limits_{i=1}^{m}\mu_{1}=1$$ and $$x^{\ast}=-\frac{\sum\limits_{i=1}^m\mu_i\nabla f_i(\overline{x})}{2\sigma\sum\limits_{i=1}^m\mu_i}+\overline{x}$$ $$x^{\ast}=-\frac{\sum\limits_{i=1}^m\mu_i\nabla f_i(\overline{x})}{2\sigma} + \overline{x}.$$ I do not know what else to do, what would be the solution to my original problem? and what would be the value of $z^{\ast}$?

2

There are 2 best solutions below

0
On

Let $C = \operatorname{co} \{ \nabla f_k( \bar{x} ) \}$ and observe that $C$ is a compact convex set. Note that the problem is equivalent to $\min_h \max_{\xi \in C} \langle \xi, h \rangle + \sigma \|h\|^2$.

A little work shows that a solution must lie in a compact convex set (because of the $\|h\|^2$ term) and the map $(h,\xi) \to \langle \xi, h \rangle + \sigma \|h\|^2$ is convex in $h$ and concave (in fact, linear) in $\xi$.

We can apply von Neumann's minimax theorem to exchange the $\min, \max$ to get $\max_{\xi \in C} \min_h \langle \xi, h \rangle + \sigma \|h\|^2$.

We see that the inner problem has a unique solution $\xi = -2 \sigma h$, and so the problem reduces to $\max_{\xi \in C} -{1 \over 4 \sigma} \|\xi \|^2 = - {1 \over 4 \sigma} \min_{\xi \in C} \|\xi \|^2$.

In particular, the solution is given by the unique point of minimal norm in $C$.

Adjusting $\sigma$ has an effect similar to a step size, except that increasing $\sigma$ has a similar effect to reducing the stepsize.

0
On

Let $\Delta_m$ be the $m$-simplex and note that the maximum of a linear function on $\Delta_m$ is attained on a vertex (this is elementary, and dates back to Nash). Thus $$\max_{i} \langle \nabla f_i(\bar{x}), x-\bar{x}\rangle = \max_{z \in \Delta_m}\langle \nabla F(\bar{x})^T(x-\bar{x}),z\rangle = \max_{z \in \Delta_m}\langle x-\bar{x},\nabla F(\bar{x})z\rangle. $$ Thus by Sion's minimax theorem, one has

$$ \begin{split} \min_{x \in \mathbb R^n}\max_{i} \langle \nabla f_i(\bar{x}), x-\bar{x}\rangle + \frac{\sigma}{2}\|x-\bar{x}\|_2^2 &= \min_{x \in \mathbb R^n}\max_{z \in \Delta_m}\langle x-\bar{x},\nabla F(\bar{x})z\rangle\\ &= \max_{z \in \Delta_m}\min_{x \in \mathbb R^n}\langle x-\bar{x},\nabla F(\bar{x})z\rangle + \frac{\sigma}{2}\|x-\bar{x}\|_2^2\\ &= \max_{z \in \Delta_m} \langle -\frac{1}{\sigma}\nabla F(\bar{x})z,\nabla F(\bar{x})z\rangle + \frac{\sigma}{2}\|\frac{1}{\sigma}\nabla F(\bar{x})\|_2^2\\ &= \max_{z \in \Delta_m} -\frac{1}{2\sigma}\|\nabla F(\bar{x})z\|_2^2 = -\frac{1}{2\sigma}\min_{u \in C}\|u\|_2^2, \end{split} $$

where $C := \{z_i\nabla f_i(\bar{x}) | z \in \Delta_m\} =: \operatorname{co}(\nabla F(\bar{x})$.