Solve $\min_{x} \frac{1}{N}\sum_{i=1}^{N} f_i(x_i) + g(x)$ s.t. $Ax = b$, where $x = [x_1,\ldots,x_N]^T$ and $A \in \mathbb{R}^{M \times N}$.


Consider the optimization problem: \begin{equation*} \begin{aligned} & \underset{ x \in \mathbb{R}^N }{\text{minimize}} & & \frac{1}{N} \sum_{i=1}^{N} f_i\left(x_i\right) + g(x) \\ & \text{subject to} & & A x = b \,, \end{aligned} \end{equation*} where $x = [x_1,\ldots,x_N]^T$, $A \in \mathbb{R}^{M \times N}$, each $f_i$ is $m$-strongly convex and $L$-smooth, and $g(x)$ is twice differentiable.

Question(s):

  • How can one solve such a problem when, say, $N \geq 10^{10}$? Can we still use a second-order method, e.g., Newton's method, at such a scale?
  • (I assume that for $N \leq 5000$ a Newton-like method is practical, right?)

I am sure there are many methods for problems of this form, but I would be happy to hear a few from you experts, for learning purposes.

ADD: Assume additionally that $M \leq N$, so $A$ is a fat (wide) matrix.


Maybe solve this problem in two steps?
1) Add the "fictive" constraint $g(x) = y$ to your original problem and obtain the conditional solution $x^*(y)$.
EDIT: This gives:
\begin{equation*} \begin{aligned} & \underset{ x \in \mathbb{R}^N }{\text{minimize}} & & \frac{1}{N} \sum_{i=1}^{N} f_i\left(x_i\right) + y \\ & \text{subject to} & & A x = b \text{ and } g(x) = y. \end{aligned} \end{equation*} which yields $x^*(y)$. Note that $A x^*(y) = b$ and $g(x^*(y))=y.$
2) Then, relax the "fictive" constraint and minimize over $y$: \begin{equation*} \begin{aligned} & \underset{ y \in \mathbb{R} }{\text{minimize}} & & \frac{1}{N} \sum_{i=1}^{N} f_i\left(x_i^*(y)\right) + y \end{aligned} \end{equation*}

This yields the overall solution(s) $x^{**}$, which also satisfy(ies) the requirement $y = g(x^{**})$.
For existence and uniqueness of the solution, however, it would help to know a bit more about $f$ and $g$ (are they monotone, convex, single-peaked?).
Of course, the equivalence between the one-step and the two-step procedure does not hold in all situations, but under suitable assumptions on $f$ and $g$ they are equivalent.
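To make the two-step procedure concrete, here is a sketch on a tiny instance with my own stand-in choices $f_1(x_1) = (x_1-3)^2$, $f_2(x_2) = (x_2-1)^2$, $g(x) = \|x\|^2$ (none of these come from the original post). Step 1 is a constrained solve for $x^*(y)$; step 2 is a one-dimensional search over $y$:

```python
import numpy as np
from scipy.optimize import minimize, minimize_scalar

# Tiny hypothetical instance: N = 2, one constraint x1 + x2 = 2
A = np.array([[1.0, 1.0]])
b = np.array([2.0])
c = np.array([3.0, 1.0])
N = 2

def F(x):                 # (1/N) sum_i f_i(x_i) with f_i(x_i) = (x_i - c_i)^2
    return np.sum((x - c) ** 2) / N

def g(x):
    return np.sum(x ** 2)

x0 = A.T @ np.linalg.solve(A @ A.T, b)   # min-norm feasible point, here (1, 1)
y_min = g(x0)                            # smallest attainable g on {A x = b}

def inner(y):
    # Step 1: x*(y) = argmin F(x)  s.t.  A x = b  and  g(x) = y
    cons = [{"type": "eq", "fun": lambda x: A @ x - b},
            {"type": "eq", "fun": lambda x, y=y: g(x) - y}]
    start = x0 + 0.1 * (c - x0)          # nudge off x0 to pick a descent branch
    return minimize(F, start, method="SLSQP", constraints=cons)

def outer(y):
    # Step 2: relaxed objective F(x*(y)) + y, minimized over the scalar y
    return inner(y).fun + y

res = minimize_scalar(outer, bounds=(y_min + 1e-6, y_min + 10.0),
                      method="bounded")
x_star = inner(res.x).x
print(res.x, res.fun)   # y* = 20/9, optimal value = 11/3 (found by hand too)
```

On this instance the one-step solve gives $x^{**} = (4/3, 2/3)$ with $g(x^{**}) = 20/9$, matching the two-step result, which illustrates the claimed equivalence under convexity.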


Your case fits general consensus ADMM with regularization, as in Ryan Tibshirani's Dual Methods and ADMM (see slide 27).


In your case, convert the constraint into an indicator function and then use it as part of the $f_i$ functions.

In your case $ \boldsymbol{a}_{i} = \boldsymbol{e}_{i} $, which makes things easier.

If you want to be even more efficient, you may use General Form Consensus Optimization from Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers (see page 53).


In this case, the fact that each $f_i$ is a scalar function can be exploited to make things more efficient.

If you share the specific functions ($f$, $g$), I will try to implement it.
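In the meantime, here is a minimal consensus-style ADMM sketch with hypothetical quadratic stand-ins (my own choices, since $f_i$ and $g$ are not given). The split is $x = z$: the equality constraint stays with the $f_i$ block as an indicator $I_{\{Ax=b\}}$, and $g$ becomes a prox step on $z$:

```python
import numpy as np

# Hypothetical stand-ins:  f_i(x_i) = (x_i - c_i)^2,  g(x) = (lam/2) ||x||^2
# Splitting: F(x) = (1/N) sum f_i(x_i) + I{Ax=b}(x),  G(z) = g(z),  x = z
rng = np.random.default_rng(0)
N, M, lam, rho = 30, 8, 0.5, 1.0
A = rng.standard_normal((M, N))
b = rng.standard_normal(M)
c = rng.standard_normal(N)

x = np.zeros(N); z = np.zeros(N); u = np.zeros(N)
I = np.eye(N)
for _ in range(300):
    # x-update: min (1/N)||x-c||^2 + (rho/2)||x - z + u||^2  s.t.  A x = b
    # This is an equality-constrained quadratic -> one KKT solve.
    # (In practice, factor the KKT matrix once outside the loop.)
    H = (2.0 / N + rho) * I
    KKT = np.block([[H, A.T], [A, np.zeros((M, M))]])
    rhs = np.concatenate([(2.0 / N) * c + rho * (z - u), b])
    x = np.linalg.solve(KKT, rhs)[:N]
    # z-update: prox of g, which is closed-form for the quadratic g
    z = rho * (x + u) / (lam + rho)
    # dual update
    u = u + x - z

# Reference: solve the full KKT system of the original problem directly
KKT0 = np.block([[(2.0 / N + lam) * I, A.T], [A, np.zeros((M, M))]])
x_ref = np.linalg.solve(KKT0, np.concatenate([(2.0 / N) * c, b]))[:N]
print(np.linalg.norm(x - x_ref))   # small after a few hundred iterations
```

With the actual separable $f_i$ the x-update decouples coordinate-wise once the coupling through $Ax = b$ is handled (e.g. via the indicator split above or a further consensus split), which is what the general-form consensus variant exploits.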