Are these two optimization problems equal?

118 Views Asked by At

The first optimization model: $$ \begin{array}{cl} \arg \min \limits_{C} & \sum\limits_{i=1}^{3}\gamma_i\|{C_{(i)}}\|_*\\ \mathrm{s.t.} & \|A\mathbf{c}-\mathbf{b}\|_2^2+ \mathbf{c}^TG\mathbf{c}+2\mathbf{g}^T\mathbf{c}\le tol \end{array} $$

The second optimization model: $$ \begin{array}{cl} \arg \min \limits_{C} & \sum\limits_{i=1}^{3}\gamma_i\|{C_{(i)}}\|_*+\lambda(\|A\mathbf{c}-\mathbf{b}\|_2^2+ \mathbf{c}^TG\mathbf{c}+2\mathbf{g}^T\mathbf{c}) \end{array} $$ where $C$ is a three-order tensor, $C_{(i)}$ is a matrix whose column are the mode-i fibers of $C$,(i=1,2,3),$\mathbf{c}$ is vec$(C)$, $G,\mathbf{g}$ are given,$\lambda$ is a parameter.

My question: are these two problems equivalent to each other?

2

There are 2 best solutions below

2
On

You can make the second one almost equivalent to the first one, by modifying it slightly ($\lambda$ becomes a variable instead of a fixed parameter) \begin{array}{cl} \min \limits_{C}\max \limits_{\lambda} & \sum\limits_{i=1}^{3}\gamma_i\|{C_{(i)}}\|_*+\lambda(\|A\mathbf{c}-\mathbf{b}\|_2^2+ \mathbf{c}^TG\mathbf{c}+2\mathbf{g}^T\mathbf{c}-tol). \end{array} Intuitively, the $\max_{\lambda}$ makes sure that $C$ satisfies the constraint $$\|A\mathbf{c}-\mathbf{b}\|_2^2+ \mathbf{c}^TG\mathbf{c}+2\mathbf{g}^T\mathbf{c}\le tol.$$ In the context of constrained optimization, $\lambda$ is called the Lagrange mulitplier and you usually solve the problem by switching the order of $\min$ and $\max$, that is, to solve \begin{array}{cl} \max \limits_{\lambda} \min \limits_{C} & \sum\limits_{i=1}^{3}\gamma_i\|{C_{(i)}}\|_*+\lambda(\|A\mathbf{c}-\mathbf{b}\|_2^2+ \mathbf{c}^TG\mathbf{c}+2\mathbf{g}^T\mathbf{c}-tol), \end{array} related to duality. The reason for "almost equivalent" is that the sets under the constraint could be empty, and obviously $$ \max \limits_{\lambda} \sum\limits_{i=1}^{3}\gamma_i\|{C_{(i)}}\|_*+\lambda(\|A\mathbf{c}-\mathbf{b}\|_2^2+ \mathbf{c}^TG\mathbf{c}+2\mathbf{g}^T\mathbf{c}-tol) = +\infty.$$ For this reason, it is better to use $\sup$ and $\inf$, instead of $\max$ and $\min$.

0
On

They are not equivalent, no. But they represent two different ways to study the same multi-objective problem: \begin{array}{ll} \text{minimize} & \left( \sum_{i=1}^3 \gamma_i \|C_{(i)}\| ,\|A\mathbf{c}-\mathbf{b}\|_2^2+ \mathbf{c}^TG\mathbf{c}+2\mathbf{g}^T\mathbf{c} \right) \end{array} The idea here is that you're trying to minimize both things simultaneously in some vague sense. Of course, that you can't solve anything "in a vague sense"---you have to make your notion more explicit.

Mathematically, you have to somehow reduce the model to a single objective. The simplest approach is to solve a problem with a weighted sum of the objectives; that is, \begin{array}{ll} \text{minimize} & \lambda_1 \sum_{i=1}^3 \gamma_i \|C_{(i)}\| + \lambda_2 \left(\|A\mathbf{c}-\mathbf{b}\|_2^2+ \mathbf{c}^TG\mathbf{c}+2\mathbf{g}^T\mathbf{c} \right) \end{array} for some fixed $\lambda_1,\lambda_2\geq 0$. By varying $\lambda_1,\lambda_2$, you can study the tradeoff between the two objectives. Note however that the value of $C$ obtained for a specific pair $(\lambda_1,\lambda_2)$ doesn't change if you scale both weights by the same positive value $\alpha>0$. All that matters therefore is the ratio $\lambda_2/\lambda_1$; the larger this ratio, the more the second objective is favored over the first. Because the absolute scaling is irrelevant, it is common to set one of the weights to $1$. That's exactly what is happening with your second formulation above.

Now, what about your first formulation? That is just an alternative way to study the multiobjective problem. For a fixed value of $tol$ (assuming it is large enough so that the model is feasible), the value of $C$ that is obtained will be exactly equal to the value of $C$ obtained from the second model for a certain value of $\lambda$.

We generally cannot say which value of $\lambda$ it corresponds to, just that there is one. And we know that driving $\lambda_2/\lambda_1\rightarrow+\infty$ moves us in the same direction along the tradeoff curve as driving $tol\rightarrow 0$. Conversely, driving $tol\rightarrow+\infty$ moves us in the same direction along the tradeoff curve as driving $\lambda_2/\lambda_1\rightarrow 0$.

So these models are not really "equivalent", but they do help us study the same underlying multiobjective problem. As for which one is easiest to solve, that depends on the specific software being used.