How to compute the primal variable from the dual variables and their multipliers


I edited this question based on information from the comments. Assume we have an optimization problem (the primal problem) and we solve its dual using some kind of primal-dual interior-point solver, so we have the multipliers of the dual's constraints and any other relevant information. Knowing that the dual of the dual is equal (or closely related) to the primal, how can we use this information to recover the primal variable as efficiently as possible? Is there a method that uses this information to find the primal variable more cheaply than solving the primal directly? To be more concrete, consider the following optimization problem \begin{align} \min_{\alpha} &f(\alpha)\\ \textrm{s.t.} &-p \leq \alpha \leq r\notag\\ \end{align} where $f(\alpha)$ is a convex function. Its Lagrangian is \begin{align} &\mathcal{L}=f(\alpha)-\eta^\mathsf{T} (\alpha+p)+\beta^\mathsf{T} (\alpha-r)\notag\\ \end{align} The dual of this problem has the form \begin{align} \min_{\eta,\beta} &g(\eta,\beta) \\ \textrm{s.t.} & \eta \geq 0 \\ & \beta \geq 0 \end{align} Now assume we have a solver for this problem. If the solver outputs $\eta^*$, $\beta^*$ and also the multipliers of the constraints $\eta \geq 0$ and $\beta \geq 0$, say $\nu$ and $\mu$ (at optimality), how can we efficiently compute $\alpha^*$ from those values when strong duality holds?

To make this clearer, consider a simple convex quadratic optimization problem \begin{align} \min_{\alpha} &c \alpha^\mathsf{T} A \alpha-d^\mathsf{T} \alpha\\ \textrm{s.t.} &-p \leq \alpha \leq r\notag\\ \end{align} where $A$ is a positive semi-definite matrix. Its Lagrangian is \begin{align} &\mathcal{L}=c \alpha^\mathsf{T} A \alpha-d^\mathsf{T} \alpha -\eta^\mathsf{T} (\alpha+p)+\beta^\mathsf{T} (\alpha-r)\notag\\ \end{align} Now assume we have the solution of the dual optimization problem, which is an optimization problem in the variables $\eta \geq 0$ and $\beta \geq 0$, i.e. \begin{align} \min_{\beta,\eta} &\frac{1}{2c} (d+\eta-\beta)^\mathsf{T}A^{\dagger}(d+\eta-\beta)+\eta^\mathsf{T} p +\beta^\mathsf{T} r\notag\\ \textrm{s.t.} & \beta \geq 0,\\&\eta \geq 0\notag\\ \end{align} If we have $\eta$, $\beta$ and the Lagrange multipliers for the constraints $\eta \geq 0$ and $\beta \geq 0$, say $\nu$ and $\mu$ respectively, how can we compute the primal variable $\alpha$ from $\eta,\beta,\nu,\mu$ when strong duality holds? Note: for certain reasons, I don't want to use the pseudoinverse of $A$ here.
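As a sanity check, here is a minimal numerical sketch of the easy case. All values of `c`, `A`, `d`, `p`, `r` are made up, `A` is taken to be full rank, and the box is wide enough that the optimum is interior, so $\eta^* = \beta^* = 0$ and stationarity of the Lagrangian reduces to $2cA\alpha = d$:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical instance of the box-constrained QP (all values made up).
c = 1.0
A = np.array([[2.0, 0.5], [0.5, 1.0]])  # symmetric positive definite
d = np.array([1.0, -0.5])
p = np.full(2, 10.0)
r = np.full(2, 10.0)

# Solve the primal directly for reference.
res = minimize(lambda a: c * a @ A @ a - d @ a,
               x0=np.zeros(2),
               bounds=list(zip(-p, r)))

# Interior optimum: eta = beta = 0, so stationarity gives 2c A alpha = d.
alpha_from_dual = np.linalg.solve(2 * c * A, d)
```

Both routes agree here, but only because the box constraints are inactive; the interesting case is when some multipliers are nonzero or $A$ is singular.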

Comment: According to the answer by @Michael below, the dual above is not completely correct; see his answer.


BEST ANSWER

I assume $A$ is symmetric.

Your dual manipulation is incorrect when $A$ does not have full rank. To see why, let's try to compute the dual function (using your notation):

\begin{align} g(\eta, \beta) &= \sup_{\alpha \in \mathbb{R}^k} [-c\alpha^TA\alpha + d^T\alpha+\eta^T(\alpha + p) - \beta^T(\alpha -r)] \\ &= \sup_{\alpha \in \mathbb{R}^k} [ -c\alpha^TA\alpha + (d+\eta-\beta)^T\alpha ] + \eta^Tp + \beta^Tr \end{align}

Suppose $A$ does not have full rank. Take any nonzero vector $z \in Null(A)$. It follows that: $$ -cz^TAz +(d+\eta-\beta)^Tz = (d+\eta-\beta)^Tz$$ Then if $(d+\eta-\beta)$ is not orthogonal to $z$, the vector $z$ can be scaled by either $M$ or $-M$ (for arbitrarily large real numbers $M$) to ensure the above expression $\rightarrow\infty$. Thus: $$ g(\eta, \beta) = \infty \:\: \mbox{if $(d+\eta-\beta) \notin Null(A)^{\perp}$} $$ But $Null(A)^{\perp} = Col(A^T) = Col(A)$ (by symmetry of $A$). Thus, the $g(\eta, \beta)$ function is only finite when $(d+\eta-\beta) \in Col(A)$. Minimizing the $g(\eta, \beta)$ function thus requires the additional constraint $(d+\eta-\beta) \in Col(A)$, which is not considered in your dual problem. Your dual problem only considers the constraints $\eta \geq 0$, $\beta \geq 0$, which is not enough.
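This unboundedness is easy to see numerically. In the sketch below (a made-up rank-deficient example), $v$ stands for $d+\eta-\beta$ and has a component along a nullspace direction $z$ of $A$; the quadratic term vanishes along $z$, so the inner objective grows without bound:

```python
import numpy as np

c = 1.0
A = np.array([[1.0, 0.0], [0.0, 0.0]])  # symmetric, rank 1
v = np.array([0.0, 1.0])                # v = d + eta - beta, not in Col(A)
z = np.array([0.0, 1.0])                # z in Null(A), with v @ z = 1 != 0

def inner(a):
    # objective of the inner sup defining g(eta, beta)
    return -c * a @ A @ a + v @ a

# Along M * z the quadratic term is zero and the objective grows linearly
# in M, so the supremum (hence g) is +infinity.
vals = [inner(M * z) for M in (1.0, 10.0, 100.0)]
```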


The column space constraint can be enforced by adding two constraints to the dual formulation, as follows: \begin{align} &d + \eta - \beta = Aw \\ & w \in \mathbb{R}^k \end{align}
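One simple way to test this membership condition numerically (using a made-up rank-1 matrix): $v \in Col(A)$ exactly when the least-squares problem $Aw = v$ has zero residual.

```python
import numpy as np

A = np.array([[1.0, 0.0], [0.0, 0.0]])  # symmetric, rank 1

def in_col_space(A, v, tol=1e-9):
    # Solve A w = v in the least-squares sense and check the residual.
    w, *_ = np.linalg.lstsq(A, v, rcond=None)
    return np.linalg.norm(A @ w - v) < tol

ok = in_col_space(A, np.array([3.0, 0.0]))   # lies in Col(A)
bad = in_col_space(A, np.array([0.0, 1.0]))  # does not lie in Col(A)
```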


Now suppose you obtain an optimal solution $\eta^*$ and $\beta^*$ for the dual. Then an optimal primal solution $\alpha^*$ must solve the inner maximization $\sup_{\alpha\in\mathbb{R}^k}[-c\alpha^TA\alpha + (d+\eta^*-\beta^*)^T\alpha]$. Setting the gradient to zero (and using the symmetry of $A$), it satisfies: $$ A\alpha = \frac{1}{2c}(d+\eta^* - \beta^*)$$ This has a solution because we already know $d+\eta^*-\beta^*\in Col(A)$ (this was enforced by the dual problem). Unfortunately, there are infinitely many solutions to this equation. A pseudo-inverse can give you a particular solution, but that particular solution is not guaranteed to satisfy the box constraints of the primal. For any particular solution $\alpha^p$, the set of all solutions is $\alpha^p + Null(A)$. So, armed only with this information, you can only say that an optimal primal $\alpha^*$ satisfies: $$ \alpha^* \in \alpha^p + Null(A) $$
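A small sketch of this solution set, with made-up values: here `v` plays the role of $d+\eta^*-\beta^*$ and is chosen to lie in $Col(A)$, and stationarity of the inner maximization, $\nabla(-c\alpha^TA\alpha + v^T\alpha)=0$, gives $2cA\alpha = v$:

```python
import numpy as np

c = 1.0
A = np.array([[2.0, 0.0], [0.0, 0.0]])  # symmetric, rank 1 (made up)
v = np.array([4.0, 0.0])                # stands for d + eta* - beta*, in Col(A)

# Particular (pseudo-inverse) solution of A alpha = v / (2c).
alpha_p, *_ = np.linalg.lstsq(A, v / (2 * c), rcond=None)

# A basis vector for Null(A) from the SVD: the right singular vector
# belonging to the zero singular value.
_, s, Vt = np.linalg.svd(A)
z = Vt[-1]

# Every alpha_p + t * z also solves the equation, illustrating
# alpha* in alpha_p + Null(A).
alpha_other = alpha_p + 3.0 * z
```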

ANSWER

The process of minimizing $\mathcal{L}$ over $\alpha$ to construct the dual yields a formula for $\alpha$ as a function of $\eta$ and $\beta$. That formula is exactly the connection between the optimal primal and dual variables.
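For the quadratic example this map can be written down explicitly, assuming $A$ is invertible (otherwise it only pins $\alpha$ down up to $Null(A)$, as the other answer explains): minimizing $\mathcal{L}$ over $\alpha$ gives $2cA\alpha = d+\eta-\beta$, i.e. $\alpha(\eta,\beta) = \tfrac{1}{2c}A^{-1}(d+\eta-\beta)$. A sketch with made-up values:

```python
import numpy as np

def alpha_of_dual(A, d, eta, beta, c=1.0):
    # Minimizer of the Lagrangian in alpha: 2c A alpha = d + eta - beta.
    return np.linalg.solve(2 * c * A, d + eta - beta)

# Made-up values just to exercise the map.
A = np.eye(2)
d = np.array([2.0, -4.0])
alpha = alpha_of_dual(A, d, np.zeros(2), np.zeros(2))  # -> [1.0, -2.0]
```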