Lagrangian Constrained Optimization


I'm trying to solve:

\begin{align} \min_{\alpha}\,\,&\frac{1}{2} \alpha^T Q \alpha + q^T \alpha \\ \mbox{s.t.}\,\,& A \alpha = 0 \\ & 0 \leq \alpha \leq u \end{align}

where Q is symmetric and positive semidefinite, i.e., $Q \succeq 0$.

I am interested in solving its Lagrangian dual with an unconstrained optimizer, e.g., SGD, so I derive it as follows:

\begin{equation} \begin{aligned} \max_{\mu,\lambda_+,\lambda_-} \min_{\alpha} \mathcal{L}(\alpha,\mu,\lambda_+,\lambda_-) &= \frac{1}{2} \alpha^T Q\alpha+q^T\alpha - \mu^T (A \alpha) - \lambda_+^T (u - \alpha) - \lambda_-^T \alpha \\ &= \frac{1}{2} \alpha^T Q\alpha + (q - A^T \mu + \lambda_+ - \lambda_-)^T \alpha - \lambda_+^T u \\ \mbox{s.t.}\,\,&\,\, \lambda_+, \lambda_- \geq 0 \end{aligned} \end{equation}

Taking the derivative of the Lagrangian $\mathcal{L}$ wrt $\alpha$ and setting it to 0 gives:

\begin{equation} \label{eq:svc_lagrangian_der_a} \frac{\partial \mathcal{L}}{\partial \alpha}=0\Rightarrow Q \alpha + (q - A^T \mu + \lambda_+ - \lambda_-) = 0 \end{equation}

so that the optimal $\alpha$ solves the linear system:

\begin{equation} \label{eq:svc_lagrangian_sol} Q \alpha = - (q - A^T \mu + \lambda_+ - \lambda_-) \end{equation}

At this optimal $\alpha$, the gradients of the dual function wrt $\mu$, $\lambda_+$ and $\lambda_-$ are:

\begin{equation} \label{eq:svc_lagrangian_der_mu} \frac{\partial \mathcal{L}}{\partial \mu}=-A \alpha \end{equation}

\begin{equation} \label{eq:svc_lagrangian_der_lp} \frac{\partial \mathcal{L}}{\partial \lambda_+}=\alpha - u \end{equation}

\begin{equation} \label{eq:svc_lagrangian_der_lm} \frac{\partial \mathcal{L}}{\partial \lambda_-}=-\alpha \end{equation}
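These gradient formulas can be sanity-checked numerically by finite differences, here using the $Q$, $q$, $A$ from the example at the end of the question, some arbitrary nonnegative multipliers, and an illustrative bound $u$ that I made up (it is not given in the question); since this particular $Q$ happens to be invertible, a plain solve suffices for the inner minimization:

```python
import numpy as np

# data from the example at the end of the question
Q = np.array([[60.84360046, 16.65254251],
              [16.65254251,  4.91124084]])
q = np.array([-741.4880492, -207.68364072])
A = np.array([[2.0, 7.0]])
u = np.array([10.0, 10.0])  # illustrative bound, not given in the question

def dual(mu, lp, lm):
    # inner minimization: solve Q a = -(q - A^T mu + lp - lm);
    # this Q happens to be invertible, so a plain solve suffices
    ql = q - A.T.dot(mu) + lp - lm
    a = np.linalg.solve(Q, -ql)
    return 0.5 * a.dot(Q).dot(a) + ql.dot(a) - lp.dot(u), a

# arbitrary multipliers with lp, lm >= 0
mu, lp, lm = np.array([0.3]), np.array([0.2, 0.1]), np.array([0.05, 0.4])
d0, a = dual(mu, lp, lm)

# analytic gradients of the dual at the inner minimizer
g_mu, g_lp, g_lm = -A.dot(a), a - u, -a

# finite-difference check on the mu component
eps = 1e-6
d_eps, _ = dual(mu + eps, lp, lm)
print((d_eps - d0) / eps, g_mu[0])  # the two should agree closely
```

By the envelope theorem the inner minimizer can be treated as constant when differentiating the dual, which is why the analytic and finite-difference values match.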

So, I write the following code:

import numpy as np
import scipy.sparse.linalg


class LagrangianEqualityBoxConstrainedQuadratic:
    """
    Construct the Lagrangian dual relaxation of an equality-constrained
    quadratic function with lower and upper bounds defined as:

            1/2 x^T Q x + q^T x : A x = 0, lb = 0 <= x <= ub
    """

    def __init__(self, primal, A, ub):
        self.primal = primal
        self.Q, self.q = primal.Q, primal.q  # assumes primal exposes Q and q
        self.A = np.atleast_2d(np.asarray(A, dtype=float))
        # self.Z = null_space(self.A)  # more robust, but uses the SVD
        Q1, _ = np.linalg.qr(self.A.T, mode='complete')
        # null space aka kernel: the last n - m columns of the orthogonal
        # factor form an orthonormal basis of ker(A)
        self.Z = Q1[:, self.A.shape[0]:]
        assert np.allclose(self.A.dot(self.Z), 0)
        self.proj_Q = self.Z.T.dot(self.Q).dot(self.Z)  # Q restricted to ker(A)
        self.ub = np.asarray(ub, dtype=float)
        if (self.ub < 0).any():
            raise ValueError('the upper bound must be >= 0')
        self.lb = np.zeros_like(self.ub)

    def _solve_sym_nonposdef(self, Q, q):
        # Since Q is not strictly positive definite, i.e., the function is
        # linear along the eigenvectors corresponding to the null eigenvalues,
        # the system may have infinitely many solutions, so the Lagrangian dual
        # is nondifferentiable there and each solution x yields a different
        # subgradient; we choose the minimum 2-norm solution, which behaves
        # almost as well as a gradient.

        # min ||Q x - q|| is formally equivalent to solving the normal equations:
        #                       (Q^T Q) x = Q^T q

        Q, q = Q.T.dot(Q), Q.T.dot(q)

        # MINRES started from x0 = 0 keeps the iterates in range(Q^T), so it
        # converges to the minimum-norm least-squares solution
        x = scipy.sparse.linalg.minres(Q, q)[0]

        return x
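As an aside, one can check on a small, deliberately rank-deficient example (made up here for illustration) that MINRES applied to the normal equations recovers the same minimum-norm solution as `np.linalg.lstsq`:

```python
import numpy as np
from scipy.sparse.linalg import minres

# a symmetric PSD matrix with a null direction (rank 1)
v = np.array([3.0, 1.0])
Q = np.outer(v, v)            # eigenvalue 0 along (1, -3)
q = 2.0 * v                   # consistent right-hand side

# normal equations (Q^T Q) x = Q^T q solved with MINRES; starting from
# x0 = 0 the iterates stay in range(Q^T), so the limit is the minimum
# 2-norm least-squares solution
x = minres(Q.T @ Q, Q.T @ q)[0]

x_ref = np.linalg.lstsq(Q, q, rcond=None)[0]  # min-norm reference
print(x, x_ref)
```

Both should give the same vector, the projection of any solution onto the row space of `Q`.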

    def function(self, lmbda):
        """
        Compute the value of the Lagrangian dual relaxation defined as:

        L(x, mu, lambda_+, lambda_-) = 1/2 x^T Q x + q^T x - mu^T A x - lambda_+^T (ub - x) - lambda_-^T x
                                     = 1/2 x^T Q x + (q - A^T mu + lambda_+ - lambda_-)^T x - lambda_+^T ub

        where mu are the first m components of lambda (one per equality
        constraint), and lambda_+ and lambda_- are the next n components each,
        controlling the box constraints; lambda_+ and lambda_- are constrained
        to be >= 0.

        Taking the derivative of the Lagrangian wrt x and setting it to 0 gives:

                Q x + (q - A^T mu + lambda_+ - lambda_-) = 0

        so the inner minimizer is the solution of the linear system:

                Q x = - (q - A^T mu + lambda_+ - lambda_-)

        The value is negated so that a generic minimizer maximizes the dual,
        consistently with the sign flip in jacobian.

        :param lmbda: the dual variable wrt which to evaluate the function
        :return: the (negated) function value at lmbda
        """
        m, n = self.A.shape[0], self.ub.size
        mu, lmbda_p, lmbda_n = np.split(lmbda, [m, m + n])
        ql = self.q - mu.dot(self.A) + lmbda_p - lmbda_n
        proj_ql = self.Z.T.dot(ql)  # restrict ql to the null space of A
        x = self._solve_sym_nonposdef(self.proj_Q, -proj_ql)
        x = self.Z.dot(x)  # recover the primal solution
        # negated dual value, to be fed to a minimizer
        return -(0.5 * x.dot(self.Q).dot(x) + ql.dot(x) - lmbda_p.dot(self.ub))

    def jacobian(self, lmbda):
        """
        With x the optimal solution of the inner minimization problem, the
        jacobian of the Lagrangian dual relaxation at lambda is:

                                [-A x, x - ub, -x]

        However, since we want to maximize the dual with a minimizer, we
        change the sign of the gradient entries:

                                 [A x, ub - x, x]

        :param lmbda: the dual variable wrt which to evaluate the gradient
        :return: the gradient at lmbda
        """
        m, n = self.A.shape[0], self.ub.size
        mu, lmbda_p, lmbda_n = np.split(lmbda, [m, m + n])
        ql = self.q - mu.dot(self.A) + lmbda_p - lmbda_n
        proj_ql = self.Z.T.dot(ql)  # restrict ql to the null space of A
        x = self._solve_sym_nonposdef(self.proj_Q, -proj_ql)
        x = self.Z.dot(x)  # recover the primal solution
        return np.hstack((self.A.dot(x), self.ub - x, x))
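The QR-based null-space construction used in `__init__` can be exercised in isolation on the question's $A$:

```python
import numpy as np

A = np.atleast_2d([2.0, 7.0])      # the equality constraint from the example
Qf, _ = np.linalg.qr(A.T, mode='complete')
Z = Qf[:, A.shape[0]:]             # last n - m columns: orthonormal basis of ker(A)
print(A @ Z)                       # ~ [[0.]]: every column z of Z satisfies A z = 0
```

The columns of `Z` are orthonormal, so `proj_Q = Z.T Q Z` is exactly `Q` restricted to the feasible subspace of the equality constraint.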

But it doesn't find the optimal solution. E.g., in two dimensions I have the following:

Q = [[60.84360046, 16.65254251],
     [16.65254251,  4.91124084]]
q = [-741.4880492 , -207.68364072]
A = [2, 7]

[figure: plot of the example]
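For reference, the primal of this example can be solved directly with `scipy.optimize.minimize` (here assuming an arbitrary bound `ub = 100` per component, since the question does not state one). Note that with this particular $A$ and the lower bound $0$, the only feasible point is $\alpha = 0$: $2\alpha_1 + 7\alpha_2 = 0$ with $\alpha \ge 0$ forces both components to vanish, so any correct dual method must end up there too:

```python
import numpy as np
from scipy.optimize import minimize

Q = np.array([[60.84360046, 16.65254251],
              [16.65254251,  4.91124084]])
q = np.array([-741.4880492, -207.68364072])
A = np.array([[2.0, 7.0]])

# reference primal solve: 1/2 x^T Q x + q^T x s.t. A x = 0, 0 <= x <= 100
res = minimize(lambda x: 0.5 * x @ Q @ x + q @ x,
               x0=np.zeros(2),
               jac=lambda x: Q @ x + q,
               method='SLSQP',
               constraints={'type': 'eq', 'fun': lambda x: A @ x},
               bounds=[(0.0, 100.0), (0.0, 100.0)])  # assumed ub = 100
print(res.x)
```

Comparing the dual method's recovered `x` against this reference makes it easy to tell whether the discrepancy lies in the dual derivation or in the optimizer.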