I'm trying to solve:
\begin{align} \min_{\alpha}\,\,&\frac{1}{2} \alpha ^TQ \alpha + q^T \alpha \\ \mbox{s.t.}\,\,& A \alpha =0 \\ &0 \leq \alpha \leq u\\& \end{align}
where Q is symmetric and positive semidefinite, i.e., $Q \succeq 0$.
I am interested in solving its Lagrangian dual with an unconstrained optimizer, i.e., SGD; so I compute it as follow:
\begin{equation} \begin{aligned} \max_{\mu,\lambda_+,\lambda_-} \min_{\alpha} \mathcal{L}(\alpha,\mu,\lambda_+,\lambda_-) &= \frac{1}{2} \alpha^T Q\alpha+q^T\alpha - \mu^T (A^T \alpha) - \lambda_+^T (u - \alpha) - \lambda_-^T \alpha \\ &= \frac{1}{2} \alpha^T Q\alpha + (q - \mu A + \lambda_+ - \lambda_-)^T \alpha - \lambda_+^T u \\\mbox{s.t.}\,\,&\,\, \lambda_+, \lambda_- \geq 0 \end{aligned} \end{equation}
Taking the derivative of the Lagrangian $\mathcal{L}$ wrt $\alpha$ and settings it to 0 gives:
\begin{equation} \label{eq:svc_lagrangian_der_a} \frac{\partial \mathcal{L}}{\partial \alpha}=0\Rightarrow Q \alpha + (q - \mu A + \lambda_+ - \lambda_-) = 0 \end{equation}
With $\alpha$ optimal solution of the linear system:
\begin{equation} \label{eq:svc_lagrangian_sol} Q \alpha = - (q - \mu A + \lambda_+ - \lambda_-) \end{equation}
the gradient wrt $\mu$, $\lambda_+$ and $\lambda_-$ are:
\begin{equation} \label{eq:svc_lagrangian_der_mu} \frac{\partial \mathcal{L}}{\partial \mu}=-A \alpha \end{equation}
\begin{equation} \label{eq:svc_lagrangian_der_lp} \frac{\partial \mathcal{L}}{\partial \lambda_+}=\alpha - u \end{equation}
\begin{equation} \label{eq:svc_lagrangian_der_lm} \frac{\partial \mathcal{L}}{\partial \lambda_-}=-\alpha \end{equation}
So, I write the following code:
class LagrangianEqualityBoxConstrainedQuadratic():
"""
Construct the lagrangian dual relaxation of an equality constrained
quadratic function with lower and upper bounds defined as:
1/2 x^T Q x + q^T x : A x = 0, lb = 0 <= x <= ub
"""
def __init__(self, primal, A, ub):
self.A = np.asarray(A, dtype=float)
A = np.atleast_2d(A)
# self.Z = null_space(A) # more complex, uses SVD
Q, R = np.linalg.qr(A.T, mode='complete')
# null space aka kernel - range aka image
self.Z = Q[:, A.shape[0]:] # orthonormal basis for the null space of A, i.e., ker(A) = im(Q)
assert np.allclose(self.A.dot(self.Z), 0)
self.proj_Q = self.Z.T.dot(self.Q).dot(self.Z) # project the null space on Q
if any(u < 0 for u in ub):
raise ValueError('the upper bound must be > 0')
self.ub = np.asarray(ub, dtype=float)
self.lb = np.zeros_like(ub)
def _solve_sym_nonposdef(self, Q, q):
# since Q is not strictly psd, i.e., the function is linear along the eigenvectors
# correspondent to the null eigenvalues, the system has infinite solutions so the
# Lagrangian is nondifferentiable, and, for each solution x, the Lagrangian will
# have a different subgradient; so we will choose the one that minimizes the
# 2-norm since it is good almost like the gradient
# `min ||Qx - q||` is formally equivalent to solve the linear system:
# (Q^T Q) x = (Q^T q)^T x
Q, q = np.inner(Q, Q), Q.T.dot(q)
x = scipy.sparse.linalg.minres(Q, q)[0]
return x
def function(self, lmbda):
"""
Compute the value of the Lagrangian relaxation defined as:
L(x, mu, lambda_+, lambda_-) = 1/2 x^T Q x + q^T x - mu^T A x - lambda_+^T (ub - x) - lambda_-^T x
L(x, mu, lambda_+, lambda_-) = 1/2 x^T Q x + (q - mu A + lambda_+ - lambda_-)^T x - lambda_+^T ub
where mu are the first n components of lambda which controls the equality constraints and
lambda_+^T are the second n components of lambda and lambda_-^T are the last n components
which controls the inequality constraint which are constrained to be >= 0.
Taking the derivative of the Lagrangian wrt x and settings it to 0 gives:
Q x + (q - mu A + lambda_+ - lambda_-) = 0
so, the optimal solution of the Lagrangian relaxation is the solution of the linear system:
Q x = - (q - mu A + lambda_+ - lambda_-)
:param lmbda: the dual variable wrt evaluate the function
:return: the function value wrt lambda
"""
mu, lmbda_p, lmbda_n = np.split(lmbda, 3)
ql = self.q - mu.dot(self.A) + lmbda_p - lmbda_n
proj_ql = self.Z.T.dot(ql) # project the null space on ql
x = self._solve_sym_nonposdef(self.proj_Q, -proj_ql)
x = self.Z.dot(x) # recover the primal solution
return 0.5 * x.dot(self.Q).dot(x) + ql.dot(x) - lmbda_p.dot(self.ub)
def jacobian(self, lmbda):
"""
With x optimal solution of the minimization problem, the jacobian
of the Lagrangian dual relaxation at lambda is:
[-A x, x - ub, -x]
However, we rather want to maximize the Lagrangian dual relaxation,
hence we have to change the sign of gradient entries:
[A x, ub - x, x]
:param lmbda: the dual variable wrt evaluate the gradient
:return: the gradient wrt lambda
"""
mu, lmbda_p, lmbda_n = np.split(lmbda, 3)
ql = self.q - mu.dot(self.A) + lmbda_p - lmbda_n
proj_ql = self.Z.T.dot(ql) # project the null space on ql
x = self._solve_sym_nonposdef(self.proj_Q, -proj_ql)
x = self.Z.dot(x) # recover the primal solution
return np.hstack((self.A * x, self.ub - x, x))
But it doesn't find the optimal solution. E.g., in two dimensions I have the following:
Q = [[60.84360046, 16.65254251],
[16.65254251, 4.91124084]]
q = [-741.4880492 , -207.68364072]
A = [2, 7]