Implement QCQP in CVXOPT


I'm struggling to formulate a simple QCQP in the correct format to solve with CVXOPT.

I'm trying to implement max-margin Inverse Reinforcement Learning from the paper Apprenticeship Learning via Inverse Reinforcement Learning (§3, p3), which is apparently the same as solving an SVM problem.

The optimisation problem in question is

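$$
\begin{aligned}
\max_{t,\,w} \quad & t && (10) \\
\text{s.t.} \quad & w^T \mu_E \geq w^T \mu^{(j)} + t, \quad j = 0, \ldots, i-1 && (11) \\
& \|w\|_2 \leq 1 && (12)
\end{aligned}
$$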

Here $t$ is a scalar and $\mu_E$, $\mu^{(j)}$ and $w$ are vectors of length $k$. My understanding, from reading e.g. this answer and this question, is that the Euclidean norm inequality $\|w\|_2 \leq 1$ can be replaced with $w^Tw \leq 1$, making this a Quadratically Constrained Quadratic Program (QCQP). What I don't know is how to express this in a form that CVXOPT accepts. Should I be using its second-order cone programming (SOCP) solver instead?

As an aside, I am able to formulate inequality (11), above, using the code below; it is only the Euclidean norm constraint that has me stumped.

import numpy as np


def add_optimal_expert_constraints(G, h):
    """Adds QP constraints to ensure the expert is optimal

    Assumes the 't' error term is first in the objective function,
    followed by the weight vector terms

    Args:
        G (numpy array): QP Vectorial inequality LHS constraint matrix
        h (numpy array): QP Vectorial inequality RHS vector

    Returns:
        (numpy arrays): G and h, updated with new constraints
    """

    # Loop over our current set of less-than-expert policies
    for j in range(len(nonexpert_feature_expectations)):

        # For each policy, add one constraint that ensures the expert's
        # reward is greater than this policy's reward by at least a margin
        # of 't'
        # Row [1, mu_j - mu_E] encodes t + w^T (mu_j - mu_E) <= 0
        row = np.hstack(
            (1, nonexpert_feature_expectations[j] - expert_feature_expectations)
        )
        G = np.vstack((G, row))
        h = np.vstack((h, 0))

    return G, h
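
For the norm constraint, here is a minimal sketch of how the whole problem might be laid out for `cvxopt.solvers.socp`, assuming the same variable ordering as above, $x = (t, w)$. The helper names `build_socp_data` and `solve_max_margin` are hypothetical, and it assumes at least one non-expert policy is supplied so that $t$ is bounded. CVXOPT writes a second-order cone constraint as $s = h_q - G_q x$ with $s_0 \geq \|s_{1:}\|_2$, so choosing $s = (1, w)$ should give $\|w\|_2 \leq 1$ directly, without squaring the norm.

```python
import numpy as np


def build_socp_data(mu_E, mu_list):
    """Build SOCP data for x = (t, w): maximise t subject to
    t + w^T (mu_j - mu_E) <= 0 for every j, and ||w||_2 <= 1."""
    mu_E = np.asarray(mu_E, dtype=float)
    mu = np.atleast_2d(np.asarray(mu_list, dtype=float))
    k = mu_E.size

    # CVXOPT minimises, so maximising t means minimising -t
    c = np.zeros(k + 1)
    c[0] = -1.0

    # Linear constraints Gl x <= hl: one row [1, mu_j - mu_E] per policy
    Gl = np.hstack((np.ones((mu.shape[0], 1)), mu - mu_E))
    hl = np.zeros(mu.shape[0])

    # Cone constraint: s = hq - Gq x = (1, w) must satisfy s[0] >= ||s[1:]||_2
    Gq = np.zeros((k + 1, k + 1))
    Gq[1:, 1:] = -np.eye(k)
    hq = np.zeros(k + 1)
    hq[0] = 1.0

    return c, Gl, hl, Gq, hq


def solve_max_margin(mu_E, mu_list):
    """Solve with cvxopt.solvers.socp and return (t, w)."""
    # Imported lazily so the data builder works without CVXOPT installed
    from cvxopt import matrix, solvers

    c, Gl, hl, Gq, hq = build_socp_data(mu_E, mu_list)
    sol = solvers.socp(
        matrix(c), Gl=matrix(Gl), hl=matrix(hl), Gq=[matrix(Gq)], hq=[matrix(hq)]
    )
    x = np.array(sol["x"]).ravel()
    return x[0], x[1:]
```

If this is right, calling `solve_max_margin(expert_feature_expectations, nonexpert_feature_expectations)` would return the margin `t` and the weight vector `w`. It would be worth checking `sol["status"]` before trusting the result.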

Thank you for your help!