Generalized Projection of a Matrix onto the Non Negative Orthant

758 Views Asked by At

I'd like to solve the following problem

$$\min_{Y \geq 0} \quad \operatorname{Tr} \left( (Y-Z)H(Y-Z)^T \right)$$

where $Z$ and $H$ are given matrices. $Y$ and $Z$ are $n \times r$ matrices, where $n$ may be large but $r$ is very small, and $H$ is $r \times r$ positive definite, but not sparse or structured. Assume I can do all the works on $H$ (Cholesky, eigendecomposition, etc)

This problem could be interpreted as a generalized projection, where the distance is somewhat warped by $H$ (but here $H$ is not the Hessian of some function, since it is tiny).

Is there any way of directly solving this problem (no iterative algorithms)? If not, is there a super fast iterative method I could use? I suppose the eigenvectors of $H$ should be helpful somehow.

An iterative method (also proposed below) would be to do a projected gradient algorithm, since the gradient is not that expensive and the projection is super cheap. But, because this is one subproblem in another iterative algorithm, I'd like to avoid anything with, say, more than $r$ steps. (Conj. gradient for example takes at most $r$ steps to solve an $r\times r$ linear system.)

Thanks for any tips!

2

There are 2 best solutions below

3
On

We have the following quadratic program in $\mathrm X \in \mathbb R^{n \times r}$

$$\min_{\mathrm X \geq \mathrm O} \quad \mbox{tr} \left( (\mathrm X - \mathrm A) \, \mathrm Q \, (\mathrm X - \mathrm A)^\top \right)$$

where $\mathrm Q \in \mathbb R^{r \times r}$ is positive definite and, thus, has a Cholesky decomposition of the form $\rm Q = R^\top R$.

$$\mbox{tr} \left( (\mathrm X - \mathrm A) \, \mathrm Q \, (\mathrm X - \mathrm A)^\top \right) = \mbox{tr} \left( (\mathrm X - \mathrm A) \,\mathrm R^\top \mathrm R \, (\mathrm X - \mathrm A)^\top \right) = \| (\mathrm X - \mathrm A) \,\mathrm R^\top \|_{\text{F}}^2$$

Vectorizing, we obtain

$$\| (\mathrm X - \mathrm A) \,\mathrm R^\top \|_{\text{F}}^2 = \cdots = \| \left(\mathrm R \otimes \mathrm I_n\right) \mbox{vec} (\mathrm X - \mathrm A) \|_2^2$$

which is a very standard convex quadratic function. Can you take it from here?

4
On

As written in comments and done by Rodrigo, using vectorization one could reform the problem into regular Constrained Least Squares Problem with inequality constraints.
Since the equation includes a Matrix a closed form solution isn't feasible directly.
One method to do so is utilizing the Projected Gradient Method.

Yet, since the the Gradient of the original problem is simple it would be better to use it directly:

$$ \frac{d}{ d Y } \operatorname{Tr} \left( \left( Y - Z \right) H \left( Y - Z \right)^{T} \right) = \left( Y - Z \right) \left( {H}^{T} + {H} \right) $$

The projection is simply clipping Negative Values.

A MATLAB Code for that is given by:

mY = zeros([numRows, numCols]);
vObjValPgd(1) = hObjFun(mY);

for ii = 2:numIterations
    % vG = ((mY - mZ) * mH.') + ((mY - mZ) * mH);
    vG = (mY - mZ) * (mH.' + mH);
    mY = mY - (stepSize * vG);

    mY = max(mY, 0); %<! Projection onto Non Negative Orthant

    vObjValPgd(ii) = hObjFun(mY); 
end

The result is verified against CVX:

enter image description here

Convergence is very fast (And can be accelerated using Accelerated Gradient Descent Methods) and operations are efficient.

The code is available (Including validation by CVX) at my StackExchange Mathematics Q2694373 GitHub Repository.