Nonlinear optimization: optimizing a matrix so that its square is close to a given matrix


I'm trying to solve a minimization problem: find a matrix whose square is close to another given matrix. But I can't find an effective tool to solve it.

Here is my problem:

Assume we have an unknown matrix $Q$ with parameters $q_{11}, q_{12}, q_{14}, q_{21}, q_{22}, q_{23}, q_{32}, q_{33}, q_{34}, q_{41}, q_{43}, q_{44}$, and a given matrix $G$:

$$Q=\begin{pmatrix} q_{11}&q_{12}&0&q_{14}\\ q_{21}&q_{22}&q_{23}&0\\ 0&q_{32}&q_{33}&q_{34}\\ q_{41}&0&q_{43}&q_{44} \end{pmatrix}, \qquad G=\begin{pmatrix} 0.48&0.24&0.16&0.12\\ 0.48&0.24&0.16&0.12\\ 0.48&0.24&0.16&0.12\\ 0.48&0.24&0.16&0.12 \end{pmatrix}.$$

The problem is how to find values of the twelve parameters $q_{11}, q_{12}, q_{14}, q_{21}, q_{22}, q_{23}, q_{32}, q_{33}, q_{34}, q_{41}, q_{43}, q_{44}$ such that the square of $Q$ is very close to the matrix $G$.

I choose to minimize the Frobenius norm of their difference, that is,

$Q^* = {\arg\min}_{Q} \| Q^2-G\|_F$

s.t. $0\leq q_{ij} \leq 1$ for every free entry, and
$q_{11}+q_{12}+q_{14}=1$, $\quad q_{21}+q_{22}+q_{23}=1$, $\quad q_{32}+q_{33}+q_{34}=1$, $\quad q_{41}+q_{43}+q_{44}=1$.
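
Before turning to the answers, here is a minimal sketch of a direct numerical attack (my addition, not part of the question) using SciPy's SLSQP solver. The variable vector `x` holds the 12 free entries of $Q$ in row-major order; the index list `idx` and the helper `build_Q` are names I introduced for illustration.

```python
import numpy as np
from scipy.optimize import minimize

G = np.tile([0.48, 0.24, 0.16, 0.12], (4, 1))

# positions of the free (structurally nonzero) entries of Q, row-major
idx = [(0, 0), (0, 1), (0, 3),
       (1, 0), (1, 1), (1, 2),
       (2, 1), (2, 2), (2, 3),
       (3, 0), (3, 2), (3, 3)]

def build_Q(x):
    Q = np.zeros((4, 4))
    for val, (i, j) in zip(x, idx):
        Q[i, j] = val
    return Q

def objective(x):
    R = build_Q(x) @ build_Q(x) - G
    return float(np.sum(R * R))        # squared Frobenius norm

# each row of Q must sum to 1; x groups each row's free entries in triples
constraints = [{'type': 'eq', 'fun': lambda x, r=r: x[3*r:3*r+3].sum() - 1.0}
               for r in range(4)]

x0 = np.full(12, 1/3)                  # feasible starting point
res = minimize(objective, x0, method='SLSQP',
               bounds=[(0.0, 1.0)] * 12, constraints=constraints)
Q_opt = build_Q(res.x)
```

Any nonlinear solver that handles bounds and linear equality constraints would do here; SLSQP is just a convenient default, and being a local method it carries no global-optimality guarantee.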

For days I have been frustrated trying to find an effective tool to carry out the above optimization. Can someone help me implement it?

5 Answers

Accepted answer (score 2)

For the modified question, let me try to give an answer that addresses matrices with the structure you have.

Basically, your matrix $G$ has the following structure: $$G=uv^T,$$ where I have taken $u$ as the all-ones vector and $v$ a vector of positive coordinates such that $v^Tu=1$, i.e., $G$ is a stochastic matrix. It is also easy to see that $G$ is idempotent, i.e., $G^2=G$.

You need to find a stochastic matrix $Q$ such that $\|Q^2-G\|_F$ is minimized. Obviously the only exact solution is $Q=G$. But if you do not want to take $Q=G$, then you cannot attain the minimum. Instead you can choose a matrix $\delta$ with the property $\delta u=0$ and form the matrix $Q=G+\delta$; then you need to ensure that, for a chosen $\epsilon>0$, $$\|Q^2-G\|_F<\epsilon\implies \|G\delta+\delta G+\delta^2\|_F<\epsilon\implies \|uv^T\delta+\delta^2\|_F<\epsilon,$$ since $\delta u=0$ makes $\delta G=\delta uv^T=0$.
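
The two structural properties used above are quick to verify numerically; here is a small NumPy check (my addition, not part of the original answer):

```python
import numpy as np

u = np.ones((4, 1))                          # all-ones vector
v = np.array([[0.48, 0.24, 0.16, 0.12]]).T   # positive coordinates
G = u @ v.T                                  # rank-one construction of G

assert np.isclose(float(v.T @ u), 1.0)  # v^T u = 1, so each row of G sums to 1
assert np.allclose(G @ G, G)            # G is idempotent: G^2 = G
```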

Edit: regarding how to calculate the matrix $\delta$, I am not sure it is always possible to get a closed-form solution for $\delta$, and you may have to use some numerical technique. Basically, for a given $\epsilon$, you have to expand the squared Frobenius norm, which yields $$\operatorname{Tr}\!\left(N\,\delta^Tvv^T \delta+2\,\delta^T vu^T\delta^2+\delta^2(\delta^T)^2\right)<\epsilon^2,$$ where $N=u^Tu=4$. As I mentioned earlier, in general this will ensure that the elements of $\delta$ remain inside a hyperellipse of degree $4$.

Answer (score 0)

If $Q\ne \pm G$, then the minimum cannot be achieved. You could try to write $Q$ as $Q=G+\delta$, where $$\delta=\begin{pmatrix} \delta_1 & -\delta_1\\ \delta_2 &-\delta_2 \end{pmatrix},$$ with the deltas chosen so that they satisfy the constraints, i.e., $$-0.4\le \delta_1,\delta_2\le 0.6.$$ Then any such feasible $\delta$ will give you a solution that can make $Q^2$ arbitrarily close to $G$. For example, if you want to have $\|Q^2-G\|_F< \epsilon$, then $$\|G\delta+\delta G+\delta^2\|_F<\epsilon.$$ This confines $\delta_1,\delta_2$ to a hyperellipse of degree $4$, whose major and minor axis lengths are controlled by $\epsilon$.

Answer (score 2)

Nice answer, @Samrat Mukhopadhyay

I would like to add something that can shed supplementary light on the issue.

It is based on a property that has been overlooked, namely that $\operatorname{rank}(P)=1$, where $P$ denotes the question's matrix $G$:

$P=\begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix} \begin{pmatrix} 0.48 & 0.24 & 0.16 & 0.12 \end{pmatrix}$

Said otherwise, the Singular Value Decomposition of $P$ is

$P=\sigma_1 U_1 V_1^T=1.14543 \begin{pmatrix} .5 \\ .5 \\ .5 \\ .5 \end{pmatrix} \begin{pmatrix} 0.8381 & 0.4191 & 0.2794 & 0.2095 \end{pmatrix}$

The approximants we are looking for can always be decomposed in the form

$$A=P+\sigma_2 U_2 V_2^T+\sigma_3 U_3 V_3^T+\sigma_4 U_4 V_4^T$$

The squared error to the target, $\|A-P\|_F^2$, is thus $\sigma_2^2+\sigma_3^2+\sigma_4^2$ ($F$ denotes the Frobenius norm).
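
The rank-one structure and the quoted $\sigma_1$ can be checked numerically (a NumPy sketch of my own, using the question's numbers):

```python
import numpy as np

P = np.tile([0.48, 0.24, 0.16, 0.12], (4, 1))
s = np.linalg.svd(P, compute_uv=False)   # singular values, descending

assert np.isclose(s[0], 1.14543, atol=1e-4)  # sigma_1 as quoted above
assert np.allclose(s[1:], 0)                 # sigma_2 = sigma_3 = sigma_4 = 0, so rank(P) = 1
```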

This is by no means independent of the hyperellipse Samrat Mukhopadhyay is speaking about.

More time and space would be necessary to detail all these aspects...

Answer (score 1)

Minimizing $\|Q^2-G\|_F^2$ is equivalent to minimizing an SOS (sum-of-squares) polynomial whose variables are the non-zero entries of $Q$.

Since your constraints can also be written as polynomial constraints, this should be possible using the MOSEK.Polyopt.jl package for Julia. Here is a link to a powerpoint tutorial on how to use the package and an explanation of the technique used to solve these types of problems.

I haven't used this package, but I think it's based on techniques from the field of convex algebraic geometry, which adapts semidefinite programming to polynomial optimization.

Postscript: I see you tagged MATLAB, and it looks like there should be a similar package for MATLAB, so if you don't have Julia you could check that one out instead.

Answer (score 2)

As you ask about MATLAB, here is an implementation and solution using YALMIP (disclaimer: developed by me).

First, set up the data and model:

G = repmat([.48 .24 .16 .12],4,1);
Q = sdpvar(4,4,'full');   % full (non-symmetric) 4x4 matrix variable;
                          % note sdpvar(4) alone would be symmetric
% fix the structurally zero entries of Q
Q(1,3) = 0;
Q(3,1) = 0;
Q(2,4) = 0;
Q(4,2) = 0;

residual = Q*Q - G;
Objective = residual(:)'*residual(:);   % squared Frobenius norm
Model = [0 <= Q(:) <= 1, sum(Q,2) == 1];

Effectively, you have a nonconvex quartic optimization problem, so it is not easy to solve. The trivial approach is to simply throw a local nonlinear solver at it, such as fmincon or ipopt. fmincon solves the problem in under 0.05 s, with optimal objective $0.349$ (the squared Frobenius norm).

optimize(Model,Objective,sdpsettings('solver','fmincon'))
value(Q)

ans =

    0.3333    0.3333         0    0.3333
    0.3333    0.3333    0.3333         0
         0    0.3333    0.3333    0.3333
    0.3333         0    0.3333    0.3333
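
To double-check the reported objective, one can evaluate the returned $Q$ directly (a NumPy sketch I added; the $0.349$ is the squared Frobenius norm $\|Q^2-G\|_F^2$):

```python
import numpy as np

G = np.tile([0.48, 0.24, 0.16, 0.12], (4, 1))
Q = np.array([[1, 1, 0, 1],
              [1, 1, 1, 0],
              [0, 1, 1, 1],
              [1, 0, 1, 1]]) / 3   # the solution fmincon returned

obj = np.linalg.norm(Q @ Q - G, 'fro')**2
assert abs(obj - 0.349) < 1e-3     # matches the reported optimal objective
```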

Alternatively, we can attack the problem using a global solver, such as scip, baron, or the built-in global solver bmibnb (for this to be efficient you should preferably have a fast LP solver installed, such as gurobi, mosek, cplex, etc., although it works with MATLAB's standard LP solver):

optimize(Model,Objective,sdpsettings('solver','bmibnb','bmibnb.maxiter',1e4))

The solver immediately finds the global solution (the same one fmincon found, so the local approach actually worked well), and all the time is spent proving optimality.

Performance can be improved significantly by sparsifying the objective

e = sdpvar(16,1);
Objective = e'*e;
Model = [e == residual(:), 0<=Q(:)<=1, sum(Q,2)==1];
optimize(Model,Objective,sdpsettings('solver','bmibnb'))

Finally, we can try a semidefinite relaxation (essentially what is referred to as a sum-of-squares approach above). Luckily, this problem can be solved using a semidefinite relaxation (i.e., the relaxation is tight, and a solution can be recovered). This turns out to be the fastest approach. It requires a semidefinite solver, such as Mosek, SDPT3, SeDuMi, etc.:

residual = Q*Q-G;
Objective = residual(:)'*residual(:);
Model = [0<=Q(:)<=1, sum(Q,2)==1];
sol = optimize(Model,Objective,sdpsettings('solver','moment'))