Approximation to unsolvable system of equations


I am working on a project and need to find the "closest" numerical values that satisfy the following equations:

\begin{equation} \left\{ \begin{array}{} A \cdot C = \frac{1}{2} \\ A \cdot D = \frac{5}{6} \\ B \cdot C = \frac{1}{8} \\ B \cdot D = \frac{1}{2} \end{array} \right. \end{equation}

where $A$ and $B$ are constrained to be non-negative integers, and $C$ and $D$ must be greater than zero but less than or equal to $1$.

From what I have tried so far, it seems no analytical solution exists. For my purposes an approximate solution will suffice (one whose left-hand sides match the right-hand sides to several decimal places). Having a BS in engineering, my guilty pleasure is Excel's Solver add-in, but even using it to the best of my ability, I have yet to obtain a satisfactory approximate solution.

I am not asking anyone to "solve" this for me (though I wouldn't object), but would appreciate being pointed in the right direction. To summarize:

  • I need to find values for $A$, $B$, $C$, and $D$ which satisfy the constraints above and bring the left-hand sides as close as possible to the right-hand sides of the equations.
  • If the closest values are still unsatisfactory, is there some way of proving that they are indeed the "best" set of values (assuming the method used to find them does not inherently identify the "best" set)?
  • If there is a tradeoff between accuracy and integer size, I would prefer the smallest integers that still give reasonably close solutions (several decimal places).
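For concreteness, here is a minimal brute-force sketch of the kind of search I have in mind (Python; for fixed integers $A$ and $B$, the least-squares-optimal $C$ and $D$ have closed forms, so we can scan small integer pairs; the bound of 20 on the integers is an arbitrary choice):

```python
# Brute-force sketch: for fixed integers A, B, minimizing
# (A*C - 1/2)^2 + (B*C - 1/8)^2 over C gives a closed form
# C = (A/2 + B/8) / (A^2 + B^2), and similarly for D.

def best_cd(A, B):
    """Least-squares-optimal C, D for fixed A, B, clipped into (0, 1]."""
    n = A * A + B * B
    C = min(max((A * (1/2) + B * (1/8)) / n, 1e-12), 1.0)
    D = min(max((A * (5/6) + B * (1/2)) / n, 1e-12), 1.0)
    return C, D

def sq_error(A, B, C, D):
    return ((A*C - 1/2)**2 + (A*D - 5/6)**2
            + (B*C - 1/8)**2 + (B*D - 1/2)**2)

best = None
for A in range(21):           # arbitrary search bound
    for B in range(21):
        if A == 0 and B == 0:
            continue
        C, D = best_cd(A, B)
        err = sq_error(A, B, C, D)
        if best is None or err < best[0]:
            best = (err, A, B, C, D)

print(best)   # smallest total squared error found in the scan
```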

There are 5 answers below.

BEST ANSWER

Let

$$x_1 := \log_2 (a) \qquad \qquad x_2 := \log_2 (b) \qquad \qquad x_3 := \log_2 (c) \qquad \qquad x_4 := \log_2 (d)$$

Taking base-$2$ logarithms (which requires $a$ and $b$ to be positive, so $a, b \ge 1$) transforms the original bilinear system into the linear system

$$\begin{bmatrix} 1 & 0 & 1 & 0\\ 1 & 0 & 0 & 1\\ 0 & 1 & 1 & 0\\ 0 & 1 & 0 & 1\end{bmatrix} \begin{bmatrix} x_1\\ x_2\\ x_3\\ x_4\end{bmatrix} = \begin{bmatrix} -1\\ \log_2 \left(\frac 5 6\right)\\ -3\\ -1\end{bmatrix}$$

Find the least-squares solution $(\hat x_1, \hat x_2, \hat x_3, \hat x_4)$ and then compute the corresponding $(\hat a,\hat b,\hat c,\hat d)$.


In MATLAB:

>> A = [1 0 1 0; 1 0 0 1; 0 1 1 0; 0 1 0 1]

A =

     1     0     1     0
     1     0     0     1
     0     1     1     0
     0     1     0     1

>> b = [-1; log2(5/6); -3; -1]

b =

   -1.0000
   -0.2630
   -3.0000
   -1.0000

Unfortunately, the matrix is singular:

>> det(A)

ans =

     0

We then find one solution to the "normal equations":

>> x = (A' * A)\(A' * b)

x =

    1.7194
    0.3509
   -3.0351
   -1.6667

The corresponding $(\hat a,\hat b,\hat c,\hat d)$ is, thus:

>> y = 2.^x

y =

    3.2930
    1.2754
    0.1220
    0.3150

We introduce a new matrix:

>> Y = y * y'

Y =

   10.8437    4.1997    0.4017    1.0372
    4.1997    1.6266    0.1556    0.4017
    0.4017    0.1556    0.0149    0.0384
    1.0372    0.4017    0.0384    0.0992

The error in the approximated products is:

>> Y(1:2,3:4) - [1/2,5/6;1/8,1/2]

ans =

   -0.0983    0.2039
    0.0306   -0.0983

which is quite large.
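For readers without MATLAB, the same computation can be sketched in Python/NumPy. Note that `lstsq` handles the singular matrix directly by returning the minimum-norm least-squares solution; that $x$ differs from the normal-equations solution above, but the fitted values $A x$ (and hence the products and the error matrix) are the same for any least-squares solution.

```python
import numpy as np

A = np.array([[1, 0, 1, 0],
              [1, 0, 0, 1],
              [0, 1, 1, 0],
              [0, 1, 0, 1]], dtype=float)
b = np.array([-1.0, np.log2(5/6), -3.0, -1.0])

# lstsq copes with the rank-deficient A (rank 3) by returning the
# minimum-norm least-squares solution.
x, residuals, rank, sv = np.linalg.lstsq(A, b, rcond=None)
y = 2.0 ** x                                  # (a, b, c, d)

products = np.outer(y, y)[:2, 2:]             # [[a*c, a*d], [b*c, b*d]]
targets = np.array([[1/2, 5/6], [1/8, 1/2]])
print(products - targets)                     # matches the MATLAB error matrix
```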

ANSWER

Hint: If you stuff the variables into vectors, your problem becomes:

$$M = \left[\begin{array}{c}A\\B\end{array}\right]\left[\begin{array}{cc}C&D\end{array}\right] = \left[\begin{array}{rr} \frac{1}{2} & \frac 56\\\frac 1 8&\frac 1 2 \end{array}\right]$$

This means we want to find the best rank 1 match to the matrix.


Thanks to @Ian's comment, which clarified nicely that one can use, for example, the singular value decomposition (SVD) to find the best match.

The SVD says $M = U\Sigma V^*$, where the values on the diagonal of $\Sigma$ are the singular values. If we pick the column of $U$ and the row of $V^*$ that correspond to the largest singular value, we can identify $A, B, C$ and $D$ above.


Some Matlab/Octave code to do the SVD approximation:

  M = [1/2,5/6;1/8,1/2];
  [U,S,V]=svd(M);
  U(:,1)*S(1,1)*V(:,1)'   % rank 1 SVD appr. as index 1 contains largest s.v.

$$\left[\begin{array}{cc} 0.445505865263938&0.861513345600555\\ 0.230380036476643&0.445505865263938 \end{array}\right]$$

  abs(M-U(:,1)*S(1,1)*V(:,1)')  % and the absolute error

$$\left[\begin{array}{cc} 0.0544941347360618&0.028180012267222\\ 0.105380036476643&0.0544941347360618 \end{array}\right]$$

We can see that the error is unevenly distributed: the largest entry ($\approx 0.105$) is about $0.105/0.028 \approx 3.74$ times the smallest ($\approx 0.028$). As mentioned in the comments, if we are not happy with this distribution of the error, we may want to minimize the error in some other norm.
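The same rank-1 approximation can be reproduced in Python/NumPy (a sketch equivalent to the Octave code above):

```python
import numpy as np

M = np.array([[1/2, 5/6],
              [1/8, 1/2]])

# Best rank-1 approximation in the Frobenius norm (Eckart-Young):
# keep only the largest singular value and its singular vectors.
U, s, Vt = np.linalg.svd(M)
M1 = s[0] * np.outer(U[:, 0], Vt[0, :])

print(M1)              # the rank-1 approximation
print(np.abs(M - M1))  # entrywise absolute error
```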

ANSWER

Another way to approach this problem is with a least-squares estimator.

Define $$F(A,B,C,D)=(AC-1/2)^2+(AD-5/6)^2+(BC-1/8)^2+(BD-1/2)^2$$

Now calculate the gradient of $F$ $$\nabla F=\left[\dfrac{\partial F}{\partial A},\cdots,\dfrac{\partial F}{\partial D}\right]$$

Then set the gradient to zero and solve the resulting nonlinear system using the Newton–Raphson algorithm (you should also check the positive definiteness of the Hessian at the candidate solution).
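A sketch of this idea in Python (relaxing the integrality of $A$ and $B$; plain gradient descent is used instead of Newton–Raphson here, because the scaling invariance $(A,B,C,D)\mapsto(kA,kB,C/k,D/k)$ makes the Hessian singular along one direction):

```python
import numpy as np

targets = np.array([[1/2, 5/6],
                    [1/8, 1/2]])

def grad_F(v):
    """Analytic gradient of F; v = (A, B, C, D), integrality relaxed."""
    A, B, C, D = v
    r = np.array([[A*C, A*D], [B*C, B*D]]) - targets   # residuals
    return 2 * np.array([
        C * r[0, 0] + D * r[0, 1],    # dF/dA
        C * r[1, 0] + D * r[1, 1],    # dF/dB
        A * r[0, 0] + B * r[1, 0],    # dF/dC
        A * r[0, 1] + B * r[1, 1],    # dF/dD
    ])

v = np.array([1.0, 1.0, 0.5, 0.5])    # arbitrary starting point
for _ in range(20000):
    v -= 0.05 * grad_F(v)             # fixed small step size

A, B, C, D = v
F = ((A*C - 1/2)**2 + (A*D - 5/6)**2
     + (B*C - 1/8)**2 + (B*D - 1/2)**2)
print(v, F)   # F approaches the best rank-1 fit error (about 0.0178)
```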

ANSWER

The OP's last bullet point led me to a different attack on this problem.

First, note that if $(a,b,c,d)$ is a suitable approximate solution, then so is $(k a, k b, c/k, d/k)$ for any positive integer $k$. The last bullet prefers smaller integers, so we will want $\mathrm{gcd}(a,b) = 1$.

Suppose we relax the equalities to memberships in intervals of a given radius: \begin{align} a c &\in [1/2 - \varepsilon, 1/2 + \varepsilon] = 1/2 + \varepsilon[-1,1]\\ a d &\in [5/6 - \varepsilon, 5/6 + \varepsilon] = 5/6 + \varepsilon[-1,1]\\ b c &\in [1/8 - \varepsilon, 1/8 + \varepsilon] = 1/8 + \varepsilon[-1,1]\\ b d &\in [1/2 - \varepsilon, 1/2 + \varepsilon] = 1/2 + \varepsilon[-1,1] \text{,} \end{align} where the first equality uses usual interval notation and the second form uses usual interval arithmetic notation.

Then we can control the number of digits of agreement by setting $\varepsilon$ to suitable negative powers of $10$.

This suggests a process:

  • Pick a $(c,d) \in (0,1] \times (0,1]$.
  • Find the smallest $\varepsilon$ so that there is still an integer point, $(a,b)$, satisfying the four membership relations.
  • Find the largest region on the $c$-$d$ plane where this $(a,b)$ has minimal epsilon of all integer points.

This allows us to partition the square $(0,1] \times (0,1]$ into regions where a particular integer point is optimal and search for the point in that region with minimal $\varepsilon$. In fact, once $a$ and $b$ are fixed, this is a search with linear constraints, so is pretty easy.

By computer algebra system (Mathematica 10.4.1), for $(c,d) = (1/6,1/2)$, the minimum $\varepsilon$ still admitting an integer point satisfying the membership relations is $\varepsilon = 1/6$ for $(a,b) = (2,1)$. For this $(a,b)$, the largest region in the $c$-$d$ plane where each point has minimal $\varepsilon$ at this integer point is $(c,d) \in [1/5,1/4]\times[1/3,5/9]$. In that region, the minimum value of $\varepsilon$ is $1/12$, attained at $(c,d) = (5/24, 5/12)$. This is not so impressive (since $\varepsilon = 1/12$ means that we only have a trifle more than one decimal agreement with the original equalities). (There is another minimum in this region, at $(c,d) = (5/24,7/16)$ yielding the same extreme value of $\varepsilon = 1/12$. However, there do not appear to be criteria for preferring points on the $c$-$d$ plane.)
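The $\varepsilon = 1/12$ value for $(a,b) = (2,1)$ can be checked numerically (a Python sketch; since $c$ and $d$ decouple, the smallest achievable $\varepsilon$ for a fixed integer point is the larger of two one-dimensional minimax problems):

```python
import numpy as np

a, b = 2, 1
grid = np.linspace(1e-6, 1.0, 200001)   # fine grid over (0, 1]

# epsilon must cover both c-constraints and both d-constraints;
# c and d can be optimized independently.
eps_c = np.maximum(np.abs(a * grid - 1/2), np.abs(b * grid - 1/8)).min()
eps_d = np.maximum(np.abs(a * grid - 5/6), np.abs(b * grid - 1/2)).min()
print(max(eps_c, eps_d))   # about 1/12, matching the minimum reported above
```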

(It's late/early here, so I'll continue hacking on this analytically some time in the next few days. Note to self: Use the $(ka, kb, c/k, d/k)$ relation to rewrite $c$ and $d$ ($\in [1,\infty)$) to give solutions $(ka, kb, kc, kd)$, then projectively set $b=1$ (or $a=1$, if there are also solutions with $b>a$).)

ANSWER

$$\begin{bmatrix} a\\ b\\ c\\ d\end{bmatrix} \begin{bmatrix} a\\ b\\ c\\ d\end{bmatrix}^\top = \begin{bmatrix} a^2 & a b & a c & a d\\ a b & b^2 & b c & b d\\ a c & b c & c^2 & c d\\ a d & b d & c d & d^2\end{bmatrix} = \begin{bmatrix} a^2 & a b & \frac 12 & \frac 56\\ a b & b^2 & \frac 18 & \frac 12\\ \frac 12 & \frac 18 & c^2 & c d\\ \frac 56 & \frac 12 & c d & d^2\end{bmatrix}$$

Let us look for a symmetric, positive semidefinite, rank-$1$ matrix $\mathrm X \in \mathbb R^{4 \times 4}$ whose northeast and southwest $2 \times 2$ blocks are equal to the northeast and southwest $2 \times 2$ blocks in the matrix above.

Hence, we have a rank-minimization problem

$$\begin{array}{ll} \text{minimize} & \mbox{rank} (\mathrm X) \\ \text{subject to} & x_{13} = \frac 12, \quad x_{14} = \frac 56\\ & x_{23} = \frac 18, \quad x_{24} = \frac 12\\ & \mathrm X = \mathrm X^\top\\ & \mathrm X \succeq \mathrm O_4\end{array}$$

Unfortunately, minimizing the rank is computationally hard [0]. Thus, let us minimize the nuclear norm of $\mathrm X$ instead. Since $\mathrm X$ is symmetric and positive semidefinite, its nuclear norm is given by

$$\|\mathrm X\|_* = \mbox{tr} (\mathrm X)$$
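This identity is easy to check numerically (a sketch): for a symmetric positive semidefinite matrix the singular values coincide with the eigenvalues, so their sum equals the trace.

```python
import numpy as np

rng = np.random.default_rng(0)
G = rng.standard_normal((4, 4))
X = G @ G.T                                   # symmetric PSD by construction

nuclear = np.linalg.svd(X, compute_uv=False).sum()   # nuclear norm
print(nuclear, np.trace(X))                   # the two agree
```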

Hence, we obtain the following semidefinite program (SDP)

$$\begin{array}{ll} \text{minimize} & \mbox{tr} (\mathrm X) \\ \text{subject to} & x_{13} = \frac 12, \quad x_{14} = \frac 56\\ & x_{23} = \frac 18, \quad x_{24} = \frac 12\\ & \mathrm X = \mathrm X^\top\\ & \mathrm X \succeq \mathrm O_4\end{array}$$

Using MATLAB + CVX,

clear all; clc;

cvx_begin sdp
    variable X(4,4) symmetric
    minimize( trace(X) )
    subject to
        X(1,3)==1/2
        X(1,4)==5/6
        X(2,3)==1/8
        X(2,4)==1/2
        X >= 0
cvx_end

% rank of X
disp('rank(X) ='); disp(rank(X))

% build rank-1 matrix
[U,S,V] = svd(X);
X_tilde = U(:,1) * S(1,1) * (V(:,1)');
disp('rank-1 approximation of X ='); disp(X_tilde)

% approximation error
disp('approximation error = '); disp(X_tilde - X)

% error matrix
disp('error matrix = '); disp(X_tilde(1:2,3:4) - [1/2,5/6;1/8,1/2])

Running this MATLAB script, we obtain the following output

Calling sedumi: 10 variables, 4 equality constraints
------------------------------------------------------------
SeDuMi 1.21 by AdvOL, 2005-2008 and Jos F. Sturm, 1998-2003.
Alg = 2: xz-corrector, Adaptive Step-Differentiation, theta = 0.250, beta = 0.500
eqs m = 4, order n = 5, dim = 17, blocks = 2
nnz(A) = 4 + 0, nnz(ADA) = 16, nnz(L) = 10
 it :     b*y       gap    delta  rate   t/tP*  t/tD*   feas cg cg  prec
  0 :            9.09E+000 0.000
  1 :  2.28E+000 2.11E+000 0.000 0.2322 0.9000 0.9000   1.21  1  1  7.9E-001
  2 :  2.30E+000 4.47E-001 0.000 0.2117 0.9000 0.9000   1.18  1  1  1.8E-001
  3 :  2.45E+000 1.33E-002 0.000 0.0297 0.9900 0.9900   1.01  1  1  5.5E-003
  4 :  2.45E+000 3.56E-007 0.000 0.0000 1.0000 1.0000   1.00  1  1  1.4E-007
  5 :  2.45E+000 5.10E-014 0.000 0.0000 1.0000 1.0000   1.00  1  1  2.1E-014

iter seconds digits       c*x               b*y
  5      0.2  13.8  2.4509068616e+000  2.4509068616e+000
|Ax-b| =  4.5e-015, [Ay-c]_+ =  0.0E+000, |x|= 2.2e+000, |y|= 2.8e+000

Detailed timing (sec)
   Pre          IPM          Post
1.719E-001    2.344E-001    7.813E-002   
Max-norms: ||b||=8.333333e-001, ||c|| = 1,
Cholesky |add|=0, |skip| = 0, ||L.L|| = 1.41176.
------------------------------------------------------------
Status: Solved
Optimal value (cvx_optval): +2.45091
rank(X) =
     4

rank-1 approximation of X =
    0.8615    0.4455    0.4455    0.8615
    0.4455    0.2304    0.2304    0.4455
    0.4455    0.2304    0.2304    0.4455
    0.8615    0.4455    0.4455    0.8615

approximation error =
   -0.0282    0.0545   -0.0545    0.0282
    0.0545   -0.1054    0.1054   -0.0545
   -0.0545    0.1054   -0.1054    0.0545
    0.0282   -0.0545    0.0545   -0.0282

error matrix =
   -0.0545    0.0282
    0.1054   -0.0545

Note that the maximum error is smaller than when using least-squares.


[0] Emmanuel Candès, The Effectiveness of Convex Programming in the Information and Physical Sciences [PDF], Simons Institute Open Lecture, UC Berkeley, October 2013.