Simple solutions to linear systems of equations

52 Views Asked by At

I am facing a problem which I think boils to down sparse decomposition. However, the solvers I am using don't seem to be producing sensible results and I'm wondering if I'm perhaps applying the wrong algorithm.

The problem can be expressed as follows: solve the tuples $(x_{ij}, y_{ij})$ for $1\leq i\leq n$ and $1\leq j\leq m$ where:

  • $x_{ij}$ and $y_{ij}$ are all non-negative integers
  • The ratio $\frac{x_{ij}}{y_{ij}}$ is pretty constant; maybe it varies between 0.5 and 1.5 or so
  • If $x_{ij}$ is zero then $y_{ij}$ is zero and vice-versa
  • For every $j$ we know $\mathbf{a}_j=\sum_{i=1}^n (x_{ij}, y_{ij})$
  • For every $i$ we know $\mathbf{b}_i=\sum_{j=1}^m (x_{ij}, y_{ij})$

The above problem isn't well-posed since we may have infinite possible solutions. What we want is to find the one that represents the "simplest" decomposition. For example, given the following

  • $\mathbf{a}_1 = (123, 100)$, $\mathbf{a}_2 = (123, 200)$, $\mathbf{a}_3 = (400, 300)$
  • $\mathbf{b}_1 = (123, 100)$, $\mathbf{b}_2 = (523, 500)$

You can see that $\mathbf{a}_1$ and $\mathbf{b}_1$ are equal. So if we set

$$ (x_{ij}, y_{ij}) = \begin{cases} (123, 100)&\text{for }i=1, j=1\\ (0, 0)&\text{otherwise} \end{cases} $$

Then we've solved those two. Then we can see that $\mathbf{a}_2+\mathbf{a}_3==\mathbf{b}_3$ so we can add a couple more clauses and we are done:

$$ (x_{ij}, y_{ij}) = \begin{cases} (123, 100)&\text{for }i=1, j=1\\ (123, 200)&\text{for }i=2, j=2\\ (400, 300)&\text{for }i=2, j=3\\ (0, 0)&\text{otherwise} \end{cases} $$

I have a couple of thousand cases of this problem to solve, where $n$ and $m$ range from 1 up to around 60. So not huge but it of course does get more complex than the above! The values of $\mathbf{a}_j$ and $\mathbf{b}_i$ are often six or eight digits long, and I have the feeling that there is a lot of information encoded in the digits that I can leverage to my advantage.

What I have done so far is to frame this problem as finding the sparsest solution. I've discovered algorithms like positive orthogonal matching pursuit and extensions to a multi-matching cases, and they do give me sparse solutions. However, they also miss really obvious matches like the $\mathbf{a}_1 == \mathbf{b}_1$ case above. I can filter such obvious solutions out before running my solver, but I'm wondering if I'm actually using the wrong algorithm. Matching pursuit looks to start with the waveform that gives the largest product with the signal, when really I care about the closest.

Any advice much appreciated!