Computationally inexpensive comparison between two matrices


I have a symmetric $100\times100$ matrix $A$ with elements $A_{ij} < 100$. I have to find whether this matrix is equal or similar to another matrix $B$ of the same configuration. As these matrices arise at different points in time, I can't do an element-by-element comparison. I am looking to compute some kind of base or invariant from the first matrix, store it, compute the same base or invariant from the second matrix, and then compare whether they are equal or not.

I want to know if there is any way to do this in a computationally feasible way. In light of the spectral theorem, is there a way to find invariant factors with less computation?


There are 3 answers below.


As you have symmetric matrices with real entries, their eigenvalues are real numbers.

This does not mean you have to compute the eigenvalues. Your plan to use invariants such as the trace, the determinant, etc. is very reasonable.

I would even advise you to restrict your investigations at first to the trace $\sum_i A_{ii}$; it is very light, computationally speaking (unlike the determinant). Work in this way:

  • Compute and store the trace of each matrix.

  • Once this has been done, look for pairs $(A,A')$ of matrices with the same trace (or maybe whose traces differ by less than a very small amount),

  • then compare those matrix pairs:

    a) entrywise for the case $A=A'$.

    b) If $A \neq A'$, they can still be similar; I would then advise a second filter: test whether $\det(A)=\det(A')$. If that is not the case, you can reject the pair; otherwise, now and only now, proceed to the computation of both spectra and test whether they are the same.
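A minimal sketch of this staged filter in Python/NumPy (the function name and tolerances are my own choices, not part of the answer):

```python
import numpy as np

def maybe_equal_or_similar(A, B, tol=1e-9):
    """Staged filter: trace -> entrywise equality -> determinant -> spectra.
    Only reaches the eigenvalue computation if all cheaper filters pass."""
    A, B = np.asarray(A, dtype=float), np.asarray(B, dtype=float)
    # 1) cheapest invariant first: the trace, O(n)
    if abs(np.trace(A) - np.trace(B)) > tol:
        return False
    # 2) exact entrywise equality handles the case A = A'
    if np.array_equal(A, B):
        return True
    # 3) second filter: determinants (O(n^3) via LU factorization)
    if not np.isclose(np.linalg.det(A), np.linalg.det(B), rtol=1e-9, atol=tol):
        return False
    # 4) now and only now: compare the full spectra
    # (eigvalsh returns eigenvalues of a symmetric matrix in ascending order)
    return bool(np.allclose(np.linalg.eigvalsh(A), np.linalg.eigvalsh(B), atol=1e-8))
```

For example, $\begin{pmatrix}2&1\\1&3\end{pmatrix}$ and $\begin{pmatrix}3&1\\1&2\end{pmatrix}$ are similar (same trace $5$, same determinant $5$, same spectrum), so they survive every stage.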


The similarity class of a real symmetric matrix is completely determined by its multiset of real eigenvalues. Therefore, the best you can do to detect a similar matrix with certainty is to compute and store all 100 eigenvalues $\lambda_1,...,\lambda_{100}$ (with multiplicities), e.g. in increasing order to simulate unordered tuples, and then compare these 100-vectors with each other instead of the 10,000 matrix entries.
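This can be sketched in Python/NumPy as follows (the rounding parameter is an assumption of mine, to absorb floating-point noise so that numerically equal spectra produce identical, directly comparable keys):

```python
import numpy as np

def spectral_key(A, decimals=8):
    """100-number fingerprint of a symmetric matrix: its eigenvalues
    with multiplicities. eigvalsh returns them in ascending order,
    which simulates comparing unordered tuples; rounding lets the
    keys be stored and compared (or hashed) directly."""
    w = np.linalg.eigvalsh(np.asarray(A, dtype=float))
    return tuple(np.round(w, decimals))
```

Store `spectral_key(A)` instead of the full matrix; two matrices whose keys agree are (numerically) similar.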


Another idea

I just realized that there might be other sets of invariants too. Note that there always have to be 100 of them, though.

One idea I had is that you can compute the characteristic polynomial of $A$, normalize it, and store its 100 non-trivial (i.e. non-leading) coefficients. This is (if you know how to do it) the effort of computing one determinant. Since the characteristic polynomial uniquely determines the eigenvalues, storing it is an alternative to computing them. You then only have to compare the coefficients of these polynomials to compare the matrices.

This has several advantages and disadvantages compared to the eigenvalue approach. One advantage is (depending on your use case) that if your matrix has only integer entries, then these coefficients will be integers, hence you do not need (rounding-error-prone) floating-point arithmetic. A disadvantage might be stability and the size of the resulting numbers. Since one of them is (up to sign) the determinant of your matrix, it can be huge (on the order of $10^{100}$). You will therefore need special data types, or you again use floating-point arithmetic with double (or more) precision.
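For integer matrices, the coefficients can be computed exactly with the Faddeev–LeVerrier recurrence (my choice of algorithm, not the answer's); Python's arbitrary-precision integers then handle the $\sim 10^{100}$-sized coefficients without any special data type. A minimal sketch:

```python
def charpoly_coeffs(A):
    """Faddeev-LeVerrier recurrence: exact coefficients [1, c1, ..., cn] of
    det(x*I - A) = x^n + c1*x^(n-1) + ... + cn.
    For integer input all arithmetic stays in Python's big integers;
    the division by k in the recurrence is known to be exact."""
    n = len(A)
    M = [[int(i == j) for j in range(n)] for i in range(n)]  # M_1 = I
    coeffs, c = [1], 1
    for k in range(1, n + 1):
        if k > 1:
            # M_k = A @ M_{k-1} + c_{k-1} * I
            AM = [[sum(A[i][t] * M[t][j] for t in range(n)) for j in range(n)]
                  for i in range(n)]
            M = [[AM[i][j] + c * int(i == j) for j in range(n)] for i in range(n)]
        # c_k = -trace(A @ M_k) / k   (exact integer division)
        tr = sum(A[i][t] * M[t][i] for i in range(n) for t in range(n))
        c = -tr // k
        coeffs.append(c)
    return coeffs
```

Two integer symmetric matrices are then similar if and only if their coefficient lists agree exactly; no floating point is involved.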

There might be other sets of invariants which balance this effort of computation and stability, but I am not familiar with any of them. Note, however, that the asymptotic complexity of computing a determinant is usually higher than that of computing eigenvalues. Whether it is better for your case, you have to find out yourself.


It's an ill-posed problem: if we know only an approximation of the $(A_{i,j})$, then at best we can say that $A, B$ are almost similar. So we assume that the $(A_{i,j})$ are exactly known.

Under these conditions (since the matrices considered are diagonalizable), at each step just store the 100 coefficients of the characteristic polynomial of $A$.

There are randomized algorithms that do the job in $O(n^3)$ operations in $\mathbb{R}$. If the $(A_{i,j})$ are integers, there is an algorithm that does the job in essentially $O(n^{3.5}\log(\|A\|))$ bit operations (omitting the factors in $\log(n)$ and $\log(\log(\|A\|))$).

When $A\in M_{100}(\mathbb{Z})$, the computation time with Maple (which is a very slow software) is about $0.14$ s. As for storing the coefficients (at each step): roughly $60$ of the $100$ coefficients lie in the segment $[10^{100},10^{260}]$; that is, each stored polynomial takes $\approx 14000$ (decimal) digits.