Optimizing rank computation for very large sparse matrices

101 Views Asked by At

I have a sparse matrix such as

A =

   (1,1)        1
   (3,1)        1
   (1,2)        1
   (2,2)        1
   (1,3)        1
   (3,3)        1
   (4,3)        1
   (4,4)        1

The full matrix of A can see look like as following:

full(A) =

     1     1     1     0
     0     1     0     0
     1     0     1     0
     0     0     1     1

I want to find the rank of matrix A by fast way(because my matrix can extend to 10000 x 20000). I try to do it by two ways but it give the different result

  1. Convert to full matrix and find rank using

    rank(full(A)) = 3
    
  2. Find the rank using sprank

    sprank(A) = 4
    

The true answer must be 3 that means using first way. However, it take long time to find the rank,especially matrix with large size. I know the reason why the second way give 4 because sprank only tells you how many rows/columns of your matrix have non-zero elements, while rank is reporting the actual rank of the matrix which indicates how many rows of your matrix are linearly independent. sprank(A) is 4 but rank(A) is only 3 because you can write the third row as a linear combination of the other rows, specifically A(2,:) - A(1,:).

My problem is that how to find the rank of a sparse matrix with lowest time consumption

Update: I tried to use some way. However, it reported larger time consumption comparison with rank function. Could you suggest to me other way?

%% Create random matrix
 G = sparse(randi(2,1000,1000))-1;
 A=sparse(G) %% Because my input matrix is sparse matrix
 %% Measure performance
>> tic; rank(full(A)); toc
Elapsed time is 0.710750 seconds.
>> tic; svds(A); toc
Elapsed time is 1.130674 seconds.
>> tic; eigs(A); toc
Warning: Only 3 of the 6 requested eigenvalues converged. 
> In eigs>processEUPDinfo at 1472
  In eigs at 365
Elapsed time is 4.894653 seconds.