Triangular solver memory complexity


Does anybody know the memory complexity of a triangular solver?

PyTorch Documentation: https://pytorch.org/docs/stable/generated/torch.triangular_solve.html

More about Triangular solver: http://homepages.math.uic.edu/~jan/mcs572/pipetriangular.pdf

I'm trying to write some code for kernel computations related to Gaussian processes and I'm facing some memory issues. I'm not sure whether the problem is that I'm doing something wrong, or whether the triangular solver requires more than linear memory.

The pseudocode for my case is:

lambda t:f64[20000,20000] u:f64[20000,10]:
    v:f64[20000,10] = triangular_solve[
        conjugate_a=False
        left_side=True
        lower=False
        transpose_a=True
        unit_diagonal=False
    ] t u
    w:f64[20000,10] = triangular_solve[
        conjugate_a=False
        left_side=True
        lower=False
        transpose_a=False
        unit_diagonal=False
    ] t v
    v w
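The pseudocode above reads like a JAX jaxpr: two triangular solves against the same matrix `t`, first with the transpose, then without. Assuming that reading, a rough SciPy equivalent (with `n` and `k` shrunk from 20000 and 10 so it runs quickly, and `solve_triangular` standing in for the JAX primitive) would be:

```python
import numpy as np
from scipy.linalg import solve_triangular

rng = np.random.default_rng(0)
n, k = 200, 10  # small stand-ins for 20000 and 10

# A well-conditioned upper-triangular matrix (lower=False in the jaxpr)
# and a right-hand side, matching the shapes of t and u above.
t = np.triu(rng.standard_normal((n, n))) + n * np.eye(n)
u = rng.standard_normal((n, k))

# transpose_a=True: solve t.T @ v = u
v = solve_triangular(t, u, lower=False, trans='T')
# transpose_a=False: solve t @ w = v
w = solve_triangular(t, v, lower=False)
```

If `t` is an upper-triangular Cholesky factor R with K = Rᵀ R, this pair of solves computes w = K⁻¹ u, which is the usual pattern in Gaussian-process code.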

There is 1 answer below

BEST ANSWER

A triangular linear solve can be performed in-place: the coefficients of the triangular matrix are read off and used to perform row operations directly on the right-hand side. The memory usage is then only about as large as the storage of the triangular matrix and the right-hand side, plus a small amount of additional memory for intermediate arithmetic at each row operation.
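To illustrate the in-place property, here is a minimal back-substitution sketch for an upper-triangular system (the function name and loop structure are for illustration, not any library's implementation). The right-hand side `b` is overwritten with the solution, so beyond the storage of `t` and `b` only scalar temporaries are needed:

```python
import numpy as np

def solve_upper_inplace(t, b):
    """Solve t @ x = b for upper-triangular t, overwriting b with x.

    Extra memory beyond t and b is O(1) per step: each row of the
    solution is formed by a row operation on b using rows below it.
    """
    n = t.shape[0]
    for i in range(n - 1, -1, -1):
        # Subtract contributions of already-solved rows below row i,
        # then divide by the diagonal entry.
        b[i] -= t[i, i + 1:] @ b[i + 1:]
        b[i] /= t[i, i]
    return b
```

The same in-place structure applies to lower-triangular (forward) substitution, just with the loop running top to bottom.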

With that being said, are you performing the LU decomposition of a $20000 \times 20000$ matrix? And is that matrix dense? If you're using double precision, then just storing that matrix in memory costs $(20000)^2 \cdot 8 = 3.2 \times 10^9$ bytes, or 3.2 GB of memory. On modern machines this might not be that prohibitive, but if this is just one step out of many then that might explain your memory issues.
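As a quick sanity check on that arithmetic:

```python
n = 20000          # matrix dimension from the question
bytes_per_f64 = 8  # double precision
gb = n * n * bytes_per_f64 / 1e9
print(gb)          # 3.2 (GB) for a single dense f64 matrix
```

And note the jaxpr produces two $20000 \times 10$ outputs on top of that, though at 1.6 MB each those are negligible next to the matrix itself.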