Efficiently solve a system of equations in C++ for only certain degrees of freedom given a known structure


I have an algorithm such that at some point I must solve the following system for $X_5$:

$$ \left( \begin{array}{cccccccccc} A_1& B_1& & C_1& & & & & \\ B_7& A_2& B_2& & C_2& & & & \\ & B_8& A_3& & & C_3& & & \\ C_7& & & A_4& B_3& & C_4& & \\ & C_8& & B_9& A_5& B_4& & C_5& \\ & & C_9& & B_{10}& A_6& & & C_6\\ & & & C_{10}& & & A_7& B_5& \\ & & & & C_{11}& & B_{11}& A_8& B_6\\ & & & & & C_{12}& & B_{12}& A_9\\ \end{array}\right) \left( \begin{array}{c} X_1\\ X_2\\ X_3\\ X_4\\ X_5\\ X_6\\ X_7\\ X_8\\ X_9\\ \end{array}\right) = \left( \begin{array}{c} R_1\\ R_2\\ R_3\\ R_4\\ R_5\\ R_6\\ R_7\\ R_8\\ R_9\\ \end{array}\right) $$

where $A_i$ is an invertible $N\times N$ matrix, $B_i$ and $C_i$ are general $N\times N$ matrices, and each $R_i$ is a vector of length $N$. $N$ is anywhere from $\mathcal O (100)$ to $\mathcal O(1000)$. The matrix as a whole is guaranteed to be invertible.

So far my strategy has been to perform standard Gaussian elimination, except stopping early during the backsolve step once $X_5$ has been solved for.

Another thing I tried is skipping all arithmetic outside of the known band structure, which offers some speedup.

However, I am betting there is some clever trick to exploit here.

How can I efficiently solve this system numerically?


As already mentioned by tch, naive Gaussian elimination has complexity $O(n^3)$, where $n = 9N$ is the dimension of the full matrix, and it takes no advantage of sparsity, unlike other methods. From personal experience with C++ I can recommend:

Eigen - a C++ template library with a collection of direct and iterative solvers, with sparse-matrix support. Iterative solvers are generally much faster than direct solvers on large sparse systems, but conjugate-gradient-type methods require a symmetric positive-definite matrix, and all iterative methods need a decent condition number (preconditioners can partially alleviate this).

SuperLU - a great solver if you can't guarantee positive definiteness of your matrix. It computes a sparse LU factorization and then solves by forward and back substitution. It makes good use of sparsity, and its distributed variant (SuperLU_DIST) uses MPI, allowing very large systems to be solved in a distributed and memory-efficient manner.

One more piece of advice: try Python before committing to C++. The advantage of C++ is that you can get cutting-edge performance, and some numerical libraries are only available in C++. The disadvantage is that installing libraries and making sure they behave the way you want is a headache that can take a few days if you are unlucky. With Python, on the other hand, while you may not find everything you want, if you are lucky you are done in a matter of hours. So I recommend you try the sparse linear solvers from scipy first; they may work out of the box. The numerical kernels behind Python's scientific stack are written in compiled languages (C, C++, Fortran), so there should not be much loss of performance.
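To make this concrete, here is a minimal sketch of the scipy route: it assembles the question's block pattern with `scipy.sparse` and solves with `spsolve` (which uses SuperLU under the hood). The small random blocks are placeholders standing in for the real $A_i$, $B_i$, $C_i$, and the diagonal shift is only there to keep the toy matrix well-conditioned.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

rng = np.random.default_rng(1)
N = 50  # block size; the real problem has N ~ 100 to 1000

def blk():
    return rng.standard_normal((N, N))

# Assemble the 9x9 block matrix: A_i on the diagonal, B_i on the first
# block off-diagonals (only within each group of three), and C_i on the
# third block off-diagonals, mirroring the pattern in the question.
blocks = [[None] * 9 for _ in range(9)]
for i in range(9):
    blocks[i][i] = blk() + 5 * N * np.eye(N)  # stand-in for A_{i+1}
for i in range(8):
    if (i + 1) % 3 != 0:          # no B coupling across group boundaries
        blocks[i][i + 1] = blk()  # upper B
        blocks[i + 1][i] = blk()  # lower B
for i in range(6):
    blocks[i][i + 3] = blk()      # upper C
    blocks[i + 3][i] = blk()      # lower C

M = sp.bmat(blocks, format="csc")
R = rng.standard_normal(9 * N)

X = spla.spsolve(M, R)   # sparse direct solve via SuperLU
X5 = X[4 * N:5 * N]      # the block of interest
print(np.allclose(M @ X, R))
```

Note that `spsolve` still computes the whole solution vector; it simply exploits sparsity to do so cheaply, and $X_5$ is then sliced out at the end.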

Finally, if you really want to go hardcore and try to modify your matrix in such a way that it is possible to solve it faster, I recommend that you have a look at how preconditioners work. It is a science in itself. It is sometimes possible to find a very nice trick for a very special kind of matrix, but in general there is no silver bullet.

EDIT: Actually, I do have an idea based on block elimination (a Schur complement), but it assumes that certain blocks of your matrix are invertible. Grouping the unknowns as $Y_1 = (X_1, X_2, X_3)$, $Y_2 = (X_4, X_5, X_6)$, $Y_3 = (X_7, X_8, X_9)$, your large matrix can be rewritten as

$$M = \begin{pmatrix} M_{11} & M_{12} & 0\\ M_{21} & M_{22} & M_{23}\\ 0 & M_{32} & M_{33} \end{pmatrix}$$

Assuming that $M_{11}$ and $M_{33}$ are invertible, the $3\times 3$ block system can be reduced to a single equation for $Y_2$. If my system is

$$\begin{pmatrix} M_{11} & M_{12} & 0\\ M_{21} & M_{22} & M_{23}\\ 0 & M_{32} & M_{33} \end{pmatrix} \begin{pmatrix} Y_1 \\ Y_2 \\ Y_3 \end{pmatrix} = \begin{pmatrix} R_1 \\ R_2 \\ R_3 \end{pmatrix} $$

then I can eliminate $Y_1$ and $Y_3$ by hand (block Gaussian elimination) to obtain

$$(M_{22} - M_{21}M_{11}^{-1}M_{12} - M_{23}M_{33}^{-1} M_{32})Y_2 = R_2 - M_{21}M_{11}^{-1}R_1 - M_{23}M_{33}^{-1}R_3$$

Thus you have a smaller linear system for $Y_2$ (the Schur-complement system), which you can again solve directly or iteratively; since $X_5$ is a component of $Y_2$, this avoids ever computing $Y_1$ and $Y_3$. Note that you need not explicitly invert matrices to form products like $M_{11}^{-1}M_{12}$: solving $M_{11}Z = M_{12}$ gives the same result, and many libraries offer this operation directly.
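The elimination above can be sketched numerically as follows. This is an illustrative example with small random blocks (the diagonal shifts are only there to make the toy blocks comfortably invertible); `np.linalg.solve(M11, M12)` computes $M_{11}^{-1}M_{12}$ without forming the inverse.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4  # block size; illustrative only, the question has N ~ 100 to 1000

# Hypothetical random blocks, shifted so M11, M22, M33 are invertible.
M11 = rng.standard_normal((n, n)) + n * np.eye(n)
M12 = rng.standard_normal((n, n))
M21 = rng.standard_normal((n, n))
M22 = rng.standard_normal((n, n)) + n * np.eye(n)
M23 = rng.standard_normal((n, n))
M32 = rng.standard_normal((n, n))
M33 = rng.standard_normal((n, n)) + n * np.eye(n)
R1, R2, R3 = (rng.standard_normal(n) for _ in range(3))

# Schur complement S = M22 - M21 M11^{-1} M12 - M23 M33^{-1} M32,
# built with solves rather than explicit inverses.
S = M22 - M21 @ np.linalg.solve(M11, M12) - M23 @ np.linalg.solve(M33, M32)
rhs = R2 - M21 @ np.linalg.solve(M11, R1) - M23 @ np.linalg.solve(M33, R3)
Y2 = np.linalg.solve(S, rhs)

# Check against solving the full 3n x 3n system.
Z = np.zeros((n, n))
M = np.block([[M11, M12, Z], [M21, M22, M23], [Z, M32, M33]])
R = np.concatenate([R1, R2, R3])
Y_full = np.linalg.solve(M, R)
print(np.allclose(Y2, Y_full[n:2 * n]))  # True
```

In the real problem each solve with $M_{11}$ and $M_{33}$ would itself exploit the block structure of those $3N \times 3N$ blocks, and a single LU factorization of each can be reused across all the right-hand sides.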