Is there a faster way to calculate a few diagonal elements of the inverse of a huge symmetric positive definite matrix?

Question

Is there a faster way to calculate a few diagonal elements of the inverse of a huge symmetric positive definite matrix?

5k Views Asked by Bumbble Comm At 21 Apr 2026 - 8:59

I asked this on SO first, but decided to move the math part of my question here.

Consider a $p \times p$ symmetric and positive definite matrix $\bf A$ (where $p=70000$, i.e., $\bf A$ is roughly 40 GB using 8-byte doubles). We want to calculate the first $3$ diagonal elements of the inverse matrix, $({\bf A}^{-1})_{11}$, $({\bf A}^{-1})_{22}$ and $({\bf A}^{-1})_{33}$.

I have found this paper by James R. Bunch who seems to solve this exact problem without calculating the full inverse $\bf A^{-1}$. If I understand it correctly he first calculates the Cholesky decomposition, i.e. the upper triangular matrix $\bf R$ which satisfies $\bf A=R^T R$, which needs $\frac16p^2+\frac12p^2-\frac23p$ floating point operations (multiplications/divisions) using the LINPACK function SPOFA. He then proceeds to calculate individual diagonal elements of the inverse $({\bf A^{-1}})_{ii}$ using an expression which exploits the sparsity of ${\bf R}^T{\bf y}={\bf e}_j$ and which requires $\frac12(p-i)^2+\frac52(p-i)+2$ floating point operations. (I don't understand the full details of this, so I can't currently sum it up correctly).

The paper is based on LINPACK; it isn't cited by anyone, so it seems nobody cared for the last 23 years? After reading this, I'm wondering whether this is still the best way of doing things, or whether a modern LAPACK-based approach could avoid the Cholesky decomposition?

In short, is there a quicker way to calculate those diagonal elements of the inverse?

Original Q&A

There are 4 best solutions below

Bumbble Comm On 14 Sep 2011 - 8:00

I suggest you look at the book Matrices, Moments, and Quadrature with Applications. The basic idea is that you have a $n \times n$ positive semi-definite matrix $A$, and you wish to compute the quadratic form $u^T f(A)u$, where $u \in \mathbb{R}^n$. In your case you wish to compute $$ A^{-1}_{ii} = e_i^T f(A)e_i, $$ where $e_i \in \mathbb{R}^n$ is a vector of all zeros, with a 1 in the $i$th position, and the function $f$ is defined as $f(\lambda) = \lambda^{-1}$.

You can turn the quadratic form into an Stieltjes integral, since $$ u^T f(A)u = u^T Q^T f(\Lambda) Qu = \alpha^T f(\Lambda) \alpha = \sum_{i=1}^n f(\lambda_i) \alpha_i^2 = \int_a^b f(\lambda) d\alpha(\lambda), $$ where the eigenvalue decomposition of $A$ is given by $A = Q\Lambda Q^T$ and $\Lambda$ is a diagonal matrix containing the eigenvalues of $A$, and the vector $\alpha = Qu$. The integral can be estimated using Gaussian quadrature via $$ I[f] = \int_a^b f(\lambda) d \alpha(\lambda) = \sum_{i=1}^N \omega_i f(t_i) + R[f], $$ where $\{\omega_i\}_{i=1}^N$ are a set of weights, and $\{t_i\}_{i=1}^N$ are a set of nodes at which to evaluate the function $f$, and $R[f]$ is a remainder or error term. The values of the $\omega$'s and $t$'s are unknown and must be solved for. The values for the weights may be computed via an iterative algorithm similar to the Lanczos algorithm for computing eigenvalues. The values of the nodes may be obtained from components of an eigenvector of a matrix derived from $A$. This computation may be done efficiently. For more details see the book, as well as this lecture on the topic by James Lambers.

The underlying mathematics and linear algebra may seem a little scary at first, but I assure you this leads to a fairly simple and efficient algorithm. I wrote Matlab code to calculate the weights and nodes for a class in graduate school. It wasn't very difficult. Take a look at the book. Good luck.

Bumbble Comm On 15 Sep 2011 - 12:54

The first part here below is not an "answer" but an extended comment/followup-question to talonmies' previous answer/comment

I've extended your notation a bit and got a clearer exposition: First let us extend the X-matrix to be the inverse of the full R-matrix, such that $\small \mathbf A^{-1}= \mathbf B =\mathbf X^\tau \mathbf X $ and let us also extend the indexes for X , then
$\qquad \small \mathbf X_{00} =\mathbf R_{00}^{-1} $, $\small \qquad \mathbf X_{11} =\mathbf R_{11}^{-1} \quad $ but unfortunately $\small \mathbf X_{10} \ne \mathbf R_{10}^{-1} $ .

By the inverse-relation $\small \mathbf X =\mathbf R^{-1} $ we have, that $\small \mathbf X_{10} \mathbf R_{00}+ \mathbf X_{11} \mathbf R_{10} =\mathbf 0 $ and thus $\small \mathbf X_{10} = - \mathbf X_{11} \mathbf R_{10} \mathbf R_{00}^{-1} $ (from which we want to receive the top-left diagonal elements from B by $\small \mathbf X^\tau_{00} \mathbf X_{00} + \mathbf X^\tau_{10} \mathbf X_{10} = \mathbf B_{00} $ )

Here $\small \mathbf X_{11} $ represents the part of the problem for which we want to reduce the computational effort, because this is the inversion of the (huge) remaining part of the cholesky-factor.I thought to introduce the concept of the pseudoinverse pinv(X) by which we could proceed:

$\small \begin{array} {rcl} \mathbf Y &=& - ( \mathbf X_{11} \mathbf R_{10} \mathbf R_{00}^{-1} )^{-1} \\ &=& - \mathbf R_{00} \operatorname{pinv} (\mathbf R_{10}) \mathbf R_{11} \\ \mathbf X_{10} &=& \operatorname{pinv} ( \mathbf Y) \end{array}$

This would allow to invert small matrices only, but although the pseudoinverse-operations works well in many cases I could not make it work correctly for the pinv(R10)-part; for all variants of my computations I always needed the full version R11 in this or that version. Do you see any general reason, why the pinv(R10) is not working sufficiently here?

A practical computation

Hmm, as far as we accept that we need the full cholesky-decomposition anyway, the process might even be described shorter:

a) consider the symmetric SPD-matrix A , we look for the diagonal-elements of the 3x3-submatrix $\small \mathbf B_{00} $ of $ \small \mathbf B = \mathbf A^{-1} $

b) perform the cholesky-decomposition from bottom up into the matrix R; denote $\small \mathbf R_{00} $ the top left 3x3 upper triangular submatrix (thinking in terms of a correlation-matrix A this represents the partial or "unexplained" variance/covariance)

c) invert $\small \mathbf R_{00} $ to get $\small \mathbf X_{00} $ (this is also cheap because it's already triangular)

d) in $ \small \mathbf B_{00} = \mathbf X_{00}^\tau \mathbf X_{00} $ we find the required diagonal elements. (This last operation can even be replaced by a simple summing of squares along the columns in X )

Here is a complete example usable for Pari/GP. To have it as clear as possible no optimizations, errorchecks etc are done:

Crop(M,dim) = matrix(dim,dim,r,c,M[r,c])  \\ reduces size of a matrix

 \\ procedure to show the first dim diagonal-entries of  A^-1
{invdiag(A,dim=3)=local(rs=rows(A),lV,R,X);

 \\ reduce A by a cholesky-process to a dim x dim-residual matrix A_00
 \\ the cholesky-process goes bottom-up
 forstep( d = rs, dim+1, -1, 
           lV = A[,d]/sqrt(A[d,d]);
           A = Crop( A - lV*lV~ , d-1 );
         );
  \\after this A is the dim x dim residual-matrix 

  \\ compute the cholesky-factor of that A-residual into matrix R
  \\ we just proceed working bottom up, but A is no more cropped now
  R = matrix(dim,dim); 
  forstep( d = dim, 1, -1, 
    R[,d] = A[,d]/sqrt(A[d,d]);
    A = A - R[,d] * R[,d]~;
    );

   X = R^-1; \\ inverse of R of size dim x dim
             \\   and to extract values of diag(A^-1) =diag( R^-1~ * R^-1 )
             \\ it suffices to sum the squares along the columns
   lv = vector(dim,c,sum(r=1,c, X[r,c]^2) );
   return(lv);
 }

\\ create a sample matrix A
A = matrix(8,8,r,c,binomial(r-1,c-1)/2^(c-1))
A = A* A~  \\ make it a symmetric positive definite one

print(invdiag(A,3)) \\ show the first 3 results by the partial cholesky-method
print(diag(A^-1))   \\ show the true results by inversion of the full matrix

\\ output:
 [21845.0000000, 980612.000000, 8259152.00000]
 [21845, 980612, 8259152, 21815360, 21017856, 7373824, 806912, 16384]~

Bumbble Comm On 02 Jan 2021 - 10:59

To get the first column of $A^{-1}$, solve $A \, X_1 \, = \, e_1 \, = \, [1 0 0 ...]$;
to get the first two, solve $\qquad \qquad A \, [X_1 X_2] \, = \, [e_1 e_2]$;
and you have $X_{11} \, X_{22} ...\ $ Or am I missing something ?

(There's a wide range of solvers for various sparse matrices, direct and iterative; they vary a lot in robustness, number of papers, etc. For benchmarking, sparsesolvers.py on gist.github is a uniform wrapper around a dozen or so solvers in scipy.)

**Bumbble Comm** · Accepted Answer

I suggested that a partial Cholesky decomposition should be possible here, but based on a comment I received, I started thinking about my answer again and realised that my suggestion was incorrect. Apologies if I lead anyone astray.

You will need to use a full Cholesky decomposition, but the problem to deduce the inverse can be reduced in scale to save redundant computation.

If your SPD matrix $\mathbf{A}$ is written in block form as

$\mathbf{A}=\begin{bmatrix} \mathbf{A}_{00} & \mathbf{A}_{10}^T \\ \mathbf{A}_{10} & \mathbf{A}_{11} \end{bmatrix}$

with $\mathbf{A}_{00}$ being a $3\times3$ SPD block, its inverse will be given by

$\mathbf{A}^{-1}=\begin{bmatrix} \mathbf{B}_{00} & \mathbf{B}_{10}^T \\ \mathbf{B}_{10} & \mathbf{B}_{11} \end{bmatrix}$

The equivalent Cholesky decomposition of $\mathbf{A}$ is then given by

$\mathbf{R}=\begin{bmatrix} \mathbf{R}_{00} & \\ \mathbf{R}_{10} & \mathbf{R}_{11} \end{bmatrix}$

The resulting matrix equation to solve for the inverse block matrix $\mathbf{B}_{00}$ can then be reduced to

$\begin{bmatrix} \mathbf{R}_{00} & \\ \mathbf{R}_{10} & \mathbf{R}_{11} \end{bmatrix} \begin{bmatrix} \mathbf{R}_{00}^T & \mathbf{R}_{10}^T\\ & \mathbf{R}_{11}^T \end{bmatrix} \begin{bmatrix} \mathbf{B}_{00} \\ \mathbf{B}_{10} \end{bmatrix}=\begin{bmatrix}\mathbf{I}_{00} \\ \mathbf{0} \end{bmatrix}$

and solved via

$\begin{bmatrix} \mathbf{R}_{00} & \\ \mathbf{R}_{10} & \mathbf{R}_{11} \end{bmatrix} \begin{bmatrix} \mathbf{X}_{0} \\ \mathbf{X}_{1} \end{bmatrix} =\begin{bmatrix}\mathbf{I}_{00} \\ \mathbf{0} \end{bmatrix}$

and

$\begin{bmatrix} \mathbf{R}_{00}^T & \mathbf{R}_{10}^T\\ & \mathbf{R}_{11}^T \end{bmatrix} \begin{bmatrix} \mathbf{B}_{00} \\ \mathbf{B}_{10} \end{bmatrix}=\begin{bmatrix}\mathbf{X}_{0} \\ \mathbf{X}_{1} \end{bmatrix}$

The solution you are looking for will be the diagonal entries of $\mathbf{B}_{00}$. The right hand side of this expression is only $n\times 3$, and this combined with lower triangular structure of $\mathbf{R}$ should be about the most efficient way to get the solution you require.

Is there a faster way to calculate a few diagonal elements of the inverse of a huge symmetric positive definite matrix?

There are 4 best solutions below

Related Questions in MATRICES

Related Questions in INVERSE

Related Questions in NUMERICAL-LINEAR-ALGEBRA

Related Questions in SYMMETRIC-MATRICES

Related Questions in POSITIVE-DEFINITE

Trending Questions

Popular # Hahtags

Popular Questions