Best Linear Solvers for Random Matrices

Consider a linear system $Ax=b$ where $A$ is a random matrix and $b$ a random vector, drawn from a joint probability distribution $p$. For the sake of simplicity (and this is the case I am mostly interested in), consider the case where $A_{ij}\overset{\text{iid}}{\sim}\mathcal N(0,1)$ and $b_k\overset{\text{iid}}{\sim}\mathcal N(0,1)$ for all $i, j, k$.

Question: What is the best iterative linear solver for this "stochastic" linear system? (or a ranking of known solvers)

Here, "best" is formalized in the following sense: an iterative solver produces a sequence of iterates $\{x_t\}_{t=1:T}$. Given a solver, let $c(t)$ denote the cost of computing the first $t$ iterations (e.g. in $\texttt{flops}$), and write $c_t = c(t)$. We then consider an error functional $\mathcal L(\{x_t\}, \{c_t\})$ that depends on the iterates and/or their cost.

$$ \operatorname*{arg\,min}_{\text{solver}} \mathbf E_{A, b\sim p}[\mathcal L(\underbrace{\text{solver}(A,b)}_{=(\{x_t\}, \{c_t\})})] $$

Question: Are there any results / literature that analytically or empirically tackle this problem, or variants thereof? In particular, what is the best solver for random Gaussian systems?

Some concrete examples of error functionals that may be of interest:

  • Cost of reaching a given error: $\mathcal L=\min \big\{ c(t)\mid \|Ax_t-b\|<\text{tol} \big\}$
  • Error at a given budget: $\mathcal L=\min\big\{ \|Ax_t-b\| \mid c(t) <\text{budget} \big\}$
  • Area under the curve ($\leadsto$ preference for rapidly decreasing error): $\mathcal L = \sum_{t=0}^T \|Ax_t-b\| = \|\{r_t\}\|_{\ell^1}$, where $r_t := Ax_t-b$ (possibly normalized by cost)
  • Sobolev norm ($\leadsto$ preference for non-oscillating error): $\mathcal L=\|r\|^2_{L^2} + \|\dot r\|^2_{L^2}$ (or a discretized version thereof)
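To make the functionals above concrete, here is a minimal numpy sketch of how one could measure them empirically for a single solver on a single random instance. The solver (plain Landweber iteration, i.e. gradient descent on $\|Ax-b\|^2$), the flop-count cost model, and the tolerance/budget values are all illustrative assumptions, not part of the question:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
A = rng.standard_normal((n, n))   # A_ij ~ iid N(0,1)
b = rng.standard_normal(n)        # b_k  ~ iid N(0,1)

# Illustrative solver: Landweber iteration (gradient descent on ||Ax - b||^2),
# chosen only because it is easy to instrument; any iterative solver fits here.
step = 1.0 / np.linalg.norm(A, 2) ** 2
x = np.zeros(n)
T = 2000
res = np.empty(T)
for t in range(T):
    r = A @ x - b
    res[t] = np.linalg.norm(r)    # residual norm ||A x_t - b||
    x -= step * (A.T @ r)

# Crude cost model: two matrix-vector products (~2n^2 flops each) per iteration.
c = 4 * n**2 * np.arange(1, T + 1)

tol, budget = 1e-2, 10**7
hit = np.flatnonzero(res < tol)
cost_to_tol = c[hit[0]] if hit.size else np.inf   # min{ c(t) : ||A x_t - b|| < tol }
err_at_budget = res[c < budget].min()             # min{ ||A x_t - b|| : c(t) < budget }
auc = res.sum()                                   # area under the residual curve
```

Averaging these quantities over many draws of $(A, b)$ then gives a Monte Carlo estimate of the expectation in the objective above, for one fixed solver.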

There is 1 solution below.

The study of classic algorithms applied to random matrices is gaining popularity. In my non-expert view, there seem to be two goals:

  1. Find the distribution of some quantity at some iteration. For instance, in your setup you may consider the distribution of the residual at step $t$.
  2. Prove universality: i.e., show that as long as the entries of the random matrix satisfy some basic moment conditions, the limiting behavior does not depend on the particular distribution they come from.
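To illustrate point 2, here is a rough numpy experiment of my own (not taken from the references below): run plain conjugate gradient on sample covariance matrices built from Gaussian versus Rademacher entries and compare iteration counts. The dimensions, aspect ratio, and tolerance are arbitrary choices:

```python
import numpy as np

def cg_iters(W, b, tol=1e-8, maxiter=1000):
    """Plain conjugate gradient; returns iterations until ||r|| < tol * ||b||."""
    x = np.zeros_like(b)
    r = b.copy()          # residual for the initial guess x = 0
    p = r.copy()
    rs = r @ r
    bnorm = np.linalg.norm(b)
    for k in range(1, maxiter + 1):
        Wp = W @ p
        alpha = rs / (p @ Wp)
        x += alpha * p
        r -= alpha * Wp
        rs_new = r @ r
        if np.sqrt(rs_new) < tol * bnorm:
            return k
        p = r + (rs_new / rs) * p
        rs = rs_new
    return maxiter

rng = np.random.default_rng(1)
n, m = 200, 400                      # aspect ratio n/m = 1/2: well-conditioned regime
b = rng.standard_normal(n)

iters = {}
for name, X in [
    ("gaussian", rng.standard_normal((n, m))),
    ("rademacher", rng.choice([-1.0, 1.0], size=(n, m))),
]:
    W = (X @ X.T) / m                # sample covariance (Wishart-type) matrix
    iters[name] = cg_iters(W, b)
# Universality heuristic: the two iteration counts should nearly coincide.
```

For moderate $n$ the two counts already agree to within a few iterations, which is the kind of statement the universality results below make precise in the large-$n$ limit.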

As in much of random matrix theory, one typically considers the limit where the size of the matrix goes to infinity. For the specific case of iterative linear solvers, Thomas Trogdon, Percy Deift, and their collaborators are a good starting point. Here are a few references to get started:

  1. Universality in numerical computations with random data: a collection of experiments of some standard algorithms on random matrices
  2. The conjugate gradient algorithm on well-conditioned Wishart matrices is almost deterministic: the number of iterations required by the conjugate gradient algorithm to reach a specified error converges to a known quantity. This known quantity essentially corresponds to interpolating $1/x$ at the roots of the orthogonal polynomials of the limiting eigenvalue density.
  3. Universality for the conjugate gradient and MINRES algorithms on sample covariance matrices
  4. A Probabilistic Analysis of the Neumann Series Iteration
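On reference 4, here is a quick sketch of what the Neumann series iteration looks like on a random instance. The scaling $1/(3\sqrt n)$, which keeps the spectral norm of $M$ comfortably below 1 with high probability (the norm of $G/\sqrt n$ concentrates near 2 for iid standard Gaussian entries), is my own illustrative choice:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300
# Scale so that ||M||_2 ~ 2/3 < 1, which makes the
# Neumann series for (I - M)^{-1} converge.
M = rng.standard_normal((n, n)) / (3 * np.sqrt(n))
b = rng.standard_normal(n)

# Neumann series iteration for (I - M) x = b:  x_{t+1} = b + M x_t,
# i.e. x_t is the partial sum  sum_{k <= t} M^k b.
x = np.zeros(n)
res = []
for t in range(80):
    x = b + M @ x
    res.append(np.linalg.norm(x - M @ x - b))   # residual ||(I - M) x - b||
```

The residual here is exactly $\|M^{t+1} b\|$, so it decays geometrically, with the rate governed by the spectrum of $M$; the probabilistic analysis in the reference studies precisely this kind of behavior.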

A lot of the results on CG/MINRES rely on tridiagonal matrix models for certain types of random matrices, cf. [DE02]. Another approach is to show that the algorithm depends only on the eigenvalue distribution, which is one of the main objects of study in random matrix theory.
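For a feel of these tridiagonal models, here is a sketch of the $\beta = 1$ (GOE) case of the Dumitriu–Edelman construction: a symmetric tridiagonal matrix with Gaussian diagonal and chi-distributed off-diagonal whose spectrum matches the GOE. Normalization conventions vary across references; the one assumed here puts the semicircle edge at $\pm 2$ after dividing the eigenvalues by $\sqrt n$:

```python
import numpy as np

def goe_tridiag(n, rng):
    """Tridiagonal model with GOE spectrum (beta = 1 Hermite ensemble):
    N(0, 2) diagonal, chi_{n-1}, ..., chi_1 off-diagonal."""
    diag = rng.normal(0.0, np.sqrt(2.0), size=n)
    off = np.sqrt(rng.chisquare(np.arange(n - 1, 0, -1)))
    return np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)

rng = np.random.default_rng(3)
n = 500
lam = np.linalg.eigvalsh(goe_tridiag(n, rng)) / np.sqrt(n)
# The empirical eigenvalue density approaches the semicircle law on [-2, 2].
# The tridiagonal model needs only O(n) storage (and admits fast
# tridiagonal eigensolvers), versus O(n^2) storage for a dense GOE draw.
```

This is the model that makes the CG/MINRES analyses tractable: the Lanczos/CG recurrences interact directly with the tridiagonal form.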