Consider a linear system $Ax=b$ where $A$ and $b$ are random matrices coming from a joint probability distribution $p$. For the sake of simplicity (and this is the case I am mostly interested in) consider the case where $A_{ij}\overset{\text{iid}}{\sim}\mathcal N(0,1)$ and $b_k\overset{\text{iid}}{\sim}\mathcal N(0,1)$ for all $i, j, k$.
Question: What is the best iterative linear solver for this "stochastic" linear system? (or a ranking of known solvers)
Here, the "best" is formalized in the following sense: an iterative solver will produce a sequence of iterates $\{x_t\}_{t=1:T}$. Given a solver, let $c(t)$ denote the cost associated with computing the first $t$ iterations (e.g. $\texttt{flops}$). We consider some error functional $\mathcal L(\{x_t\}, \{c_t\})$ that depends on the sequence and/or cost.
$$ \operatorname*{arg\,min}_{\text{solver}} \mathbf E_{A, b\sim p}[\mathcal L(\underbrace{\text{solver}(A,b)}_{=(\{x_t\}, \{c_t\})})] $$
Question: Are there any results / literature that analytically or empirically tackles this problem, or variants thereof? In particular,what is the best solver for random Gaussian systems?
Some concrete examples of error functionals that may be of interest:
- Cost of reaching a given error $\mathcal L=\min \big\{ c(t)\mid \|Ax_t-b\|<\text{tol.} \big\}$
- Error at a given budget: $\mathcal L=\min\big\{ \|Ax_t-b\| \mid c(t) <\text{budget} \big\} $
- Area under curve ($\leadsto$ preference for fast decreasing error) $\mathcal L = \sum_{t=0}^T \|Ax_t-b\| = \|\{r_t\}\|_{L^2}$ (possibly normalized by cost)
- Sobolev norm ($\leadsto$ preference for non-oscillating error) $\mathcal L=\|r\|^2_{L^2} + \|\dot r\|^2_{L^2}$ (resp. discretized version thereof)
The study of classic algorithms in random matrices is gaining popularity. In my non-expert view, there seem to be two goals:
Like a lot of random matrix theory, people tend to consider a limit where the size of the matrix goes to infinity. For the specific case of iterative linear solvers, Thomas Trogdon/Percy Deift and collaborators are a good starting point. Here's a few references to get started:
A lot of the results on CG/MINRES rely on the tridiagonal matrix models for certain types of random matrices cf. [DE02]. Another approach would be to show the algorithm only depends on the eigenvalue distribution which is one of the main topics of study for random matrix people.