I need to numerically compute the following quadratic forms for arbitrary $x\in\mathbb{R}^n$, preferably in Python:
- $x^T M^k x$, where $M\in\mathbb{R}^{n\times n}$ is a PSD matrix and $k$ can get quite large, possibly up to the order of hundreds. I would prefer $k$ to be a real number, but it is fine if it can only be an integer, if that makes a considerable difference in accuracy.
- Similarly, I am interested in $x^Te^{-tM}x$, where $t$ is a non-positive real value.

For case 1 I can either:
- Use scipy.linalg.fractional_matrix_power to compute $M^k$ and then evaluate $x^TM^kx$, or
- Use scipy.linalg.svd to compute the SVD of $M$ as $U\Lambda U^T$ (which coincides with the eigendecomposition since $M$ is PSD) and then evaluate $x^TU\Lambda^k U^Tx$.
- Finally, if $k$ is an integer, again based on the SVD I can compute $\|M^{k/2}x\|^2$.
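For concreteness, a minimal sketch of the eigendecomposition route for case 1 (using scipy.linalg.eigh rather than a general SVD, since $M$ is symmetric PSD; the function name is just illustrative):

```python
import numpy as np
from scipy.linalg import eigh

def quad_form_power(M, x, k):
    """Evaluate x^T M^k x via the eigendecomposition of the symmetric PSD M.

    Since M = U diag(w) U^T, we have x^T M^k x = sum_i w_i^k * (U^T x)_i^2,
    which works for any real k >= 0.
    """
    w, U = eigh(M)             # ascending eigenvalues, orthonormal U
    w = np.clip(w, 0.0, None)  # guard against tiny negative rounding noise
    y = U.T @ x                # coordinates of x in the eigenbasis
    return np.sum(w**k * y**2)
```

Note that this never forms $M^k$ explicitly; it only raises the $n$ eigenvalues to the $k$-th power, so large or fractional $k$ costs the same as $k=1$ after the one-time $O(n^3)$ decomposition.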
For case 2
- Again, I can use the off-the-shelf scipy.linalg.expm to compute the matrix exponential directly, or
- I can do the SVD of $M$ and then evaluate $x^TU e^{-t\Lambda} U^Tx$, i.e. exponentiate the (scaled) eigenvalues of $M$.
- Finally, since I am only interested in $x^T e^{-tM} x$, and not in $e^{-tM}$ itself, I can use the Taylor expansion $x^T e^{-tM} x\approx \sum_{i=0}^{l} \frac{(-t)^i}{i!}x^TM^ix$ for some $l$ that controls the precision, where each $x^TM^ix$ can be computed as in case 1.
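The eigendecomposition route for case 2 is a one-line variation on case 1 (again a sketch with an illustrative function name, assuming a dense symmetric PSD $M$):

```python
import numpy as np
from scipy.linalg import eigh

def quad_form_expm(M, x, t):
    """Evaluate x^T exp(-t M) x via the eigendecomposition M = U diag(w) U^T,
    so that x^T exp(-t M) x = sum_i exp(-t * w_i) * (U^T x)_i^2."""
    w, U = eigh(M)
    y = U.T @ x
    return np.sum(np.exp(-t * w) * y**2)
```

As in case 1, only scalar exponentials are taken, so this avoids the cost and conditioning questions of forming $e^{-tM}$ itself.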
Can anybody guide me on the most precise way to compute either of these expressions, hopefully up to machine precision? Are any of these methods good choices, or are there better solutions out there? I would also be happy with references.
P.S. Not knowing whether here or Stack Overflow is the better place for this question, I will be cross-posting it there with the same title and content.
One standard approach to computing matrix functions times a vector $f(M)x$, or quadratic forms $x^Tf(M)x$, when $M$ is symmetric is via the Lanczos algorithm. Lanczos computes an orthonormal basis $Q_k = [q_1, \ldots, q_k]$ for the Krylov subspace $\operatorname{span}(x, Mx, \ldots, M^{k-1}x)$ by a Gram-Schmidt-like procedure. This results in a factorization $MQ_k = Q_kT_k + \beta_k q_{k+1}e_k^T$, where $e_k$ is the $k$-th canonical basis vector and $T_k$ is a $k\times k$ symmetric tridiagonal matrix.
The "Lanczos approximation" to $f(M)x$ is defined as $Q_k f(T_k) Q_k^T x$, and the approximation to $x^Tf(M)x$ is defined as $x^TQ_k f(T_k) Q_k^T x$. There are software packages to compute $Q_k f(T_k) Q_k^T x$, and this can easily be turned into the Lanczos approximation to $x^Tf(M)x$ by taking the inner product with $x$. The Lanczos approximation to the quadratic form can be viewed as a certain Gaussian quadrature approximation to $x^T f(M) x$ and converges extremely quickly in $k$ if $f$ is analytic (see Golub and Meurant, "Matrices, Moments and Quadrature").
This approach has runtime $O(T_{mv}k + nk)$ (or $O(T_{mv}k + nk^2)$ with full reorthogonalization), where $T_{mv}$ is the cost of evaluating $v\mapsto Mv$. This will almost certainly be cheaper than an SVD or any $O(n^3)$ algorithm, since $k$ can typically be taken fairly small even for a machine-precision-accurate output.
Note that if $x$ is unit length, then $Q_k^Tx = e_1$. This means the Lanczos approximation to $x^Tf(M)x$ is given by $e_1^T f(T_k) e_1$, and that $Q_k$ doesn't need to be stored. There are also some subtleties about whether full reorthogonalization should be used or not. Either way the output will still be accurate, but without reorthogonalization $k$ may need to be larger. However, reorthogonalization is more computationally expensive, so there is some tradeoff.
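A minimal sketch of this (with full reorthogonalization, storing $Q_k$ for simplicity, and with an illustrative function name) might look like:

```python
import numpy as np

def lanczos_quad_form(M, x, f, k):
    """Approximate x^T f(M) x for symmetric M with k steps of Lanczos
    (full reorthogonalization).  Uses x^T f(M) x ~ ||x||^2 e_1^T f(T_k) e_1,
    evaluating f on the eigenvalues of the tridiagonal T_k."""
    n = x.size
    Q = np.zeros((n, k))
    alpha = np.zeros(k)   # diagonal of T_k
    beta = np.zeros(k)    # off-diagonal: beta[j] couples q_j and q_{j+1}
    Q[:, 0] = x / np.linalg.norm(x)
    for j in range(k):
        v = M @ Q[:, j]
        alpha[j] = Q[:, j] @ v
        # full reorthogonalization against all previous Lanczos vectors
        v -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ v)
        if j + 1 < k:
            beta[j] = np.linalg.norm(v)
            if beta[j] == 0:        # Krylov space is exhausted; stop early
                k = j + 1
                break
            Q[:, j + 1] = v / beta[j]
    T = np.diag(alpha[:k]) + np.diag(beta[:k - 1], 1) + np.diag(beta[:k - 1], -1)
    theta, S = np.linalg.eigh(T)    # eigenpairs of the small tridiagonal T_k
    # e_1^T f(T_k) e_1 = sum_i f(theta_i) * S[0, i]^2
    return (x @ x) * np.sum(f(theta) * S[0] ** 2)
```

In practice $M$ only enters through the products $M q_j$, so $M$ can be a sparse matrix or a black-box linear operator, and for $f(\lambda) = \lambda^k$ or $f(\lambda) = e^{-t\lambda}$ this covers both cases in the question.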