Conditioning of the monomial basis


It is well known that the Vandermonde matrix is ill-conditioned when all nodes, i.e., the points that generate it, are real. I read the following in a paper:

The ill-conditioning of the Vandermonde basis has an elementary explanation in cases where the points $x_j$ are unequal in size. If $|x_j|$ varies with $j$, then the powers $|x_j^k|$ vary exponentially. This means that function information associated with smaller values of $x_j$ will only be resolvable through exponentially large expansion coefficients, and accuracy will quickly be lost in floating point arithmetic.

I do not understand the explanation, because the terms "function information" and "expansion coefficients" are not defined in this context in the paper. Could somebody explain why the Vandermonde matrix is ill-conditioned?

BEST ANSWER

By "function information," I think they mean function evaluations $(y_j)_{0\leq j\leq J}$ and by "expansion coefficients" they mean $(c_i)_{0\leq i\leq k}$ below.

Here's the basic idea: let $(x_j)_{j\in\{0,1,\ldots,J\}}$ be your nodes, and for every $j\in\{0,1,\ldots, J\}$ let $y_j=f(x_j)$ be your function evaluations. Then expansion in the monomial basis, up to degree $k$, amounts to determining the coefficients $(c_0,\ldots,c_k)$ by solving the system

$$ \begin{pmatrix} 1 & x_0 & \cdots & x_0^k\\ 1 & x_1 & \cdots & x_1^k\\ 1 & x_2 & \cdots & x_2^k\\ \vdots & & \ddots & \\ 1 & x_J & \cdots & x_J^k\\ \end{pmatrix} \begin{pmatrix} c_0 \\ c_1 \\ c_2\\\vdots\\c_k \end{pmatrix} = \begin{pmatrix} y_0 \\ y_1 \\ y_2\\\vdots\\y_J \end{pmatrix}. \tag{*} $$
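
For concreteness, here is one way to set up and solve ($*$) in Octave; the nodes and the function $f$ below are made up for illustration and are not from the question.

nodes = [0.1; 0.5; 2; 10];   % hypothetical nodes x_0, ..., x_J
f = @(x) exp(x);             % hypothetical function f
y = f(nodes);                % function evaluations y_j = f(x_j)
k = numel(nodes) - 1;        % degree k chosen so the system is square
V = nodes .^ (0:k);          % V(i,j+1) = x_i^j, the matrix in (*)
c = V \ y                    % expansion coefficients c_0, ..., c_k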

The point the article is making is that if your nodes have drastically different magnitudes, then the matrix in ($*$) becomes horribly ill-conditioned, which (roughly speaking) means that a small change in the right-hand side can produce a drastically different solution. They claim that, due to this, your only hope (without changing your basis) is to increase $k$ in order to get better accuracy. But increasing $k$ only widens the spread of magnitudes among the matrix entries, so you suffer massive amounts of floating-point roundoff error; in general this is just a bad idea.

Here is a brief illustration using Octave:

nodes = [0.1; 0.5; 0.0001; 10; 100; 1000];   % nodes of widely varying magnitude
max_order = 10;
cond_list = zeros(1, max_order);
for order = 1:max_order
  % vander(nodes, order) has columns nodes.^(order-1), ..., nodes.^0
  cond_list(order) = cond(vander(nodes, order));
end

$$ \begin{matrix} \textrm{Order}\;(k) & \textrm{Condition number of the matrix in }(*)\\ 1 & 1.0000e+00\\ 2 & 4.5975e+02 \\ 3 & 5.1510e+05 \\ 4 & 5.9490e+08 \\ 5 & 2.9047e+12 \\ 6 & 3.8906e+16 \\ 7 & 1.1060e+21 \\ 8 & 1.5428e+24 \\ 9 & 2.5684e+24 \\ 10 & \texttt{Inf} \end{matrix} $$
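
To see what such a condition number means in practice, here is a sketch (my own addition, not from the original answer) that perturbs the right-hand side of ($*$) by a relative $10^{-10}$ and watches the solution move:

nodes = [0.1; 0.5; 0.0001; 10; 100; 1000];
V = vander(nodes, 6);                        % square case; cond ~ 3.9e16 per the table
y = sin(nodes);                              % some sample data (my own choice)
c = V \ y;
y_pert = y .* (1 + 1e-10 * randn(size(y)));  % tiny relative perturbation
c_pert = V \ y_pert;
rel_change = norm(c_pert - c) / norm(c)      % can easily be of order 1 or larger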

ANOTHER ANSWER

Vandermonde matrices are not necessarily ill-conditioned when the nodes are real.

Example: In the case of $n=2$ nodes, the Vandermonde matrix is $$A = \begin{bmatrix} 1 & x_0 \\ 1 & x_1 \end{bmatrix}.$$ If $x_0 = -1$ and $x_1 = 1$, then the columns are orthogonal and the 2-norm condition number of $A$ is $$\kappa_2(A) = \|A\|_2 \|A^{-1}\|_2 = 1,$$ the smallest possible value.
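
A quick sanity check of this example in Octave (my own addition):

A = [1, -1; 1, 1];   % Vandermonde matrix for nodes x_0 = -1, x_1 = 1
cond(A)              % returns 1, the smallest possible condition number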

In order to understand exactly why Vandermonde matrices can become arbitrarily ill-conditioned, we investigate the singular values. Let $$\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_n \ge 0$$ denote the singular values of $A$. Then $\{\sigma_i^2\}_{i=1}^n$ are the eigenvalues of $A^TA$ and $$\kappa_2(A) = \frac{\sigma_1}{\sigma_n}.$$ We can establish simple bounds on the singular values as follows. For a general symmetric matrix $B = [b_{ij}]$, the smallest eigenvalue satisfies $$ \lambda_{\min}(B) = \min \{ x^T B x \: : \: \|x\|_2 = 1 \} \leq b_{ii} $$ for all $i$, and the largest eigenvalue satisfies $$ \lambda_{\max}(B) = \max \{ x^T B x \: : \: \|x\|_2 = 1 \} \ge b_{ii} $$ for all $i$; both bounds follow from choosing $x = e_i$, the $i$th standard basis vector, in the Rayleigh quotient.
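
These Rayleigh-quotient bounds are easy to check numerically; the following Octave sketch (with arbitrarily chosen nodes) verifies that every diagonal entry of $B = A^TA$ lies between the extreme eigenvalues:

nodes = [0.5; 2; 3];          % arbitrary nodes
A = fliplr(vander(nodes));    % columns 1, x, x.^2 (ascending powers)
B = A' * A;
ev = eig(B);
[min(ev) <= min(diag(B)), max(diag(B)) <= max(ev)]   % both should be 1 (true)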

Example: In the case of $n=2$ nodes, we have $$ A^T A = \begin{bmatrix} 2 & x_0+x_1 \\ x_0 + x_1 & x_0^2 + x_1^2 \end{bmatrix}.$$ It follows that $$ \sigma_2^2 \leq 2, \quad x_0^2 + x_1^2 \leq \sigma_1^2.$$ We conclude that $$\kappa_2(A) \ge \sqrt{ \frac{x_0^2 + x_1^2}{2} }.$$

The case of $x_0 = -x_1 = x$ is particularly clear. In this case, the relevant linear system is $$ \begin{bmatrix} 1 & x \\ 1 & -x \end{bmatrix} \begin{bmatrix} c_0 \\ c_1 \end{bmatrix} = \begin{bmatrix} y_0 \\ y_1 \end{bmatrix}. $$ The solution is $$ \begin{bmatrix} c_0 \\ c_1 \end{bmatrix} = \frac{1}{2x} \begin{bmatrix} x & x \\ 1 & -1\end{bmatrix} \begin{bmatrix} y_0 \\ y_1 \end{bmatrix} = \begin{bmatrix} \frac{y_0+y_1}{2} \\ \frac{y_0-y_1}{2x} \end{bmatrix}.$$ Here we note that while $c_0 = \frac{y_0+y_1}{2}$ is insensitive to small relative changes in the right-hand side when $y_0$ and $y_1$ are nearly equal, the divided difference $$c_1 = \frac{y_0 - y_1}{2x}$$ is extremely sensitive to small relative changes in the right-hand side in that same situation. Specifically, a tiny relative change in the data can flip the sign of $c_1$. The situation is reversed when $y_0 + y_1 \approx 0$. In both cases, we see that a small componentwise relative change in the right-hand side $y$ can cause a large componentwise relative change in the solution $c$.
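
The sign flip is easy to reproduce; here is a small Octave sketch with made-up numbers:

x  = 1;                        % nodes are x_0 = +1 and x_1 = -1
y0 = 1.0;
y1 = 1.0 + 1e-15;              % nearly equal data
c1 = (y0 - y1) / (2*x)         % roughly -5e-16
y1 = 1.0 - 1e-15;              % relative change of about 2e-15 in y1
c1 = (y0 - y1) / (2*x)         % roughly +5e-16: the sign has flipped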


The analysis presented here extends to the general case of $n>2$: the $(1,1)$ entry of $A^TA$ equals $n$ and the $(n,n)$ entry equals $\sum_{j=0}^{n-1} (x_j^{n-1})^2$, so $$\sigma_n^2 \leq n, \quad \sum_{j=0}^{n-1} (x_j^{n-1})^2 \leq \sigma_1^2$$ and $$ \kappa_2(A) \ge \sqrt{\frac{\sum_{j=0}^{n-1} (x_j^{n-1})^2}{n}}.$$ In particular, a single node with $|x_j| \gg 1$ forces the condition number to grow exponentially with $n$.
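
Finally, a sketch comparing this lower bound with the actual condition number (the nodes are again made up):

nodes = [0.1; 0.5; 10; 100];
n = numel(nodes);
A = fliplr(vander(nodes));                       % columns 1, x, ..., x.^(n-1)
lower_bound = sqrt(sum(nodes .^ (2*(n-1))) / n)  % approx 5e5
kappa = cond(A)                                  % at least as large as the bound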