Let $$A=\begin{bmatrix}1 & 1 & 1\\ \epsilon & 0 & 0 \\ 0\ & \epsilon & 0 \\ 0 & 0 & \epsilon \end{bmatrix}.$$ On this page, this matrix $A$ is used to show the instability of the classical Gram-Schmidt algorithm, using the criterion that $1+\epsilon =1$. Furthermore, it can be shown that the output vectors from classical GS for $A$ are not orthogonal to each other.
It seems that many websites briefly seem to only talk about the algorithm's shortcomings when running it on a computer. Is there any more "general" reasoning as to why the classical GS algorithm doesn't always produce orthonormal vectors, even "on paper"?
Is it because classical GS (in this case) does not account well for the approximation $\epsilon+1=1$? Would someone be able to explain this a little more in depth?
Thanks
Classical and Modified Gram Schmidt are both unstable. If you read the text by Trefethen he described the difference between Householder and the first two as the following.
This is Classical and Modified Gram-Schmidt, described Triangular Orthogonalization $A \underbrace{R_{1} , R_{2} \cdots R_{n}}_{\hat{R}^{-1}} = \hat{Q} \tag{1}$
Below we see Householder, Orthogonal Triangularization
$ \underbrace{Q_{1} , Q_{2} \cdots Q_{n}}_{\hat{Q}^{*}}A = R \tag{2}$
Why are these different?
The condition number of a triangular matrix can be anything so if you have a series of them then it can grow very large however orthogonal matrices have condition number $1$.
By changing the $\epsilon$ you change the condition number. If you actually realize $\epsilon$ is related to the singular values. The first one is nearly $1$.
s
Due to orthogonalization it appears to be $\sqrt{3}$
Note that
$$ \kappa(A) = \frac{\sigma_{max}(A)}{\sigma_{min}(A)} = \frac{\sqrt{3}}{\epsilon} \tag{3}$$
Then you'd note that as $\epsilon \to 0$ $\kappa \to \infty$
Classical Gram Schmidt
The process of Gram Schmidt is the following for classical
$$ v_{j} = a_{j} - (q_{1}^{*}a_{j})q_{1} -(q_{2}^{*}a_{j})q_{2} - \cdots - (q_{j-1}^{*}a_{j})q_{j-1} \tag{3} $$
we can write this like this
$$ q_{1} = \frac{a_{1}}{r_{11}} \tag{4} $$
$$ q_{2} = \frac{a_{2} - r_{12}q_{1}}{r_{22}} \tag{5} $$
$$ q_{3} = \frac{ a_{3} - r_{13} q_{1}- r_{23}q_{2} }{r_{33}} \tag{6} $$ $$ q_{n} = \frac{a_{n} - \sum_{i=1}^{n-1} r_{in} q_{i} }{r_{nn} } \tag{7} $$
Now here is Modified Gram Schmidt. To begin we introduce orthogonal projections
Modified Gram Schmidt
$$ q_{1} = \frac{P_{1}a_{1}}{\| P_{1}a_{1}\|}, q_{2} = \frac{P_{2}a_{2}}{\| P_{2}a_{2}\|}, \cdots , q_{n} = \frac{P_{n}a_{n}}{\| P_{n}a_{n}\|} \tag{8}$$
More specifically $P_{j}$ is the an orthogonal projector. $P_{j}$ is the $m \times m$ matrix of rank $m -(j-1)$ that projects $\mathbb{C}^{m}$ onto the space to $\langle q_{1}, \cdots , q_{j-1} \rangle $
The projector $P_{j}$ can be represented explicitly. Here we represent $\hat{Q}_{j-1}$ as the $m \times (j-1)$ matrix containing the columns of the orhtogonal projection. I.e
$$ P_{j} = I - \hat{Q}_{j-1}\hat{Q}_{j-1}^{*} \tag{9}$$
then we get
$$ v_{j} = P_{j}a_{j} \tag{10} $$
So how is this more stable?
One more note
Your matrix is famous. It is called the Lauchli matrix
In Both CGS and MGS
where $ 1 + \epsilon^{2} =1$
$$v_{1} \to (1 , \epsilon, 0, 0) \tag{11} $$
$$ r_{11} = \sqrt{1 + \epsilon^{2} } \approx 1 \tag{12} $$
$$ q_{1} = \frac{v_{1}}{r_{11}} = (1 , \epsilon, 0, 0)\tag{13} $$ $$ v_{2} = (1,0,\epsilon,0) \tag{14} $$ $$ r_{12} = q_{1}^{T}a_{2} = q_{1}^{T}v_{2} = 1 \tag{15} $$ $$ v_{2} = v_{2} - r_{12}q_{1} = (0,-\epsilon, \epsilon,0) \tag{16} $$ $$ r_{22} = \sqrt{2}\epsilon \tag{17} $$ $$ q_{2} = (0,\frac{-1}{\sqrt{2}},\frac{1}{\sqrt{2}},0) \tag{18} $$
$$ v_{3} = (1,0,0,\epsilon) \tag{19} $$ $$ r_{13} = q_{1}^{t}v_{3} = 1 \tag{20} $$ $$ v_{3} = v_{3} - r_{13}q_{1} = (0,-\epsilon,0,\epsilon) \tag{21} $$
For CGS
$$ r_{23} = q_{2}^{T}a_{3} =0 \tag{22} $$ $$ v_{3} = v_{3} - r_{23}q_{2} = (0,-\epsilon,0,\epsilon) \tag{23} $$
$$ r_{33} = \sqrt{2} \epsilon \tag{24} $$ $$ q_{3} = \frac{v_{3}}{r_{33}} = (0,\frac{-1}{\sqrt{2}} ,0\frac{1}{\sqrt{2}} ) \tag{25} $$
For MGS
$$ r_{23} = q_{2}^{T}v_{3} =\frac{\epsilon}{\sqrt{2}} \tag{26} $$ $$ v_{3} = v_{3} - r_{23}q_{2} = (0,\frac{-\epsilon}{2},\frac{-\epsilon}{2}, \epsilon ) \tag{27} $$
$$ r_{33} = \frac{\sqrt{6}}{\epsilon 2} \tag{28} $$ $$ q_{3} = \frac{v_{3}}{r_{33}} = (0,\frac{-1}{\sqrt{6}} ,\frac{1}{\sqrt{6}},\frac{2}{\sqrt{6}} ) \tag{29} $$