I am trying to implement the QR algorithm. Despite its simplicity, I have been stuck for a long time because my results do not match the Python reference (np.linalg.eig).
I have a matrix. For example, this one:
mat = np.array([[ 1.00003037e+00, -7.17939231e-05,  4.24546081e-05],
                [-7.17939231e-05,  1.00004127e+00,  5.89296817e-05],
                [ 4.24546081e-05,  5.89296817e-05,  1.00004161e+00]])
I want to obtain the eigenvalues and eigenvectors of mat.T @ mat (without using numpy or pre-existing libraries). As the first, simplest (or so I thought...) step, I want to try the QR algorithm:
MAX_N_ITERATIONS = 10_000  # just an example
a = mat.T @ mat
vectors = np.eye(3)
for k in range(MAX_N_ITERATIONS):
    q, r = qr(a)  # my own implementation of QR
    a = r @ q
    # for a symmetric matrix, a approaches a diagonal matrix,
    # so the eigenvectors are the columns of the accumulated q product
    vectors = vectors @ q
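For context, my qr has the shape of a standard Householder-reflection QR. The sketch below is just to illustrate the signature the loop above assumes; it is not necessarily identical to my actual code:

```python
import numpy as np

def qr(a):
    """Householder QR: returns q, r with a = q @ r, q orthogonal,
    r upper triangular."""
    m, n = a.shape
    r = a.astype(float).copy()
    q = np.eye(m)
    for j in range(min(m, n)):
        x = r[j:, j]
        # Pick the sign of alpha opposite to x[0] to avoid cancellation.
        alpha = -np.copysign(np.linalg.norm(x), x[0])
        v = x.copy()
        v[0] -= alpha
        v_norm = np.linalg.norm(v)
        if v_norm == 0.0:  # column is already zero below the diagonal
            continue
        v /= v_norm
        # Apply the reflector H = I - 2 v v^T from the left to r
        # and from the right to q.
        r[j:, :] -= 2.0 * np.outer(v, v @ r[j:, :])
        q[:, j:] -= 2.0 * np.outer(q[:, j:] @ v, v)
    return q, r
```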
The problem is, the more iterations I run, the further I diverge from the reference. For a single iteration, i.e. a single QR decomposition, my implementation agrees with the reference to within about 1e-5 or 1e-6. Of course, over thousands of iterations these errors stack up, and the resulting eigenvectors end up far from the reference.
This is how the np.linalg.norm of the error grows as the iteration number increases. Here, the error is the difference between the eigenvectors computed with my QR implementation and with the reference Python (LAPACK) implementation. As you can see, there is no single big spike that would suggest some specific bug or wrong calculation; it is just an accumulation of errors. Note that in both cases (reference and my implementation) I am using double precision.
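For concreteness, a sketch of how such an error can be measured (the helper name eigvec_error is mine; note that the column signs should be aligned first, since eigenvectors are only defined up to sign):

```python
import numpy as np

def eigvec_error(V, V_ref):
    # Eigenvectors of a real symmetric matrix are only defined up to
    # sign, so flip each column of V to best match V_ref before
    # taking the norm of the difference.
    signs = np.sign(np.sum(V * V_ref, axis=0))
    signs[signs == 0] = 1.0
    return np.linalg.norm(V * signs - V_ref)
```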
So, my questions are:
- What is the actual precision of LAPACK's QR? Is a tolerance of 1e-5..1e-7 for the Q/R factors, compared with the reference implementation, good for QR? Or does it already indicate some floating-point problem?
- Is this expected for this algorithm? Is it too simple to achieve high precision? I thought it was just considered very slow. Is error accumulation to such a significant degree expected? Do I have to implement the shifted version, whose supposedly much faster convergence would also reduce the number of steps over which the errors accumulate?
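For reference, this is roughly what I understand the shifted variant to look like (a sketch with a Wilkinson shift and deflation; the function name is mine, and I use np.linalg.qr here only as a stand-in for my own qr):

```python
import numpy as np

def qr_algorithm_shifted(a, tol=1e-12, max_iter=10_000):
    """QR algorithm with Wilkinson shift and deflation for a real
    symmetric matrix. Returns (eigenvalues, eigenvectors)."""
    a = np.array(a, dtype=float)
    n = a.shape[0]
    vectors = np.eye(n)
    m = n  # size of the still-active leading block a[:m, :m]
    for _ in range(max_iter):
        if m <= 1:
            break
        # Deflate: once the last row of the active block is negligible,
        # a[m-1, m-1] is a converged eigenvalue.
        if np.linalg.norm(a[m - 1, :m - 1]) < tol * np.linalg.norm(np.diag(a)[:m]):
            m -= 1
            continue
        # Wilkinson shift: the eigenvalue of the trailing 2x2 block
        # closest to a[m-1, m-1].
        d = (a[m - 2, m - 2] - a[m - 1, m - 1]) / 2.0
        b = a[m - 1, m - 2]
        s = 1.0 if d >= 0 else -1.0
        denom = abs(d) + np.hypot(d, b)
        mu = a[m - 1, m - 1] - (s * b * b / denom if denom > 0 else 0.0)
        # One shifted QR step on the active block only.
        eye = np.eye(m)
        q, r = np.linalg.qr(a[:m, :m] - mu * eye)
        a[:m, :m] = r @ q + mu * eye
        vectors[:, :m] = vectors[:, :m] @ q
    return np.diag(a).copy(), vectors
```

The shift makes the iteration converge cubically near an eigenvalue, and deflation shrinks the working block after each convergence, so far fewer floating-point operations touch each eigenpair.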
