I found a simple algorithm for the simultaneous diagonalization of two commuting matrices (https://doi.org/10.48550/arXiv.2006.16364), which seemed to be well-founded. For commuting matrices $\mathbf{A}$ and $\mathbf{B}$, the algorithm first forms the Jordan decomposition
$\mathbf{A} = \mathbf{S}_\mathbf{A} \mathbf{D}_\mathbf{A} \mathbf{S}^{-1}_\mathbf{A}$,
then constructs
$\mathbf{T} = \mathbf{S}^{-1}_\mathbf{A} \mathbf{B} \mathbf{S}_\mathbf{A}$
and its Jordan decomposition,
$\mathbf{T} = \mathbf{S}_\mathbf{T} \mathbf{D}_\mathbf{T} \mathbf{S}^{-1}_\mathbf{T}$.
The arXiv paper states that $\mathbf{U} = \mathbf{S}_\mathbf{A} \mathbf{S}_\mathbf{T}$ satisfies
$\mathbf{U}^{-1} \mathbf{A} \mathbf{U} = \mathbf{D}_\mathbf{A} $
and
$\mathbf{U}^{-1} \mathbf{B} \mathbf{U} = \mathbf{D}_\mathbf{B} $,
which is exactly what simultaneous diagonalization requires.
However, for the specific matrices $\mathbf{A}$ and $\mathbf{B}$ given below, I cannot retrieve $\mathbf{D}_\mathbf{A}$ from $\mathbf{U}^{-1} \mathbf{A} \mathbf{U}$. Could you please help me find what goes wrong with this algorithm? My Mathematica implementation is the following:
A = {{-0.488772, 0.572568, 1.00474, 0.576474, -0.883678}, {0.572568,
0.95183, -0.822118, 0.171161, -1.65205}, {1.00474, -0.822118,
0.307811, 1.62311, 1.51333}, {0.576474, 0.171161,
1.62311, -0.191142, -0.619896}, {-0.883678, -1.65205,
1.51333, -0.619896, -0.579727}};
B = {{1.99737, -0.417634, 0.510778, 0.174017, -1.21089}, {-0.417634,
2.79806, 0.0333479, 0.278912, -0.402774}, {0.510778, 0.0333479,
1.58929, 1.23511, 1.26844}, {0.174017, 0.278912, 1.23511,
1.44669, -0.539531}, {-1.21089, -0.402774, 1.26844, -0.539531,
1.16859}};
{SA, DA} = JordanDecomposition[A];
T = Inverse[SA] . B . SA;
{ST, DT} = JordanDecomposition[T];
U = SA . ST;
Chop[N[FullSimplify[Inverse[U] . A . U], 5] // MatrixForm]
The point where you deviate from the paper is when you compute an arbitrary Jordan decomposition of $T$ using JordanDecomposition[T]. (You gloss over this in the text of the question by referring to “its Jordan decomposition” as if this were unique.) You leave it to Mathematica to choose the order of the blocks in the Jordan decomposition of $T$. The paper, by contrast, in Equation $(12)$, constructs a Jordan decomposition of $T$ by building it from Jordan decompositions of the blocks of $T$, thus ensuring that the resulting blocks are consistent with the blocks in the Jordan decomposition of $A$.
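To see why the choice matters, it may help to write out the two products using only the definitions from the question. This is a sketch under the assumption that $\mathbf{A}$ and $\mathbf{T}$ are diagonalizable, so that $\mathbf{D}_\mathbf{A}$ and $\mathbf{D}_\mathbf{T}$ are diagonal. The block labels $\lambda_i$, $m_i$, $\mathbf{T}_i$ and $\mathbf{S}_i$ are introduced here for illustration and need not match the paper's notation.

$$
\begin{aligned}
\mathbf{U}^{-1} \mathbf{B} \mathbf{U} &= \mathbf{S}_\mathbf{T}^{-1} \left( \mathbf{S}_\mathbf{A}^{-1} \mathbf{B} \mathbf{S}_\mathbf{A} \right) \mathbf{S}_\mathbf{T} = \mathbf{S}_\mathbf{T}^{-1} \mathbf{T} \mathbf{S}_\mathbf{T} = \mathbf{D}_\mathbf{T}, \\
\mathbf{U}^{-1} \mathbf{A} \mathbf{U} &= \mathbf{S}_\mathbf{T}^{-1} \left( \mathbf{S}_\mathbf{A}^{-1} \mathbf{A} \mathbf{S}_\mathbf{A} \right) \mathbf{S}_\mathbf{T} = \mathbf{S}_\mathbf{T}^{-1} \mathbf{D}_\mathbf{A} \mathbf{S}_\mathbf{T}.
\end{aligned}
$$

So $\mathbf{B}$ is diagonalized for any valid $\mathbf{S}_\mathbf{T}$ (the question's $\mathbf{D}_\mathbf{B}$ is then $\mathbf{D}_\mathbf{T}$), but $\mathbf{U}^{-1} \mathbf{A} \mathbf{U} = \mathbf{D}_\mathbf{A}$ holds only if $\mathbf{S}_\mathbf{T}$ commutes with $\mathbf{D}_\mathbf{A}$. Suppose equal eigenvalues are adjacent in $\mathbf{D}_\mathbf{A}$, with distinct values $\lambda_1, \dots, \lambda_k$ of multiplicities $m_1, \dots, m_k$. Then $\mathbf{A}\mathbf{B} = \mathbf{B}\mathbf{A}$ gives $\mathbf{D}_\mathbf{A} \mathbf{T} = \mathbf{T} \mathbf{D}_\mathbf{A}$, which forces $\mathbf{T}$ to be block diagonal:

$$
\mathbf{D}_\mathbf{A} = \operatorname{diag}\left( \lambda_1 \mathbf{I}_{m_1}, \dots, \lambda_k \mathbf{I}_{m_k} \right), \qquad
(\lambda_i - \lambda_j)\, \mathbf{T}_{ij} = 0 \;\Longrightarrow\; \mathbf{T} = \operatorname{diag}\left( \mathbf{T}_1, \dots, \mathbf{T}_k \right).
$$

Choosing $\mathbf{S}_\mathbf{T} = \operatorname{diag}(\mathbf{S}_1, \dots, \mathbf{S}_k)$, where each $\mathbf{S}_i$ diagonalizes the block $\mathbf{T}_i$, yields a matrix that commutes with $\mathbf{D}_\mathbf{A}$, and the claimed identity follows. A decomposition of the full $\mathbf{T}$ computed without regard to this block structure may order or mix eigenvectors across blocks, in which case $\mathbf{S}_\mathbf{T}^{-1} \mathbf{D}_\mathbf{A} \mathbf{S}_\mathbf{T}$ is a different matrix, for example $\mathbf{D}_\mathbf{A}$ with its diagonal entries permuted.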