Solve $A^{-1} B A = C$ for orthogonal matrices


I need to solve $A^{-1} B A = C$ for $A$. All matrices are affine 3D transforms (represented as homogeneous matrices) with an orthogonal linear part. That is, they are $4 \times 4$ matrices whose last row is $(0, 0, 0, 1)$ and whose upper-left $3 \times 3$ block is orthogonal. In case there is no orthogonal matrix $A$ that satisfies this equation, I want the orthogonal matrix that yields the smallest error $\|A^{-1} B A - C\|^2$.

I tried to solve this without the orthogonality constraint and to enforce it afterwards. For that, I solved a linear system (similar to this answer). Then I projected the linear part of the solution back onto the space of orthogonal matrices via $M' = UV^T$, where $U$ and $V$ are the corresponding factors of the singular value decomposition $M = U \Sigma V^T$ of the solution.
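For reference, the projection step $M' = UV^T$ can be sketched in NumPy as follows (function name mine). Note that plain $UV^T$ is only guaranteed to be orthogonal; a sign fix is included as an extra assumption in case a proper rotation (determinant $+1$) is wanted:

```python
import numpy as np

def project_to_rotation(M):
    """Nearest rotation to a 3x3 matrix M in the Frobenius norm.

    With M = U S V^T, the orthogonal Procrustes solution is U V^T; the
    extra sign factor keeps the result in SO(3) (det = +1) instead of
    allowing a reflection.
    """
    U, _, Vt = np.linalg.svd(M)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])
    return U @ D @ Vt
```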

However, the solution I get is far from what I want. The intermediate solution $M$ does fulfill the initial equation, but it is not even close to orthogonal. In particular, its translation part is all zeros, which does not make sense.

So basically, I am looking for a way to solve the equation that exploits the orthogonality of all involved matrices. I know that I could represent the matrices differently (e.g. as Euler angles plus a translation vector) and solve for those parameters, but then the problem becomes non-linear and cannot be solved directly.

Btw, I will have multiple of these equations in the form

$$ A^{-1}B_1 A=C_1\\ A^{-1}B_2 A=C_2\\ A^{-1}B_3 A=C_3\\ ... $$ And I want to solve these in a least-squares sense. But this will probably be easy once I have a reliable way to solve a single equation.

Update

Here is an example:

$$ B= \begin{bmatrix} 0.975509 & -0.0993911 & -0.196226 & -85.0091 \\ 0.0976203 & 0.995048 & -0.0187001 & 34.7605 \\ 0.197113 & -0.000913531 & 0.98038 & 93.3659 \\ 0 & 0 & 0 & 1 \end{bmatrix}\\ C= \begin{bmatrix} 0.973421 & 0.0742983 & -0.216638 & -6.63937 \\ -0.0674149 & 0.996963 & 0.0389824 & -21.1132 \\ 0.218876 & 0.0233417 & 0.975473 & 49.6126\\ 0 & 0 & 0 & 1 \end{bmatrix} $$ I know that the solution must be close to the following matrix: $$ \begin{bmatrix} 0.871881 & 0.0524241 & -0.486903 & -258.214\\ 0.328096 & 0.67559 & 0.660251 & -201.229 \\ 0.36356 & -0.735412 & 0.571834 & -143.373 \\ 0 & 0 & 0 & 1 \end{bmatrix} $$ (This is actually the matrix that I want to correct slightly in my application.)

To solve for $A$ with the notation of the Kronecker product, I got the following system:

$$ \begin{bmatrix} -0.0020879999999999788 & 0.0993911 & 0.196226 & -0.0674149 & 0. & 0. & 0.218876 & 0. & 0. & 0. & 0. & 0.\\-0.0976203 & -0.021627000000000063 & 0.0187001 & 0. & -0.0674149 & 0. & 0. & 0.218876 & 0. & 0. & 0. & 0.\\-0.197113 & 0.000913531 & -0.0069590000000000485 & 0. & 0. & -0.0674149 & 0. & 0. & 0.218876 & 0. & 0. & 0.\\0.0742983 & 0. & 0. & 0.021454000000000084 & 0.0993911 & 0.196226 & 0.0233417 & 0. & 0. & 0. & 0. & 0.\\0. & 0.0742983 & 0. & -0.0976203 & 0.001915 & 0.0187001 & 0. & 0.0233417 & 0. & 0. & 0. & 0.\\0. & 0. & 0.0742983 & -0.197113 & 0.000913531 & 0.016583000000000014 & 0. & 0. & 0.0233417 & 0. & 0. & 0.\\-0.216638 & 0. & 0. & 0.0389824 & 0. & 0. & -0.00003599999999992498 & 0.0993911 & 0.196226 & 0. & 0. & 0.\\0. & -0.216638 & 0. & 0. & 0.0389824 & 0. & -0.0976203 & -0.01957500000000001 & 0.0187001 & 0. & 0. & 0.\\0. & 0. & -0.216638 & 0. & 0. & 0.0389824 & -0.197113 & 0.000913531 & -0.004906999999999995 & 0. & 0. & 0.\\-6.63937 & 0. & 0. & -21.1132 & 0. & 0. & 49.6126 & 0. & 0. & 0.02449100000000004 & 0.0993911 & 0.196226\\0. & -6.63937 & 0. & 0. & -21.1132 & 0. & 0. & 49.6126 & 0. & -0.0976203 & 0.0049519999999999564 & 0.0187001\\0. & 0. & -6.63937 & 0. & 0. & -21.1132 & 0. & 0. & 49.6126 & -0.197113 & 0.000913531 & 0.01961999999999997 \end{bmatrix} vec(A_{12})= \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ -85.0091 \\ 34.7605 \\ 93.3659 \end{bmatrix} $$ $vec(A_{12})$ is the vector of the entries of $A$ (columns stacked on top of each other) with the last row removed.
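For reference, such a system can be assembled mechanically from the identity $\operatorname{vec}(AC - BA) = (C^T \otimes I_4 - I_4 \otimes B)\operatorname{vec}(A)$ (column-stacking convention). A minimal NumPy sketch, with a hypothetical function name; it pins the last row of $A$ with weighted soft-constraint rows rather than eliminating those four variables, which is just one possible choice:

```python
import numpy as np

def conjugation_lstsq(B, C, w=1e3):
    """Least-squares solve A C = B A for a 4x4 affine A whose last row is
    (0, 0, 0, 1), without any orthogonality constraint.

    Uses vec(AC - BA) = (C^T kron I - I kron B) vec(A) with column
    stacking; the last row of A is pinned by weighted soft-constraint
    rows (weight w) instead of eliminating those variables.
    """
    I4 = np.eye(4)
    K = np.kron(C.T, I4) - np.kron(I4, B)           # 16 x 16; want K @ vec(A) = 0
    rows, rhs = [], []
    for j, val in enumerate([0.0, 0.0, 0.0, 1.0]):  # pin the entries A[3, j]
        r = np.zeros(16)
        r[4 * j + 3] = w                            # vec index of A[3, j]
        rows.append(r)
        rhs.append(w * val)
    M = np.vstack([K, np.array(rows)])
    b = np.concatenate([np.zeros(16), np.array(rhs)])
    vecA, *_ = np.linalg.lstsq(M, b, rcond=None)
    return vecA.reshape(4, 4, order="F")            # undo column stacking
```

Multiple pairs $(B_i, C_i)$ stack as additional blocks of rows. As the question notes, though, the minimum-norm solution of this unconstrained system need not be anywhere near orthogonal.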

However, this system has no solution - probably due to numerical issues. The least-squares solution is:

$$ \begin{bmatrix} 2.38109*10^{-8} & -3.49507*10^{-9} & 8.80459*10^{-9} & 3.08121*10^6 \\ -6.91407*10^{-9} & -2.75114*10^{-8} & 2.16676*10^{-10} & -6.8149*10^7 \\ -7.46204*10^{-9} & 5.65462*10^{-9} & 2.58108*10^{-8} & 3.41334*10^7 \\ 0 & 0 & 0 & 1 \end{bmatrix} $$ This is obviously not even close to being orthogonal.

Accepted answer

Solve the linear equations $A C = BA$ together with $A_{41} = A_{42} = A_{43} = 0$ and $A_{44} = 1$. You may get a unique solution (in which case you check that the top-left $3 \times 3$ block is orthogonal), or a parametric solution with some free variables (in which case you determine which parameter values make the top-left $3 \times 3$ block orthogonal). The latter requires solving some quadratic equations in the parameters: easy if there is just one parameter, more complicated if there are several.

EDIT: In your example, using Maple's LSSolve function with your given approximate solution as the initial point, the best least-squares solution I get for the linear equations, using the quadratics as constraints, is

$$ \begin{bmatrix} 0.922106169050555202 & 0.0259062112679868568 & -0.386068752889484001 & -259.015722238283104 \\ 0.240075198859380001 & 0.744181318602853081 & 0.623344258082939584 & -197.239155098949084 \\ 0.303453641653002204 & -0.667475118124205569 & 0.679994746043087850 & -169.389845060954997 \\ 0 & 0 & 0 & 1 \end{bmatrix} $$

It makes $$AC - BA = \begin{bmatrix} -0.00476617429607195664 & 0.00304423895353944651 & -0.00335193095893870918 & -0.00000404358121386394487 \\ -0.00326758507934044040 & 0.0188013774691542235 & 0.0152027162800097981 & 0.0000894215323796743178 \\ 0.0101807673199222348 & 0.0229229648659853291 & -0.0184274918922858744 & -0.0000447884023486722072 \\ 0 & 0 & 0 & 0 \end{bmatrix} $$
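A rough Python analogue of such a constrained fit (names hypothetical, not the same solver as Maple's LSSolve): parametrize $A$ directly by an axis-angle vector and a translation, so the orthogonality constraint is built into the parametrization, and minimize $\|AC - BA\|_F$ with `scipy.optimize.least_squares`:

```python
import numpy as np
from scipy.optimize import least_squares

def rodrigues(r):
    """Rotation matrix from an axis-angle vector r (Rodrigues' formula)."""
    r = np.asarray(r, dtype=float)
    th = np.linalg.norm(r)
    if th < 1e-12:
        return np.eye(3)
    k = r / th
    K = np.array([[0.0, -k[2], k[1]], [k[2], 0.0, -k[0]], [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(th) * K + (1.0 - np.cos(th)) * (K @ K)

def solve_constrained(B, C, x0):
    """Minimize ||A C - B A||_F over rigid A, with A parametrized by an
    axis-angle vector x[:3] and a translation x[3:], starting from x0."""
    def make_A(x):
        A = np.eye(4)
        A[:3, :3] = rodrigues(x[:3])
        A[:3, 3] = x[3:]
        return A
    def residual(x):
        A = make_A(x)
        return (A @ C - B @ A)[:3].ravel()  # last row is 0 identically
    return make_A(least_squares(residual, x0).x)
```

As with any local solver, a good initial point (such as the questioner's approximate matrix) matters, since the problem is non-convex.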

Another answer

Maybe, in the case of a rotation with translation (a Euclidean congruence transformation preserving orientation), one can try to solve such a problem geometrically. I think the following algorithm may provide a solution. Write the actions of $B$ and $C$ on points as

$$B \, y = \hat{B} \, y + b$$ $$C \, z = \hat{C} \, z + c$$ where $\hat{B}, \, \hat{C} \, \in \, \text{SO}(3)$ are 3D rotation matrices and $b, \, c \, \in \, \mathbb{R}^3$ are the translation vectors.

Step 1. Find the unit vector $\hat{\omega}_B \, \in \, \mathbb{R}^3$ along the axis of rotation of $\hat{B}$, i.e. $|\hat{\omega}_B| = 1$ and $$\hat{B} \, \hat{\omega}_B = \hat{\omega}_B$$

Step 2. Find the unit vector $\hat{\omega}_C \, \in \, \mathbb{R}^3$ along the axis of rotation of $\hat{C}$, i.e. $|\hat{\omega}_C| = 1$ and $$\hat{C} \, \hat{\omega}_C = \hat{\omega}_C$$

Step 3. Find one point $p_B \, \in \, \mathbb{R}^3$ (i.e. find one non-trivial solution to the linear system of rank 2) such that $$\hat{\omega}_B \times \big(\,(\hat{B} - I_3)\,p_B\, +\, b\,\big) = 0$$

Step 4. Find one point $p_C \, \in \, \mathbb{R}^3$ (i.e. find one non-trivial solution to the linear system of rank 2) such that $$\hat{\omega}_C \times \big(\,(\hat{C} - I_3)\,p_C\, +\, c\,\big) = 0$$

Step 5. Choose any unit vector $\hat{u}_B \, \in \, \mathbb{R}^3$ such that $$|\hat{u}_B| = 1 \,\,\,\text{ and } \,\,\, \hat{u}_B \cdot \hat{\omega}_B = 0$$ i.e. $\hat{u}_B \,\perp\, \hat{\omega}_B$

Step 6. Choose any unit vector $\hat{u}_C \, \in \, \mathbb{R}^3$ such that $$|\hat{u}_C| = 1 \,\,\,\text{ and } \,\,\, \hat{u}_C \cdot \hat{\omega}_C = 0$$ i.e. $\hat{u}_C \,\perp\, \hat{\omega}_C$

Step 7. $\hat{v}_B = \hat{\omega}_B \times \hat{u}_B$

Step 8. $\hat{v}_C = \hat{\omega}_C \times \hat{u}_C$

Step 9. Define the orthogonal matrix $U_B = \big[\,\hat{u}_B \,\,\, \hat{v}_B \,\,\, \hat{\omega}_B\,\big] \, \in \, \text{SO}(3)$ where the vectors are arranged as column vectors.

Step 10. Define the orthogonal matrix $U_C = \big[\,\hat{u}_C \,\,\, \hat{v}_C \,\,\, \hat{\omega}_C\,\big] \, \in \, \text{SO}(3)$ where the vectors are arranged as column vectors.

Step 11. Define the Euclidean congruence transformation (rotation with translation) $$y = A_B \, x = U_B \, x \, + \, p_B$$

Step 12. Define the Euclidean congruence transformation (rotation with translation) $$z = A_C \, x = U_C \, x \, + \, p_C$$

Step 13. Invert the latter transformation $$x = A_C^{-1} \, z = U_C^T \, z \, - \, U_C^T\,p_C$$

Step 14. The transformation $A$ is then defined as $$A = A_B \circ A_C^{-1}$$ or in matrix form $$A = \begin{bmatrix} U_B & p_B \\ 0 & 1\end{bmatrix} \, \begin{bmatrix} U_C^T & - U_C^T\, p_C \\ 0 & 1\end{bmatrix}$$
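The fourteen steps above can be condensed into a short NumPy sketch (function names hypothetical). It assumes the rotation angle lies strictly between $0$ and $\pi$; the axis is read off the skew-symmetric part of the rotation block rather than taken as an arbitrary eigenvector, which orients $\hat{\omega}_B$ and $\hat{\omega}_C$ consistently, as the construction requires:

```python
import numpy as np

def screw_frame(T):
    """Adapted frame for the screw axis of a rigid transform T (4x4).

    Returns the homogeneous matrix [[U, p], [0, 1]] whose third column is
    the rotation axis and whose origin p lies on the axis (steps 1-11 for
    one transform). Assumes the rotation angle is strictly in (0, pi).
    """
    R, t = T[:3, :3], T[:3, 3]
    # Axis direction: (R - R^T)/2 has axial vector sin(theta) * axis.
    w = 0.5 * np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    w /= np.linalg.norm(w)
    # Point on the axis: solve (R - I) p = -(t - (t.w) w); the right-hand
    # side lies in the rank-2 range of R - I, so lstsq solves it exactly.
    p, *_ = np.linalg.lstsq(R - np.eye(3), -(t - (t @ w) * w), rcond=None)
    # Complete w to a right-handed orthonormal frame (u, v, w) (steps 5-8).
    u = np.cross(w, [1.0, 0.0, 0.0])
    if np.linalg.norm(u) < 1e-8:        # w happens to be parallel to e1
        u = np.cross(w, [0.0, 1.0, 0.0])
    u /= np.linalg.norm(u)
    v = np.cross(w, u)
    A = np.eye(4)
    A[:3, :3] = np.column_stack([u, v, w])
    A[:3, 3] = p
    return A

def conjugator(B, C):
    """A with A^{-1} B A = C for exactly conjugate rigid transforms B, C
    (step 14: A = A_B A_C^{-1})."""
    return screw_frame(B) @ np.linalg.inv(screw_frame(C))
```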

Why I think this works. The matrix equation $A^{-1}BA = C$ is a conjugation relation between the two matrices $B$ and $C$. Geometrically, it means that there is a Euclidean transformation $T$ in 3D space and two different coordinate systems $\mathcal{B}$ and $\mathcal{C}$ such that the matrix representation of $T$ with respect to $\mathcal{B}$ is $B$ and the matrix representation of $T$ with respect to $\mathcal{C}$ is $C$. The task is then to find a Euclidean change of coordinates $A : \mathcal{C} \to \mathcal{B}$, consisting of a rotation followed by a translation, that transforms the coordinate system $\mathcal{C}$ into the coordinate system $\mathcal{B}$. Then the matrix $B$ is transformed to the matrix $C$ by the conjugation $A^{-1}BA=C$. If the matrices $B$ and $C$ are two different representations, in two different coordinate systems, of the same Euclidean transformation $T$, then I can look for a third coordinate system, call it $\mathcal{D}$, which is intrinsically defined by the map $T$, together with the changes of coordinates $$ A_B : \mathcal{D} \to \mathcal{B}$$ $$ A_C : \mathcal{D} \to \mathcal{C}$$ so that $$A_B^{-1}BA_B = A_C^{-1}CA_C$$

The intrinsic coordinate system I am thinking of is related to the screw-motion interpretation of any orientation-preserving Euclidean transformation. By a classical result of 3D Euclidean geometry, every orientation-preserving 3D Euclidean transformation, such as $T$, has an invariant line $l$ with $T(l) = l$. Moreover, $T$ performs a rotation around the axis $l$ by an angle $\theta$, followed by a translation along $l$ by a distance $t$. Viewed in the coordinate system $\mathcal{B}$, the axis $l$ of $T$ is given by the unit vector $\hat{\omega}_B$ and a point $p_B$ on $l$. Analogously, viewed in the coordinate system $\mathcal{C}$, the axis $l$ of $T$ is given by the unit vector $\hat{\omega}_C$ and the point $p_C$ on $l$. You can think of $\hat{\omega}_C$ and $p_C$ as the same vector and point as $\hat{\omega}_B$ and $p_B$, just written in the coordinates of $\mathcal{C}$ instead of $\mathcal{B}$. So it makes sense to select a coordinate system $\mathcal{D}$, intrinsic to $T$, whose $z$-axis coincides with $l$ and whose origin is a point on $l$. The other two axes, the $x$-axis and the $y$-axis, can then be chosen arbitrarily, as long as both are perpendicular to $l$ and to each other.

In the coordinate system $\mathcal{B}$, I have chosen the coordinate system $\mathcal{D}$ with origin $p_B$ and orthonormal basis vectors $\hat{u}_B, \, \hat{v}_B, \, \hat{\omega}_B$. Therefore, the coordinate change $A_B : \mathcal{D} \to \mathcal{B}$ looks like $A_B = \begin{bmatrix} U_B & p_B \\ 0 & 1\end{bmatrix}$, with $U_B = \big[\,\hat{u}_B \,\,\, \hat{v}_B \,\,\, \hat{\omega}_B\,\big] \, \in \, \text{SO}(3)$. Then, to represent $B$ in the coordinates of $\mathcal{D}$, we conjugate: $A_{B}^{-1} B A_B = B_{\mathcal{D}}$. In the new coordinate system $\mathcal{D}$, the transformation $B_{\mathcal{D}}$ is, by construction of $\mathcal{D}$, a rotation around the vertical $z$-axis by an angle $\theta$ followed by a translation along the same $z$-axis. Hence $$B_{\mathcal{D}} = A_{B}^{-1} B A_B = \begin{bmatrix} \cos(\theta) & - \sin(\theta) & 0 & 0 \\ \sin(\theta) & \cos(\theta) & 0 & 0 \\ 0 & 0 & 1 & t \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
Analogously, in the coordinate system $\mathcal{C}$, I have chosen the coordinate system $\mathcal{D}$ with origin $p_C$ and orthonormal basis vectors $\hat{u}_C, \, \hat{v}_C, \, \hat{\omega}_C$. Therefore, the coordinate change $A_C : \mathcal{D} \to \mathcal{C}$ looks like $A_C = \begin{bmatrix} U_C & p_C \\ 0 & 1\end{bmatrix}$, with $U_C = \big[\,\hat{u}_C \,\,\, \hat{v}_C \,\,\, \hat{\omega}_C\,\big] \, \in \, \text{SO}(3)$. Then, to represent $C$ in the coordinates of $\mathcal{D}$, we conjugate: $A_{C}^{-1} C A_C = C_{\mathcal{D}}$. In the new coordinate system $\mathcal{D}$, the transformation $C_{\mathcal{D}}$ is, by construction of $\mathcal{D}$, a rotation around the vertical $z$-axis by the angle $\theta$ followed by a translation along the same $z$-axis. Hence $$C_{\mathcal{D}} = A_{C}^{-1} C A_C = \begin{bmatrix} \cos(\theta) & - \sin(\theta) & 0 & 0 \\ \sin(\theta) & \cos(\theta) & 0 & 0 \\ 0 & 0 & 1 & t \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
Hence $B_{\mathcal{D}} = C_{\mathcal{D}}$, and therefore you have the identity $$A_{B}^{-1} B A_B = A_{C}^{-1} C A_C$$ which yields the final result from step 14 of the algorithm: $$\big(A_{C} \, A_{B}^{-1}\big) B \big(A_{B} \, A_{C}^{-1}\big) = \big(A_{B} \, A_{C}^{-1}\big)^{-1} B \big(A_{B} \, A_{C}^{-1}\big) = A^{-1} B A = C$$ with $A = A_{B} \, A_{C}^{-1}$.

As you can see, by construction there is a whole family of solutions (i.e. gauges generated by the stabilizer of the adjoint orbit of $B$ in the Euclidean group), coming from the arbitrary choices of the origins $p_B$ and $p_C$ on the axis as well as of the $x$-axis basis vectors $\hat{u}_B$ and $\hat{u}_C$.

Finally, both matrices $B$ and $C$ must have the same geometric invariants, rotation angle $\theta_B = \theta_C$ and translation distance $t_B = t_C$, in order to be related by conjugacy at all. If that is not the case, i.e. $\theta_B \neq \theta_C$ or $t_B \neq t_C$, you can still apply this algorithm and obtain a matrix $A^{-1} B A$ that has the same rotation axis as $C$, just with slightly different screw-motion parameters. You can think of this as a kind of optimal configuration of the two matrices, given that they differ.
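As a quick feasibility check, both invariants can be computed directly: $\theta$ from the trace of the rotation block, and $t$ as the component of the translation along the axis. A small sketch (helper name mine; it assumes $\sin\theta \neq 0$ when orienting the axis):

```python
import numpy as np

def screw_invariants(T):
    """Rotation angle theta and signed translation distance t along the
    screw axis of a rigid transform T (4x4); conjugate transforms share
    these invariants. Assumes sin(theta) != 0 when orienting the axis.
    """
    R, b = T[:3, :3], T[:3, 3]
    theta = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    # Axis from the skew part of R: axial vector of (R - R^T)/2.
    w = 0.5 * np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    t = float(b @ (w / np.linalg.norm(w)))
    return theta, t
```

Comparing `screw_invariants(B)` with `screw_invariants(C)` before running the algorithm tells you whether an exact conjugator can exist at all.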