Closed form solution for block matrix A in $AXA^T=C$


I'm looking for a closed-form solution for $A$, given $C$ and $X$, in the matrix equation

$AXA^T=C$

$A$ (size $M \times N$), $X$ ($N \times N$), and $C$ ($M \times M$) are appropriately partitioned block matrices $A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$, $X = \begin{pmatrix} X_{11} & X_{12} \\ X_{21} & X_{22} \end{pmatrix}$, $C = \begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix}$

$X$ and $C$ are covariance matrices (positive semidefinite) with $M \leq N$. I am interested in closed-form solutions because I need to make some statements about the block matrices that compose $A$.

Simply multiplying out seems to be a dead end:

$\begin{pmatrix} (A_{11}X_{11}+A_{12}X_{21})A^T_{11}+(A_{11}X_{12}+A_{12}X_{22})A^T_{12} & (A_{11}X_{11}+A_{12}X_{21})A^T_{21}+(A_{11}X_{12}+A_{12}X_{22})A^T_{22} \\ (A_{21}X_{11}+A_{22}X_{21})A^T_{11}+(A_{21}X_{12}+A_{22}X_{22})A^T_{12} & (A_{21}X_{11}+A_{22}X_{21})A^T_{21}+(A_{21}X_{12}+A_{22}X_{22})A^T_{22} \\ \end{pmatrix} = \begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \\ \end{pmatrix}$

I know that when solving for $X$ we can recast the problem as a linear system using the Kronecker product (https://en.wikipedia.org/wiki/Kronecker_product), via the identity $\operatorname{vec}(AXA^T) = (A \otimes A)\operatorname{vec}(X)$, but I don't see how to get from that to a closed-form solution for $A$. My other intuition is that this looks a bit like a reduced DARE (https://en.wikipedia.org/wiki/Algebraic_Riccati_equation), but that is also about solving for $X$, not $A$. I'm afraid that's all I have.
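For concreteness, here is a quick numerical check of the vectorization identity behind that Kronecker-product recasting, $\operatorname{vec}(AXA^T) = (A \otimes A)\operatorname{vec}(X)$ (a NumPy sketch with arbitrary random matrices; with NumPy's row-major `reshape`, the identity takes exactly this form):

```python
import numpy as np

# Sketch: check vec(A X A^T) = (A kron A) vec(X) on random matrices.
# Note: numpy's .reshape(-1) is row-major vec, for which the identity
# vec(A X B) = (A kron B^T) vec(X) holds; with B = A^T this gives A kron A.
rng = np.random.default_rng(0)
M, N = 2, 3
A = rng.standard_normal((M, N))
X = rng.standard_normal((N, N))
X = X @ X.T  # make X symmetric positive semidefinite

lhs = (A @ X @ A.T).reshape(-1)      # vec(A X A^T)
rhs = np.kron(A, A) @ X.reshape(-1)  # (A kron A) vec(X)
assert np.allclose(lhs, rhs)
```

This linearizes the equation in $X$, which is why it helps when solving for $X$ but not directly for $A$ (the equation is quadratic in $A$).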

I'll also mention that I am mostly (but not only) interested in the special case $C_{12} = C_{21} = C_{11}= 0$. Any help, pointers, things to google are highly appreciated. Cheers!

By Sylvester's law of inertia, the equation is solvable if and only if $\operatorname{rank}(C)\le\operatorname{rank}(X)$. Suppose this is the case. As $X$ and $C$ are positive semidefinite, $X=YY^T$ and $C=DD^T$ for some square matrices $Y$ and $D$. Such $Y$ and $D$ are not unique, but you may pick any choices that you see fit (such as the unique positive semidefinite square roots of $X$ and $C$). The equation can then be rewritten as $(AY)(AY)^T=DD^T$. We now consider the cases $M\ge N$ and $M\le N$ separately.
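One concrete choice of factors is the unique positive semidefinite square roots, which exist even when $X$ or $C$ is singular (where Cholesky would fail). A minimal NumPy sketch (the helper name `psd_sqrt` is my own, not a library function):

```python
import numpy as np

def psd_sqrt(S):
    """Positive semidefinite square root Y of S, so that Y @ Y.T == S."""
    w, V = np.linalg.eigh(S)              # S symmetric PSD
    w = np.clip(w, 0.0, None)             # guard tiny negative eigenvalues
    return V @ np.diag(np.sqrt(w)) @ V.T  # symmetric, so Y Y^T = Y^2 = S

# Example: factor a rank-deficient PSD matrix.
rng = np.random.default_rng(1)
B = rng.standard_normal((3, 2))
X = B @ B.T                               # PSD with rank 2 < 3
Y = psd_sqrt(X)
assert np.allclose(Y @ Y.T, X)
```

Because `psd_sqrt` returns a symmetric factor, $YY^T = Y^2 = X$ holds; any other factor $Y' = YQ$ with $Q$ orthogonal would work equally well, which is the non-uniqueness mentioned above.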

  • When $M\ge N$, we have $\pmatrix{AY&0}=DQ$ and hence $AY=DQ\pmatrix{I_N\\ 0}$ for some orthogonal matrix $Q$. It follows that the general solution is given by $$ A=DQ\pmatrix{Y^+\\ 0} + B(I_N-YY^+), $$ where $B\in M_{M\times N}(\mathbb R)$ is arbitrary and $Q$ is any $M\times M$ orthogonal matrix such that $$ DQ\pmatrix{I_N-Y^+Y\\ 0}=0. $$
  • When $M\le N$, we have $AY=\pmatrix{D&0}Q$ for some orthogonal matrix $Q$. It follows that the general solution is given by $$ A=\pmatrix{D&0}QY^+ + B(I_N-YY^+), $$ where $B\in M_{M\times N}(\mathbb R)$ is arbitrary and $Q$ is any $N\times N$ orthogonal matrix such that $$ \pmatrix{D&0}Q(I_N-Y^+Y)=0. $$
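The $M\le N$ formula can be checked numerically. In the sketch below I assume $X$ is nonsingular, so $Y^+Y = I_N$ and the side condition on $Q$ is satisfied automatically; I then take $Q = I_N$ and $B = 0$, giving the particular solution $A = \pmatrix{D&0}Y^+$ (the `psd_sqrt` helper is my own naming):

```python
import numpy as np

def psd_sqrt(S):
    """Positive semidefinite square root of a symmetric PSD matrix S."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ V.T

rng = np.random.default_rng(2)
M, N = 2, 3
X = rng.standard_normal((N, N)); X = X @ X.T + np.eye(N)  # PSD and nonsingular
C = rng.standard_normal((M, M)); C = C @ C.T              # PSD, rank(C) <= rank(X)

Y = psd_sqrt(X)                                           # X = Y Y^T
D = psd_sqrt(C)                                           # C = D D^T
# Particular solution A = [D 0] Y^+ (Q = I_N, B = 0 in the general formula):
A = np.hstack([D, np.zeros((M, N - M))]) @ np.linalg.pinv(Y)
assert np.allclose(A @ X @ A.T, C)
```

Sweeping over other orthogonal $Q$ (and nonzero $B$ when $X$ is singular) traces out the whole solution set, which is what makes statements about individual subblocks of $A$ difficult.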

I can't think of anything meaningful to say about the subblocks of $A$. As you can see, the solutions are not unique and any structures or properties of subblocks of $A$ can be easily destroyed by taking a different $Q$.