Closed form solution to matrix equation


I'm trying to solve the following matrix equation for $L$:

$$A\cdot L + L^T = 0$$ where $A$ is non-singular and both $A,L \in \mathbb{R}^{n\times n}$. I'm wondering whether this equation has a closed-form solution.

This seems close to Sylvester's equation but not quite the same.


There are 3 solutions below.

Answer 1 (score 8):

I begin with three examples showing that the solutions of

$$AL+L^T=0\tag{1}$$

can depend on parameters.

a) If we take

$$A=\pmatrix{1&0&0\\0&1&1\\0&0&-1},$$

then the general solution of (1) is:

$$L=\pmatrix{0&a&0\\-a&b&-2b\\0&-2b&4b},$$

with $\det(L)=4a^2b$.
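This family can be verified numerically, for instance in Python with NumPy (my own check, not part of the original answer, with arbitrary parameter values):

```python
import numpy as np

a, b = 1.7, -0.4   # arbitrary parameter values
A = np.array([[1., 0, 0],
              [0, 1, 1],
              [0, 0, -1]])
L = np.array([[ 0.,     a,      0],
              [-a,      b, -2 * b],
              [ 0, -2 * b,  4 * b]])
assert np.allclose(A @ L + L.T, 0)                 # L solves (1)
assert np.isclose(np.linalg.det(L), 4 * a**2 * b)  # det(L) = 4 a^2 b
```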

b) If $$A=\pmatrix{1&0&0&0\\0&0&1&0\\0&1&0&0\\0&0&0&1}$$ (in fact it is a commutation matrix that will be used later on), the general solution depends on 4 parameters:

$$L=\pmatrix{0&a&a&c\\-a&b&-b&d\\-a&-b&b&d\\-c&-d&-d&0}$$

with $\det(L)=0.$

c) For the $6 \times 6$ matrix:

$$A=\pmatrix{1&0&0&0&0&0\\0&0&0&0&0&1\\0&0&0&0&1&0\\0&0&0&1&0&0\\0&0&1&0&0&0\\0&1&0&0&0&0}$$

the general solution depends on $9$ (!) parameters:

$$L=\pmatrix{0&e&b&a&b&e\\ -e&i&-f&-g&-h&-i\\ -b&h&d&-c&-d&f\\ -a&g&c&0&c&g\\ -b&f&-d&-c&d&h\\ -e&-i&-h&-g&-f&i}$$

with the interesting factorization of

$$\det(L)=(2ce - 2bg + af + ah)^2(2fh + 4di - f^2 - h^2)$$
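Both the solution family and the determinant factorization can be checked numerically with random parameter values (my own verification, not part of the original answer):

```python
import numpy as np

rng = np.random.default_rng(42)
a, b, c, d, e, f, g, h, i = rng.standard_normal(9)

# the 6 x 6 permutation matrix A of example c): A[row, col] = 1
A = np.zeros((6, 6))
for row, col in enumerate([0, 5, 4, 3, 2, 1]):
    A[row, col] = 1.0

L = np.array([[ 0,  e,  b,  a,  b,  e],
              [-e,  i, -f, -g, -h, -i],
              [-b,  h,  d, -c, -d,  f],
              [-a,  g,  c,  0,  c,  g],
              [-b,  f, -d, -c,  d,  h],
              [-e, -i, -h, -g, -f,  i]])

assert np.allclose(A @ L + L.T, 0)   # L solves (1)
det_formula = (2*c*e - 2*b*g + a*f + a*h)**2 * (2*f*h + 4*d*i - f**2 - h**2)
assert np.isclose(np.linalg.det(L), det_formula)
```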

Please note that in the three examples, the solutions are the sum of a symmetric matrix and a skew-symmetric matrix (see the connection with the answer by @user1551).


How can such solutions be obtained? By the following method:

Begin by "vectorizing" relationship (1):

$$vec(AL)+vec(L^T)=vec(0)\tag{2}$$

(vectorizing an $n \times n$ matrix means stacking its columns into an $n^2 \times 1$ vector).
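For example, column-stacking corresponds to NumPy's column-major ('Fortran') ordering (my own illustration):

```python
import numpy as np

X = np.array([[1, 2],
              [3, 4]])
v = X.flatten(order='F')   # stack the columns: vec(X) = (1, 3, 2, 4)^T
assert list(v) == [1, 3, 2, 4]
```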

Based on the classical relationship $vec(AXB)=(B^T \otimes A)\,vec(X)$, where $\otimes$ is the Kronecker product, (2) can be written in the form:

$$(I_n \otimes A) \cdot vec(L) + C \cdot vec(L) = vec(0),\tag{3}$$

where $C$ is the $n^2 \times n^2$ commutation matrix, i.e. the permutation matrix such that $C \cdot vec(X) = vec(X^T)$ for every $n \times n$ matrix $X$. Therefore

$$\underbrace{(I_n \otimes A+C)}_D \cdot vec(L) = vec(0).$$

In other words, $vec(L)$ is any vector in the kernel of $$D:=I_n \otimes A + C,$$

whose dimension can be $0,1,2,\ldots$ (even more than $n$): in the examples given above, the kernel dimensions are $2$, $4$ and $9$ respectively.

This gives a way to obtain the general form of $L$ with any software tool in which the Kronecker product is implemented (see the general Matlab program below).
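As an aside (my own check, not part of the original answer), both identities used above can be verified numerically, e.g. in Python with NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
A, B, X = (rng.standard_normal((n, n)) for _ in range(3))

vec = lambda M: M.flatten(order='F')   # stack the columns

# the classical identity vec(AXB) = (B^T kron A) vec(X)
assert np.allclose(vec(A @ X @ B), np.kron(B.T, A) @ vec(X))

# commutation matrix C: the permutation with C vec(X) = vec(X^T)
C = np.zeros((n * n, n * n))
for i in range(n):
    for j in range(n):
        C[j + n * i, i + n * j] = 1.0
assert np.allclose(C @ vec(X), vec(X.T))
```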

Remark: writing the initial relationship in the form

$$AL=-L^T,$$

and taking determinants on both sides gives $\det(A)\det(L)=(-1)^n\det(L)$. Hence either $\det(L)=0$, or $\det(L) \ne 0$ and then $\det(A)=(-1)^n$ is a necessary condition on $A$ for (1) to have a non-singular solution.
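For instance (a quick numerical illustration of my own), the matrix of example a) satisfies this condition with $n=3$:

```python
import numpy as np

A = np.array([[1., 0, 0],
              [0, 1, 1],
              [0, 0, -1]])   # matrix of example a)
n = A.shape[0]
# necessary condition for a non-singular solution L: det(A) = (-1)^n
assert np.isclose(np.linalg.det(A), (-1.0) ** n)
```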

Here is a Matlab program, written with the matrix of the first example but usable for any matrix:

 A=[1,0,0;
    0,1,1;
    0,0,-1];
 n=size(A,1);
 I=eye(n);              % n x n identity matrix
 % build the n^2 x n^2 commutation matrix C :
 Col=[];
 for p=1:n; Col=[Col,p+n*(0:n-1)]; end;
 I2=eye(n^2); C=I2(:,Col);
 D=kron(I,A)+C;
 Kb=null(D,'rational'); % kernel basis, an n^2 x d array
 d=size(Kb,2);
 disp('kernel dimension :'); disp(d);
 if d>0
    disp('the general solution is a linear combination of :')
    for k=1:d; disp(reshape(Kb(:,k),n,n)); end;
 else
    disp('no non-trivial solution')
 end;
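The same computation can be sketched in Python with NumPy (my own port of the method above, extracting the kernel from the SVD; `t_sylvester_kernel` is a name I chose for illustration):

```python
import numpy as np

def t_sylvester_kernel(A, tol=1e-10):
    """Basis (list of n x n matrices) of the solutions L of A L + L^T = 0,
    computed as the null space of D = I_n kron A + C."""
    n = A.shape[0]
    # commutation matrix C: C @ vec(X) = vec(X^T), with column-major vec
    C = np.zeros((n * n, n * n))
    for i in range(n):
        for j in range(n):
            C[j + n * i, i + n * j] = 1.0
    D = np.kron(np.eye(n), A) + C
    # kernel of D: right singular vectors with (near-)zero singular value
    _, s, Vt = np.linalg.svd(D)
    return [v.reshape(n, n, order='F') for v in Vt[s < tol]]

# example a): the kernel is 2-dimensional, matching the two parameters a, b
A = np.array([[1., 0, 0],
              [0, 1, 1],
              [0, 0, -1]])
basis = t_sylvester_kernel(A)
assert len(basis) == 2
assert all(np.allclose(A @ L + L.T, 0, atol=1e-8) for L in basis)
```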
Answer 2 (score 8):

Here we give an explicit general solution when $A$ is diagonalisable over $\mathbb C$. In this case, $A$ possesses a real Jordan form, i.e., $A=M\Lambda M^{-1}$ for some real invertible matrix $M$ and real block-diagonal matrix $$ \Lambda=\operatorname{diag}(B_1,B_2,\ldots,B_k) $$ where $B_i$ and $B_j$ have no common complex eigenvalue whenever $i\ne j$, and $$ B_i=r_iI\ \text{ or }\ B_i=|r_i|\operatorname{diag}(R_{\theta_i},R_{\theta_i},\ldots,R_{\theta_i}),\tag{0} $$ for some real number $r_i$ and some $2\times 2$ rotation matrix $R_{\theta_i}$ with angle $0<\theta_i<\pi$. Let $X=M^{-1}L(M^{-1})^T$. The equation $AL=-L^T$ then becomes $\Lambda X=-X^T$.

Partition $X$ as a $k\times k$ block matrix, so that its $i$-th diagonal sub-block has the same size as $B_i$. Denote the $(i,j)$-th sub-block of $X$ by $X^{ij}$. The equation $\Lambda X=-X^T$ thus means that $B_iX^{ij}=(-X^{ji})^T$ for all $(i,j)$, i.e., $$ X^{ji}=(-B_iX^{ij})^T\tag{1} $$ for every pair of indices $i$ and $j$. By interchanging the two indices, we also have $$ X^{ij}=(-B_jX^{ji})^T.\tag{2} $$ Therefore $$ X^{ij}=(-B_jX^{ji})^T =-(X^{ji})^TB_j^T =-(-B_i X^{ij})B_j^T =B_iX^{ij}B_j^T. $$ It follows that $X^{ij}=0$ whenever $I-B_j\otimes B_i$ is nonsingular, i.e., whenever $\lambda_i\overline{\lambda_j}\ne1$ for every eigenvalue $\lambda_i$ of $B_i$ and every eigenvalue $\lambda_j$ of $B_j$.

Now suppose that $i<j$ and $\lambda_i\overline{\lambda_j}=1$ for some eigenvalue $\lambda_i$ of $B_i$ and some eigenvalue $\lambda_j$ of $B_j$. There are only two possibilities:

  • $(B_i,B_j)=(rI,\frac{1}{r}I)$ for some $r\not\in\{0,1,-1\}$. In this case, $(1)$ and $(2)$ become two equivalent conditions $X^{ji}=(-rX^{ij})^T$ and $X^{ij}=(-\frac{1}{r}X^{ji})^T$. Therefore one may assign an arbitrary value to $X^{ij}$ and set $X^{ji}=(-rX^{ij})^T$.
  • $B_i=|r|\operatorname{diag}(R_{\theta},R_{\theta},\ldots,R_{\theta})$ and $B_j=\frac{1}{|r|}\operatorname{diag}(R_{\theta},R_{\theta},\ldots,R_{\theta})$. In this case, we have $B_j^T=\frac{1}{|r|^2}B_j^{-1}$. Therefore from $(2)$ we obtain $X^{ij}=(-B_jX^{ji})^T=-(X^{ji})^TB_j^T=-\frac{1}{|r|^2}(X^{ji})^TB_j^{-1}$, i.e., $|r|^2X^{ij}B_j=(-X^{ji})^T$. From $(1)$, however, we also obtain $(-X^{ji})^T=B_iX^{ij}$. Therefore $(1)$ and $(2)$ are consistent if and only if $$ |r|^2X^{ij}B_j=B_iX^{ij}.\tag{3} $$ Now partition $X^{ij}$ into sub-blocks of size $2\times2$. For each sub-block $Y$ of $X^{ij}$, $(3)$ gives $|r|^2Y\left(\frac{1}{|r|}R_{\theta}\right)=|r|R_{\theta}Y$. That is, $YR_{\theta}=R_{\theta}Y$. Hence each sub-block of $X^{ij}$ is an arbitrary scalar multiple of some rotation matrix (a.k.a. a rotation-dilation) and $X^{ji}$ is given by $(1)$.
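A minimal numerical illustration of the first case (my own example, assuming $A$ is already block-diagonal so that $M=I$ and $L=X$): take $r=2$, i.e. $A=\operatorname{diag}(2,\tfrac12)$; then $X^{12}=t$ is free and $X^{21}=(-2t)^T=-2t$:

```python
import numpy as np

t = 0.9                        # free parameter
A = np.diag([2.0, 0.5])        # B1 = 2, B2 = 1/2: product of eigenvalues is 1
L = np.array([[0.0,    t],
              [-2 * t, 0.0]])  # X12 arbitrary, X21 = (-r * X12)^T
assert np.allclose(A @ L + L.T, 0)
```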

Next, suppose that $i=j$ and $\lambda_i\overline{\lambda_j}=1$ for two eigenvalues $\lambda_i,\lambda_j$ of $B_i$. Since $B_i$ is of the form $(0)$, we must have $B_i=\pm I$ or $B_i=\operatorname{diag}(R_{\theta},R_{\theta},\ldots,R_{\theta})$ for some $\theta\in(0,\pi)$:

  • If $B_i=I$, $(1)$ means $X^{ii}$ is an arbitrary skew-symmetric matrix.
  • If $B_i=-I$, $(1)$ means $X^{ii}$ is an arbitrary symmetric matrix.
  • If $B_i=\operatorname{diag}(R_{\theta},R_{\theta},\ldots,R_{\theta})$, partition $X^{ii}$ into sub-blocks of size $2\times2$. By a similar argument to a previous one, we see that each sub-block $Y$ in $X^{ii}$ must commute with $R_{\theta}$, i.e., it is a rotation-dilation. The strictly block-lower triangular part of $X^{ii}$ is completely determined by the strictly block-upper triangular part with $(1)$. That is, if $Y$ is a sub-block in the strictly upper triangular part, then the sub-block at the symmetric position is $-Y^TR_{\theta}^T$ (which is also equal to $-R_{\theta}^TY^T$ because $R_{\theta}$ and $Y$ commute). For each diagonal sub-block of $X^{ii}$, let it be $W=cR_{\phi}$. Condition $(1)$ gives $W=(-R_{\theta}W)^T$, i.e., $R_{\theta}W=-W^T$. Hence $\theta+\phi=m\pi-\phi$ for some odd integer $m$. This means $\phi=(m\pi-\theta)/2$ and $R_\phi=\pm R_{(\pi-\theta)/2}$. By absorbing the sign into $c$, we see that $W=cR_{(\pi-\theta)/2}$ for some real number $c$.
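The rotation case can also be checked numerically (my own illustration, taking $A=R_\theta$ itself so that $M=I$ and $L=X$ is a single $2\times2$ block): the diagonal block $W=cR_{(\pi-\theta)/2}$ indeed solves $AL+L^T=0$:

```python
import numpy as np

def R(t):
    """2 x 2 rotation matrix through angle t."""
    return np.array([[np.cos(t), -np.sin(t)],
                     [np.sin(t),  np.cos(t)]])

theta, c = 0.7, 2.5
A = R(theta)
L = c * R((np.pi - theta) / 2)   # W = c R_{(pi - theta)/2}
assert np.allclose(A @ L + L.T, 0)
```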
Answer 3 (score 0):

This is the Sylvester-transpose matrix equation, sometimes called the T-Sylvester equation, analyzed in "The solution of the equation $AX + X^* B = 0$" by De Terán and Dopico.

If you want to implement a solver yourself, an easier approach is to reduce it to a continuous Lyapunov equation; a derivation is given in a Mathematica.SE post.

There may be infinitely many solutions to the continuous Lyapunov equation, but if we restrict attention to the least-squares solution, it has an elegant closed-form expression in terms of the eigenvectors of $A$, as proved by user1551.