Sylvester Equation over GF(2)


I know that a Sylvester equation $$AX+XB=0$$ has nontrivial solutions exactly when there is a common eigenvalue of $A$ and $-B$. This is because if there is a common eigenvalue $k$, then there exists a column eigenvector $v$ of $A$ and a row eigenvector $w$ of $-B$. Then $vw$ is a nontrivial solution to the equation: $$A(vw)+(vw)B=(Av)w+v(wB)=kvw-kvw=0$$

But what if we are working with $GF(2)$? Suppose the characteristic polynomials of $A$ and $-B$ are both $x^2+x+1$. This polynomial is irreducible over $GF(2)$, so $A$ and $-B$ do not share an eigenvalue in $GF(2)$. But they share an eigenvalue in $\mathbb{C}$, so does this mean there is a nontrivial solution in $GF(2)$ or not? The argument above certainly doesn't work because $vw$ isn't necessarily in $GF(2)$.
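As a concrete sanity check, one can brute-force the question for a minimal (hypothetical) example: take $A=B$ equal to the companion matrix of $x^2+x+1$ over $GF(2)$ (note $-B=B$ in characteristic 2) and enumerate all sixteen $2\times 2$ matrices $X$:

```python
import itertools
import numpy as np

# Companion matrix of x^2 + x + 1 over GF(2); note -B = B in characteristic 2.
A = np.array([[0, 1], [1, 1]])
B = A.copy()

# Enumerate all 2x2 matrices X over GF(2), keeping those with AX + XB = 0 (mod 2).
solutions = []
for bits in itertools.product([0, 1], repeat=4):
    X = np.array(bits).reshape(2, 2)
    if np.all((A @ X + X @ B) % 2 == 0):
        solutions.append(X)

nontrivial = [X for X in solutions if X.any()]
print(len(solutions), len(nontrivial))  # 4 3
```

So nontrivial solutions do exist here (e.g. $X=I$, since $A+B=2A\equiv 0$), even though $A$ and $-B$ share no eigenvalue in $GF(2)$. The answers below explain why.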

There are 2 answers below.

Best answer

Let us assume that the characteristic polynomials $\chi_A$ and $\chi_{-B}$ have a common irreducible factor $p$. Let $C$ be the companion matrix of $p$. Then $A$ is similar to an upper triangular block matrix with $C$ in the upper left corner, and $-B$ is similar to a lower triangular block matrix with $C$ in the upper left corner (the other diagonal blocks can be chosen as the companion matrices of the other irreducible factors of the characteristic polynomials). Hence we have $$ A=S_A \begin{pmatrix} C & \star & \dots & \star \\ 0 & \star & \ddots & \vdots \\ \vdots & \ddots & \ddots & \star \\ 0 & \dots & 0 & \star \end{pmatrix} S_A^{-1} $$ and $$ -B=S_B^{-1} \begin{pmatrix} C & 0 & \dots & 0 \\ \star & \star & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ \star & \dots & \star & \star \end{pmatrix} S_B $$ Now choose $$ X=S_A \begin{pmatrix} I & 0 & \dots & 0 \\ 0 & 0 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \dots & 0 & 0 \end{pmatrix} S_B $$ where $I$ is the identity matrix of the same size as $C$. Then $$ AX = X(-B) = S_A \begin{pmatrix} C & 0 & \dots & 0 \\ 0 & 0 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \dots & 0 & 0 \end{pmatrix} S_B $$ Note: We only need the first $d$ columns of $S_A$ and the first $d$ rows of $S_B$, where $d$ is the degree of $p.$ The columns of $S_A$ can be found by choosing a vector $v$ from the kernel of $p(A)$ and using $A^kv,\; k=0,\ldots,d-1,$ as the first $d$ columns of $S_A.$ (A similar approach works for $S_B$, but we must pay attention to get the same $C$ and not its transpose.)

Edit

We have to find $d$ column vectors $v_0,\ldots,v_{d-1}$ such that $$ A \begin{pmatrix} & & \\ v_0 & \dots & v_{d-1} \\ & & \end{pmatrix} = \begin{pmatrix} & & \\ v_0 & \dots & v_{d-1} \\ & & \end{pmatrix} C $$ and $d$ row vectors $w_0,\ldots,w_{d-1}$ such that $$ \begin{pmatrix} & w_{0} & \\ \;\; & \vdots & \;\; \\ & w_{d-1} & \end{pmatrix} (-B) = C \begin{pmatrix} & w_{0} & \\ \;\; & \vdots & \;\; \\ & w_{d-1} & \end{pmatrix} $$ Let $$ C=\begin{pmatrix} 0 & & 0 & -a_0 \\ 1 & \ddots & & -a_1 \\ & \ddots & 0 & \vdots \\ 0 & & 1 & -a_{d-1} \end{pmatrix} $$ where $a_0,\ldots,a_{d-1}$ are the coefficients of $p$, such that $p(t)=t^d+a_{d-1}t^{d-1}+\ldots+a_0.$

Let $v_0$ be a vector in $\ker(p(A))$ and $v_k=A^kv_0.$ As $p(A)v_0=0,$ it can easily be seen that $$ A \begin{pmatrix} & & \\ v_0 & \dots & v_{d-1} \\ & & \end{pmatrix} = \begin{pmatrix} & & & \\ v_1 & \dots & v_{d-1} & A^dv_0 \\ & & & \end{pmatrix} \\ = \begin{pmatrix} & & & \\ v_1 & \dots & v_{d-1} & (A^d-p(A))v_0 \\ & & & \end{pmatrix} = \begin{pmatrix} & & \\ v_0 & \dots & v_{d-1} \\ & & \end{pmatrix} C $$ We can also show that the $v_k$ are linearly independent. Assume there were a non-trivial linear combination $\sum_{k=0}^{d-1}\alpha_kv_k=0.$ This is equivalent to the existence of a nonzero polynomial $q$ of degree less than $d$ with $q(A)v_0=0.$ Since $p$ is irreducible and $\deg q < \deg p$, the polynomials $p$ and $q$ must be coprime. But if $q(A)x_1=0$ and $p(A)x_2=0$ for coprime $p$ and $q$ with $x_1\neq 0$ and $x_2\neq 0,$ then $x_1$ and $x_2$ are linearly independent. This gives a contradiction, because in our case $p(A)v_0=q(A)v_0=0$ with $v_0\neq 0.$ Hence there cannot be a non-trivial linear combination $\sum_{k=0}^{d-1}\alpha_kv_k=0,$ and the vectors $v_0,\ldots, v_{d-1}$ are linearly independent.
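The construction of the $v_k$ can be sketched numerically over $GF(2)$; here $A$ is a hypothetical $2\times 2$ example with characteristic polynomial $p(t)=t^2+t+1$, so $d=2$, $p(A)=0$ by Cayley-Hamilton, and any nonzero $v_0$ lies in $\ker(p(A))$:

```python
import numpy as np

# d = 2, p(t) = t^2 + t + 1 over GF(2): C is the companion matrix of p,
# A is a hypothetical matrix similar to C (characteristic polynomial p).
A = np.array([[1, 1], [1, 0]])
C = np.array([[0, 1], [1, 1]])

# p(A) = A^2 + A + I = 0 mod 2 (Cayley-Hamilton), so ker(p(A)) is everything.
I = np.eye(2, dtype=int)
assert np.all((A @ A + A + I) % 2 == 0)

# Build the columns v_k = A^k v0 from any nonzero v0.
v0 = np.array([1, 0])
v1 = (A @ v0) % 2
V = np.column_stack([v0, v1])

# The defining relation A V = V C holds modulo 2, and V has full rank.
print(np.all((A @ V) % 2 == (V @ C) % 2))  # True
```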

We select $w_{d-1}$ from the "left kernel" of $p(-B),$ such that $w_{d-1}\,p(-B)=0.$

Let $w_{k-1}=-w_kB + a_kw_{d-1}$ for $k=d-1,d-2,\ldots,0.$

Then $w_{-1}=w_{d-1}\,p(-B) = 0.$ Example ($d=4$): \begin{eqnarray*} w_2 = -w_3B+a_3w_3 &=& w_3(-B+a_3I) \\ w_1 = -w_2B+a_2w_3 &=& w_3((-B+a_3I)(-B)+a_2I) \\ w_0 = -w_1B+a_1w_3 &=& w_3(((-B+a_3I)(-B)+a_2I)(-B)+a_1I) \\ w_{-1}= -w_0B+a_0w_3 &=& w_3((((-B+a_3I)(-B)+a_2I)(-B)+a_1I)(-B)+a_0I) = w_3p(-B) = 0 \end{eqnarray*} see Horner's method. Then we have $$ \begin{pmatrix} & w_{0} & \\ \;\; & \vdots & \;\; \\ & w_{d-1} & \end{pmatrix} (-B) = \begin{pmatrix} & -w_0B & \\ & -w_1B & \\ & \vdots & \\ & -w_{d-1}B & \end{pmatrix} \\ = \begin{pmatrix} & -a_0w_{d-1} & \\ & w_0-a_1w_{d-1} & \\ & \vdots & \\ & w_{d-2}-a_{d-1}w_{d-1} & \end{pmatrix} = C \begin{pmatrix} & w_{0} & \\ \;\; & \vdots & \;\; \\ & w_{d-1} & \end{pmatrix} $$ It can also be shown that the row vectors $w_0,\ldots,w_{d-1}$ are linearly independent, in the same way as we have shown that $v_0,\ldots,v_{d-1}$ are linearly independent. This linear independence guarantees $X\neq 0$ when we set $$ X= \begin{pmatrix} & & \\ v_0 & \dots & v_{d-1} \\ & & \end{pmatrix} \cdot \begin{pmatrix} & w_{0} & \\ \;\; & \vdots & \;\; \\ & w_{d-1} & \end{pmatrix} $$ The complete construction works for any field $F.$
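The whole construction can be sketched end-to-end over $GF(2)$ for $d=2$, $p(t)=t^2+t+1$ (so $a_0=a_1=1$), with hypothetical example matrices $A$ and $B$ both having characteristic polynomial $p$; note $-B=B$ in characteristic 2:

```python
import numpy as np

# Hypothetical examples over GF(2), both with characteristic polynomial
# p(t) = t^2 + t + 1; in characteristic 2 we have -B = B.
A = np.array([[1, 1], [1, 0]])
B = np.array([[0, 1], [1, 1]])

# Column vectors v_k = A^k v0 with v0 in ker(p(A)) (here p(A) = 0).
v0 = np.array([1, 0])
V = np.column_stack([v0, (A @ v0) % 2])

# Row vectors: w_{d-1} from the left kernel of p(-B) (here p(B) = 0, so any
# nonzero row works), then the Horner-style recurrence w_{k-1} = -w_k B + a_k w_{d-1}.
w1 = np.array([1, 0])
w0 = (w1 @ B + 1 * w1) % 2   # -w_1 B + a_1 w_{d-1}, with a_1 = 1 and -1 = 1 mod 2
W = np.vstack([w0, w1])

# X = V W is a nonzero solution of AX + XB = 0 over GF(2).
X = (V @ W) % 2
print(X.any(), np.all((A @ X + X @ B) % 2 == 0))  # True True
```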

Another answer

For any $n\times n$ matrix $A$ with entries in a field $F$, let $M^A$ denote the vector space $F^n$ equipped with the $F[x]$-module structure in which multiplication by $x$ acts as left multiplication by $A$. Let $A$ be an $m\times m$ matrix over $F$, $B$ an $n\times n$ matrix over $F$, and $X$ an $m\times n$ matrix over $F$. Let $C_{A,B}=\{X \in M_{m\times n}(F)\mid AX+XB=0\}$. Then a solution $X$ of the Sylvester equation $AX+XB=0$ can be interpreted as a module homomorphism from $M^{-B}$ to $M^A$. This can be summarized as $$ C_{A,B} \simeq \mathrm{Hom}_{F[x]} (M^{-B},M^A) $$ as $F$-vector spaces.

The RHS is nontrivial if and only if the invariant factors of $M^{-B}$ and $M^A$ share a common irreducible factor (equivalently, if and only if $\chi_A$ and $\chi_{-B}$ have a common irreducible factor).

In your case, the field is $F=\mathrm{GF}(2)$ (where $-B=B$), and both $M^A$ and $M^{-B}$ have $x^2+x+1$ among their invariant factors. Thus the RHS is nontrivial, and hence $C_{A,B}$ is nontrivial.
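This dimension can be checked directly: with a row-major vectorization, the linear map $X \mapsto AX+XB$ has matrix $A \otimes I + I \otimes B^{T}$, and $\dim C_{A,B}$ is its nullity over $GF(2)$. A sketch with hypothetical $2\times 2$ examples (both with characteristic polynomial $x^2+x+1$, where $\mathrm{Hom}_{F[x]}(M^{-B},M^A)\simeq \mathrm{GF}(4)$ has $F$-dimension 2):

```python
import numpy as np

def rank_gf2(M):
    """Rank of an integer 0/1 matrix over GF(2), by Gaussian elimination mod 2."""
    M = M.copy() % 2
    rank = 0
    rows, cols = M.shape
    for col in range(cols):
        pivot = next((r for r in range(rank, rows) if M[r, col]), None)
        if pivot is None:
            continue
        M[[rank, pivot]] = M[[pivot, rank]]      # swap pivot row into place
        for r in range(rows):
            if r != rank and M[r, col]:
                M[r] = (M[r] + M[rank]) % 2      # eliminate the column
        rank += 1
    return rank

# Hypothetical examples over GF(2), both with characteristic polynomial
# x^2 + x + 1 (and -B = B in characteristic 2).
A = np.array([[1, 1], [1, 0]])
B = np.array([[0, 1], [1, 1]])

# Matrix of X -> AX + XB in the row-major vec basis.
I = np.eye(2, dtype=int)
S = (np.kron(A, I) + np.kron(I, B.T)) % 2

# dim C_{A,B} = nullity of S over GF(2).
nullity = S.shape[1] - rank_gf2(S)
print(nullity)  # 2
```

So $C_{A,B}$ has $2^2 = 4$ elements here, three of them nonzero, matching the module-theoretic count.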