Background
I am self-studying linear algebra, and I got stuck on some steps of the proof of the following theorem:
Theorem$\quad$ Let $A$ be an $n \times n$ complex matrix and $B$ be an $m \times m$ complex matrix. For any $n \times m$ complex matrix $C$, the Sylvester equation $AX+XB=C$ has a unique solution $X$, which is an $n \times m$ complex matrix, if and only if $A$ and $-B$ do not share any eigenvalue.
Here is the proof:
Proof$\quad$ The equation $AX+XB=C$ is a linear system with $mn$ unknowns and $mn$ equations. Hence, it is uniquely solvable for any given $C$ if and only if the homogeneous equation $AX+XB=0$ admits only the trivial solution.
First suppose that $A$ and $-B$ do not share any eigenvalue. Let $X$ be a solution to the homogeneous system $AX+XB=0$. Then $AX=X(-B)$. Since \begin{align*} &A(AX) = A(X(-B))\\ \implies\ &(AA)X = (AX)(-B) = (X(-B))(-B) = X((-B)(-B))\\ \implies\ &A^2X = X(-B)^2, \end{align*} by mathematical induction, we have $A^kX=X(-B)^k$ for each $k\in\mathbb{N}$. Then, $p(A)X = Xp(-B)$ for any polynomial $p$. In particular, let $p$ be the characteristic polynomial of $A$; that is, let $p(r) = \det(A-rI)$. Then, $p(A) = 0$. If $E$ is any square matrix, let $\sigma(E)$ denote the set of eigenvalues of $E$. Then, $\sigma(p(-B)) = p(\sigma(-B))$. Since $A$ and $-B$ do not share any eigenvalue, $p(\sigma(-B))$ does not contain zero, and hence, $p(-B)$ is nonsingular. Thus, $X$ is the zero matrix.
Conversely, suppose that $A$ and $-B$ share an eigenvalue $\lambda$. Let $\mathbf{u}$ be a corresponding right eigenvector for $A$ and $\mathbf{v}^*$ be a corresponding left eigenvector for $-B$; that is, $A\mathbf{u}=\lambda\mathbf{u}$ and $\mathbf{v}^*(-B) = \lambda\mathbf{v}^*$. Let $X=\mathbf{u}\mathbf{v}^*$. Then, $X$ is not the zero matrix, and $AX+XB = A(\mathbf{u}\mathbf{v}^*) - (\mathbf{u}\mathbf{v}^*)(-B) = \lambda\mathbf{u}\mathbf{v}^* - \lambda\mathbf{u}\mathbf{v}^* = 0$. Therefore, $X$ is a nontrivial solution to the homogeneous system $AX+XB=0$.
My Questions
I could not understand certain steps in the above proof.
For the "if" direction (the second paragraph in the proof):
- After it proved that $A^kX=X(-B)^k$, it claimed that $p(A)X=Xp(-B)$ for any polynomial. Why is that?
- After it defined $p$ to be the characteristic polynomial, it said that $p(A)=0$. Is this because one might just plug in $A$ to $p(r)$ to get $p(A) = \det(A-AI) = \det(0) = 0$?
- I have difficulty understanding the notation $p(\sigma(-B))$. While $\sigma(-B)$ is the set of eigenvalues of $-B$, what is $p(\sigma(-B))$? Is it the image of the set of eigenvalues of $-B$ under the polynomial $p$? If so, does $\sigma(p(-B)) = p(\sigma(-B))$ mean that the two sets are equal?
- Why does the fact that $A$ and $-B$ do not share any eigenvalue imply that $p(\sigma(-B))$ does not contain zero, and how does that imply that $p(-B)$ is nonsingular?
- Why does the nonsingularity of $p(-B)$ imply that $X$ is the zero matrix?
For the "only if" part (the third paragraph in the above proof):
- I am not familiar with the notation $\mathbf{v}^*$. Is it the conjugate transpose of $\mathbf{v}$? If so, why can $\mathbf{u}\mathbf{v}^*$ not be the zero matrix?
I apologize for my lousy beginner linear algebra background. I would really appreciate it if someone could help me with these questions!
Short version: $X$ distributes over the additions in $p(A)$; in each term, you swap $A^k X$ for $X(-B)^k$, then undistribute the $X$.
In detail: Let $N \in \Bbb{N}$ and $p(A) = \sum_{i=0}^N c_i A^i$. Then \begin{align*} p(A)X &= \sum_{i=0}^N c_i A^i X \\ &= \sum_{i=0}^N c_i X (-B)^i \\ &= X \sum_{i=0}^N c_i (-B)^i \\ &= X p(-B) \text{.} \end{align*}
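If you want to see this identity in action, here is a small NumPy check; the matrices are my own illustrative choices, picked so that $A$ and the matrix playing the role of $-B$ share the eigenvalue $2$, which makes a nonzero $X$ with $AX = X(-B)$ easy to write down:

```python
import numpy as np

# Illustrative matrices (my own choice): A and -B share the eigenvalue 2.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])            # eigenvalues 2 and 3
minus_B = np.array([[2.0, 0.0],
                    [5.0, 7.0]])      # eigenvalues 2 and 7

u = np.array([[1.0], [0.0]])          # right eigenvector: A u = 2 u
w = np.array([[1.0, 0.0]])            # left eigenvector: w (-B) = 2 w
X = u @ w                             # then A X = X (-B)
assert np.allclose(A @ X, X @ minus_B)

def poly_at(coeffs, M):
    """Evaluate p(M) = sum_i coeffs[i] * M**i for a square matrix M
    (coefficients in ascending order of degree)."""
    result = np.zeros_like(M)
    power = np.eye(M.shape[0])
    for c in coeffs:
        result = result + c * power
        power = power @ M
    return result

coeffs = [3.0, -1.0, 4.0, 2.0]        # an arbitrary polynomial p
lhs = poly_at(coeffs, A) @ X
rhs = X @ poly_at(coeffs, minus_B)
print(np.allclose(lhs, rhs))          # True: p(A)X = Xp(-B)
```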
Not quite: $\det(A - AI) = \det(0) = 0$ is an equation about a scalar, not about the square matrix $p(A)$ obtained by substituting $A$ into the polynomial $p$. That $p(A)$ is the zero matrix is precisely the Cayley-Hamilton theorem.
$p(\sigma(-B))$ is the set of values $p(e_1)$, $p(e_2)$, ..., $p(e_N)$ where $-B$ has $N$ eigenvalues (with or without repetition; the set is the same either way) named $e_1$, $e_2$, ..., $e_N$. This uses fairly common notation: for $f$ a function and $S$ a set, $f(S) = \{f(s) \mid s \in S\}$. And yes, $\sigma(p(-B)) = p(\sigma(-B))$ means the two sets are equal: "the set of eigenvalues of the matrix $p(-B)$ is the set of values obtained by evaluating $p$ at the eigenvalues of $-B$". (This is the spectral mapping theorem for polynomials.)
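The set equality itself is easy to test numerically; here is a sketch with a matrix and polynomial of my own choosing:

```python
import numpy as np

# A triangular matrix standing in for -B; its eigenvalues sit on the diagonal.
M = np.array([[2.0, 1.0, 0.0],
              [0.0, 5.0, 3.0],
              [0.0, 0.0, -1.0]])     # eigenvalues 2, 5, -1

def p(x):
    return x**2 - 3*x + 1            # an arbitrary polynomial

pM = M @ M - 3*M + np.eye(3)         # p evaluated at the matrix

# Compare sigma(p(M)) with p(sigma(M)) as sorted lists of values.
eig_of_pM = np.sort(np.linalg.eigvals(pM).real)
p_of_eig = np.sort(p(np.linalg.eigvals(M).real))
print(np.allclose(eig_of_pM, p_of_eig))   # True
```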
The roots of the characteristic polynomial of $A$ are exactly its eigenvalues. So, having taken $p$ to be the characteristic polynomial of $A$, $p$ is zero on each of $A$'s eigenvalues. (Additional theory you should know: (1) the characteristic polynomial need not be the minimal (degree) polynomial, and (2) if it is not, its only "additional" roots are repetitions of the minimal polynomial's roots. Consequently, $p$ is zero exactly on the eigenvalues of $A$ and nonzero elsewhere.) Since $A$ and $-B$ do not share eigenvalues, $p$ cannot be zero on any of $-B$'s eigenvalues. That is, $0 \not\in p(\sigma(-B))$, so $0 \not\in \sigma(p(-B))$ (by the set equality from the previous question), and hence $p(-B)$ is nonsingular. (A matrix is singular if and only if $0$ is one of its eigenvalues.)
When we took $p$ to be the characteristic polynomial of $A$, the identity $p(A)X = Xp(-B)$ simplified to $0 = Xp(-B)$. Since $p(-B)$ is nonsingular, multiplying on the right by $p(-B)^{-1}$ gives $X = 0$. (In row terms: each row of $X$ times $p(-B)$ is the zero vector, and a nonsingular matrix has a trivial left nullspace, so every row of $X$ must be the zero vector and $X$ is the zero matrix.)
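Here is a numeric sketch of those last steps, again with illustrative matrices of my own choosing whose spectra are disjoint; `np.poly` recovers the characteristic polynomial's coefficients:

```python
import numpy as np

# Disjoint spectra: A has eigenvalues 1, 3; -B has eigenvalues 4, 6.
A = np.array([[1.0, 2.0],
              [0.0, 3.0]])
minus_B = np.array([[4.0, 0.0],
                    [1.0, 6.0]])

coeffs = np.poly(A)                  # characteristic polynomial, highest degree first

def poly_at(coeffs, M):
    """Evaluate the polynomial at a square matrix by Horner's rule
    (coefficients in descending order of degree, as np.poly returns)."""
    result = np.zeros_like(M)
    for c in coeffs:
        result = result @ M + c * np.eye(M.shape[0])
    return result

p_A = poly_at(coeffs, A)
p_minus_B = poly_at(coeffs, minus_B)

print(np.allclose(p_A, 0))                   # True: Cayley-Hamilton, p(A) = 0
print(abs(np.linalg.det(p_minus_B)) > 1e-9)  # True: p(-B) is nonsingular,
                                             # so X p(-B) = 0 forces X = 0
```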
(I probably would have numbered this "6", since you were numbering your question blobs.) Yes, $\mathbf{v}^*$ standardly denotes the conjugate transpose of $\mathbf{v}$, although reading through how it is used here, the plain transpose would also work. (And it must at least be the transpose for $X = \mathbf{u}\mathbf{v}^*$ to have the correct dimensions.)
The zero vector is never an eigenvector, so neither $\mathbf{u}$ nor $\mathbf{v}$ is the zero vector. The $(i,j)$ entry of the outer product $\mathbf{u}\mathbf{v}^*$ is $u_i \overline{v_j}$, so choosing nonzero coordinates $u_i$ and $v_j$ exhibits a nonzero entry; hence $\mathbf{u}\mathbf{v}^*$ is not the $0$ matrix.
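Finally, the whole "only if" construction can be checked in a few lines (shared eigenvalue $2$, matrices of my own choosing):

```python
import numpy as np

# A and -B share the eigenvalue 2.
A = np.array([[2.0, 1.0],
              [0.0, 5.0]])            # eigenvalues 2, 5
minus_B = np.array([[2.0, 0.0],
                    [3.0, 4.0]])      # eigenvalues 2, 4
B = -minus_B

u = np.array([[1.0], [0.0]])          # right eigenvector: A u = 2 u
v_star = np.array([[1.0, 0.0]])       # left eigenvector: v* (-B) = 2 v*
X = u @ v_star                        # a nonzero 2x2 matrix

print(np.allclose(A @ X + X @ B, 0))  # True: X solves the homogeneous system
print(np.any(X != 0))                 # True: X is nontrivial
```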