Showing that $\mathbf{X}^{2} + \mathbf{X} = \mathbf{A}$ has a solution


Show that there exists some $\epsilon >0$ s.t. for all $\mathbf{A}\in \mathbb{R}^{2\times 2} $ with $|( \mathbf{A})_{i, j}| < \epsilon $ for all $i, j$ (let the space of all such matrices be $E$) the equation $$ \begin{align*} \mathbf{X}^{2} + \mathbf{X} = \mathbf{A} \end{align*} $$ has some solution $\mathbf{X}\in \mathbb{R}^{2\times 2} $.

My approach would be to use the Banach fixed point theorem: for a fixed $\mathbf{A}\in E$, consider the mapping $$ \begin{align*} \Phi \colon E \to \mathbb{R}^{2\times 2}, \quad \Phi ( \mathbf{X}) = \mathbf{A} - \mathbf{X}^{2} .\end{align*} $$ First we need to find an $\epsilon > 0$ s.t. $\Phi [ E]\subseteq E$. Since $$ \begin{align*} \begin{bmatrix} a & b \\ c & d \end{bmatrix} ^{2} = \begin{bmatrix} a^{2} + bc & ab + bd \\ ac + cd & bc + d ^{2} \end{bmatrix} \implies \forall i, j \in \{ 1, 2\}\colon \left| ( \mathbf{X}^{2})_{i, j} \right| < 2 \epsilon^{2}, \end{align*} $$ choosing any $\epsilon < 1 / 2$ will ensure that $\Phi ( \mathbf{X}) \in E$ for $\mathbf{X} \in E$.

Next we have to determine when $\Phi $ becomes a contraction. One has $$ \begin{align*} \left\| \Phi ( \mathbf{X}) - \Phi ( \mathbf{Y})\right\|_{\infty } &= \left\| \mathbf{X}^{2} - \mathbf{Y}^{2}\right\|_{\infty} \\ &= \max_{i \in \{ 1, 2\}} \sum_{j = 1}^{2} \left| ( \mathbf{X}^{2})_{i, j}- ( \mathbf{Y}^{2})_{i, j} \right| \\ &\leqslant \max_{i \in \{ 1, 2\}} \sum_{j = 1}^{2}\!\left( \left| ( \mathbf{X}^{2})_{i, j}\right|+ \left| ( \mathbf{Y}^{2})_{i, j} \right| \right) \\ & < 2 \!\left( 2 \epsilon ^{2} + 2 \epsilon ^{2}\right) = 8 \epsilon ^{2} \end{align*} $$ and $\left\| \mathbf{X} - \mathbf{Y}\right\|_{\infty} < 4\epsilon $, meaning we simply have to choose some $\epsilon $ satisfying $$ \begin{align*} 2 \epsilon ^{2} < \epsilon \iff \epsilon < \frac{1}{2} .\end{align*} $$

First of all, is my approach correct? Secondly, are there more elegant solutions? (This question arose in an analysis course, so I didn't consider any elegant linear-algebra solutions.)
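For what it's worth, the iteration behind this map can be test-driven numerically. Below is a minimal pure-Python sketch (nested lists, ad hoc helper names) that runs $\mathbf{X} \mapsto \mathbf{A} - \mathbf{X}^{2}$ from the zero matrix and checks the residual of $\mathbf{X}^{2} + \mathbf{X} - \mathbf{A}$; this is only a sanity check for one sample matrix, not a proof of convergence.

```python
def mat_mul(P, Q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_add(P, Q):
    """Entrywise sum of two 2x2 matrices."""
    return [[P[i][j] + Q[i][j] for j in range(2)] for i in range(2)]

def mat_sub(P, Q):
    """Entrywise difference of two 2x2 matrices."""
    return [[P[i][j] - Q[i][j] for j in range(2)] for i in range(2)]

def solve_by_iteration(A, steps=100):
    """Iterate Phi(X) = A - X^2 starting from the zero matrix."""
    X = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(steps):
        X = mat_sub(A, mat_mul(X, X))
    return X

A = [[0.05, -0.02], [0.01, 0.03]]          # entries well inside (-1/2, 1/2)
X = solve_by_iteration(A)
R = mat_sub(mat_add(mat_mul(X, X), X), A)  # residual of X^2 + X - A
```

For entries this small the iteration settles to machine precision within a few dozen steps.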

There are 3 answers below.

Accepted answer:

Your approach is not correct, for the simple reason that you haven't shown that $\Phi$ is a contraction. You have provided upper bounds for both $\|X - Y\|_\infty$ and $\|\Phi(X) - \Phi(Y)\|_\infty$ in terms of $\varepsilon$, but you have not related them to each other. Using the triangle inequality to turn a difference into a sum is a big red flag in this regard!

As for an alternate, more linear algebra-flavoured approach, we can use the following results:

  • Suppose $\| \cdot \|$ is a submultiplicative matrix norm and $\|B - I\| < 1$. Then $B$ is invertible.
  • If $B$ is invertible, then $B$ has a complex square root.

The former is a classical result; the inverse of $B$ can be expressed as a geometric (Neumann) series of matrices: $$B^{-1} = \sum_{n=0}^\infty (I - B)^n.$$ It can also be proven via eigenvalues: because $\|B - I\| < 1$, every eigenvalue of $B - I$ has modulus less than $1$. In particular $-1$ is not an eigenvalue of $B - I$, i.e. $0$ is not an eigenvalue of $B = (B - I) + I$, so $B$ is invertible!
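A quick numerical sanity check of the geometric series, as a pure-Python sketch with ad hoc helpers (the truncation length is arbitrary but generous for this example):

```python
def mat_mul(P, Q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_add(P, Q):
    """Entrywise sum of two 2x2 matrices."""
    return [[P[i][j] + Q[i][j] for j in range(2)] for i in range(2)]

def neumann_inverse(B, terms=80):
    """Approximate B^{-1} by the truncated series sum_n (I - B)^n,
    which converges when ||B - I|| < 1."""
    I = [[1.0, 0.0], [0.0, 1.0]]
    M = [[I[i][j] - B[i][j] for j in range(2)] for i in range(2)]  # I - B
    total = [[0.0, 0.0], [0.0, 0.0]]
    power = I                       # current term M^n, starting at M^0
    for _ in range(terms):
        total = mat_add(total, power)
        power = mat_mul(power, M)
    return total

B = [[1.1, 0.2], [-0.1, 0.9]]       # ||B - I||_infty = 0.3 < 1
B_inv = neumann_inverse(B)
check = mat_mul(B, B_inv)           # should be close to the identity
```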

The latter can be proven with Jordan normal forms. Each Jordan block takes the form $\lambda I + N$, where $\lambda$ is the eigenvalue of the Jordan block (non-zero, since $B$ is invertible) and $N$ is a nilpotent matrix (i.e. the $1$s along the superdiagonal). One can then substitute $\frac{1}{\lambda}N$ into the Taylor series for $\sqrt{1 + x}$ (note: as $N$ is nilpotent, only finitely many terms are non-zero) to get a square root of $\frac{1}{\lambda}(\lambda I + N)$. Multiply this square root by $\sqrt{\lambda}$, and you get a square root of the Jordan block. Assembling these square roots in block-diagonal form gives a square root of the JNF of the matrix; conjugating back by the change-of-basis matrix then gives a square root of the original matrix.
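For a $2\times 2$ Jordan block the series truncates after the linear term, since $N^2 = 0$, giving $\sqrt{\lambda}\,\bigl(I + \frac{1}{2\lambda}N\bigr)$. A small sketch checking this (the eigenvalue is arbitrary):

```python
import math

lam = 2.0
J = [[lam, 1.0], [0.0, lam]]            # Jordan block: lam*I + N, N = [[0,1],[0,0]]
s = math.sqrt(lam)
R = [[s, s / (2.0 * lam)], [0.0, s]]    # sqrt(lam) * (I + N/(2*lam))

# Multiply R by itself; the result should reproduce J exactly
RR = [[sum(R[i][k] * R[k][j] for k in range(2)) for j in range(2)]
      for i in range(2)]
```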

So, using these facts, we can show that if $\varepsilon < \frac{1}{8}$, then $X^2 + X = A$ has a solution. To see this, complete the square: $$\left(X + \frac{1}{2}I\right)^2 = A + \frac{1}{4}I = \frac{1}{4}(4A + I).$$ If $|(A)_{ij}| < \varepsilon$, then $\|A\|_\infty < 2\varepsilon < \frac{1}{4}$. Thus, $\|4A\|_\infty < 1$, so by the first result, $4A + I$ must be invertible. By the second result, $4A + I$, and hence $A + \frac{1}{4}I$, must have a square root. (In fact, since $\|4A\|_\infty < 1$, the binomial series $\sum_{n=0}^\infty \binom{1/2}{n}(4A)^n$ converges to a real square root of $4A + I$, so the solution below is real, as the problem requires.) Let $B$ be a square root of $A + \frac{1}{4}I$. This gives us a solution: $$X = B - \frac{1}{2} I.$$
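This construction can be checked numerically. For $2\times 2$ matrices a square root can even be written in closed form via Cayley–Hamilton: if $s = \sqrt{\det M}$ and $t = \sqrt{\operatorname{tr} M + 2s} \neq 0$, then $(M + sI)/t$ squares to $M$. A pure-Python sketch (helper names are ad hoc, the sample $A$ is arbitrary with entries below $1/8$):

```python
import math

def mat_mul(P, Q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sqrt_2x2(M):
    """Square root of a 2x2 matrix via Cayley-Hamilton: with
    s = sqrt(det M) and t = sqrt(tr M + 2s), the matrix (M + s*I)/t
    squares to M.  Valid here since det M > 0 and tr M + 2s > 0."""
    s = math.sqrt(M[0][0] * M[1][1] - M[0][1] * M[1][0])
    t = math.sqrt(M[0][0] + M[1][1] + 2.0 * s)
    return [[(M[i][j] + (s if i == j else 0.0)) / t for j in range(2)]
            for i in range(2)]

A = [[0.05, 0.02], [-0.01, 0.03]]      # entries below 1/8
M = [[A[i][j] + (0.25 if i == j else 0.0) for j in range(2)]
     for i in range(2)]                # M = A + (1/4) I
B = sqrt_2x2(M)
X = [[B[i][j] - (0.5 if i == j else 0.0) for j in range(2)]
     for i in range(2)]                # X = B - (1/2) I
XX = mat_mul(X, X)
R = [[XX[i][j] + X[i][j] - A[i][j] for j in range(2)] for i in range(2)]
```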

Answer 2:

Lemma: There exists $\epsilon>0$ such that if $B$ is a $2\times 2$ matrix with $\|B-\frac{1}{4}I \|_\infty<\epsilon$, then the equation $Y^2=B$ has a solution $Y$.

Proof: Since the eigenvalues of a $2\times 2$ matrix $S$ are given by $$\lambda_{1,2}(S)=\frac{1}{2} \left(S_{11}+S_{22}\pm \sqrt{(S_{11}+S_{22})^2-4(S_{11}S_{22}-S_{12}S_{21})} \right ),$$ and $\frac{1}{4}I$ has positive eigenvalues, there exists a neighborhood of $\frac{1}{4}I$ such that every matrix in that neighborhood whose eigenvalues are real has positive eigenvalues as well. In other words, there exists $\epsilon>0$ such that for every matrix $B$ with $\|B-\frac{1}{4}I\|_\infty<\epsilon$, either $B$ has non-real eigenvalues or $B$ has real positive eigenvalues.

Next, by the (real) Jordan decomposition, there exists an invertible matrix $P$ such that $P^{-1}BP=\begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2\end{bmatrix}$ with $\lambda_1,\lambda_2>0$ (if $B$ is diagonalizable with positive eigenvalues), or $P^{-1}BP=\lambda I+N$ with $\lambda>0$ and $N=\begin{bmatrix} 0 & 1 \\ 0 & 0\end{bmatrix}$ (if $B$ has a repeated positive eigenvalue but is not diagonalizable), or $P^{-1}BP=t\begin{bmatrix} \cos \alpha & -\sin \alpha \\ \sin \alpha & \cos \alpha \end{bmatrix}$ for some $t>0$ and angle $\alpha$ (if $B$ has non-real eigenvalues). In each case there exists $Z$ with $Z^2=P^{-1}BP$. To be precise: in the first case, let $Z=\begin{bmatrix} \sqrt{\lambda_1} & 0 \\ 0 & \sqrt{\lambda_2}\end{bmatrix}$; in the second case, let $Z=\sqrt{\lambda}\left(I+\frac{1}{2\lambda}N\right)$, which works because $N^2=0$; in the third case, let $Z=\sqrt{t}\begin{bmatrix} \cos (\alpha/2) & -\sin (\alpha/2) \\ \sin (\alpha/2) & \cos (\alpha/2) \end{bmatrix}$. We let $Y=PZP^{-1}$ to get $Y^2=B$.

Using this lemma: if the entries of $A$ are small enough, then $B=A+\frac{1}{4}I$ satisfies $\|B-\frac{1}{4}I\|_\infty<\epsilon$, so there exists $Y$ with $Y^2=B$. Finally, let $X=Y-\frac{1}{2}I$; then $X^2+X=Y^2-Y+\frac{1}{4}I+Y-\frac{1}{2}I=B-\frac{1}{4}I=A$.
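A sketch of the lemma's positive-eigenvalue case in pure Python (the matrix $B$ and the eigenvector formula are illustrative; it assumes distinct real eigenvalues and $B_{12}\neq 0$):

```python
import math

def mat_mul(P, Q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv_2x2(P):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    d = P[0][0] * P[1][1] - P[0][1] * P[1][0]
    return [[P[1][1] / d, -P[0][1] / d], [-P[1][0] / d, P[0][0] / d]]

B = [[0.30, 0.04], [0.02, 0.26]]               # close to (1/4) I
tr = B[0][0] + B[1][1]
det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
disc = math.sqrt(tr * tr - 4.0 * det)          # positive for this B
l1, l2 = (tr + disc) / 2.0, (tr - disc) / 2.0  # eigenvalues, both > 0

# Columns of P are eigenvectors (B[0][1], l - B[0][0]) for l = l1, l2;
# this form is valid because B[0][1] != 0 here.
P = [[B[0][1], B[0][1]], [l1 - B[0][0], l2 - B[0][0]]]
D = [[math.sqrt(l1), 0.0], [0.0, math.sqrt(l2)]]
Y = mat_mul(mat_mul(P, D), inv_2x2(P))         # Y = P Z P^{-1}
YY = mat_mul(Y, Y)                             # should reproduce B
```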

Answer 3:

I have a short nonconstructive proof that works in $n$ dimensions.

Let $F:\mathcal{B}\times\mathcal{B}\to\mathcal{B}$ be defined by $F(A,X) = X^2+X-A$, where $\mathcal{B}$ is the Banach space of matrices in $\mathbb{R}^{n\times n}$ equipped with any matrix norm $\|\cdot\|$ (all of which are equivalent, as we are in finite dimensions), e.g. the induced $\infty$-norm.

Note that $F(0,0) = 0$, so $(0,0)$ is a solution of the equation. Moreover, this map is continuously Fréchet differentiable, and its Fréchet derivative at $(A,X)$ is given by

$$L(A,X)[H_1,H_2]=H_2X+XH_2+H_2-H_1,$$

where $H_1,H_2\in\mathbb{R}^{n\times n}$. The map $H_2\mapsto L(0,0)[0,H_2]=H_2$ is an isomorphism from $\mathcal{B}$ to itself, as it is nothing else but the identity map.

Therefore, by virtue of the implicit function theorem, there exist neighborhoods $U\subset\mathbb{R}^{n\times n}$ of $0$ and $V\subset\mathbb{R}^{n\times n}$ of $0$, and a Fréchet differentiable function $G : U\to V$, such that for all $(A,X)\in U\times V$ we have $F(A, G(A)) = 0$, and $F(A, X) = 0$ if and only if $X = G(A)$.

The conclusion follows: one can find a small enough $\epsilon>0$ such that

$$\{A\in\mathbb{R}^{n\times n}: |A_{ij}|<\epsilon\}\subset U.$$

In some sense, this result is expected since in a sufficiently small neighborhood of the origin, the equation behaves like a linear equation and so the local existence of solutions is guaranteed. This easily generalizes to more complex expressions of the form

$$F(A,X)=\sum_{i=1}^m\alpha_iX^i-A,$$

where $m\in\mathbb{N}$, $\alpha_i\in\mathbb{R}$, $i=1,\ldots,m$ and $\alpha_1\ne0$.
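As a numerical illustration of this local solvability, one can solve $\sum_{i=1}^m \alpha_i X^i = A$ for small $A$ by the naive iteration $X \leftarrow \frac{1}{\alpha_1}\bigl(A - \sum_{i\geq 2}\alpha_i X^i\bigr)$, which leans on exactly the fact that the linear term dominates near the origin. A pure-Python sketch with made-up coefficients (not a proof; the iteration is only guaranteed to behave for small data):

```python
def mat_mul(P, Q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def poly_eval(coeffs, X):
    """Evaluate sum_i coeffs[i-1] * X^i for a 2x2 matrix X."""
    result = [[0.0, 0.0], [0.0, 0.0]]
    power = X
    for a in coeffs:
        result = [[result[i][j] + a * power[i][j] for j in range(2)]
                  for i in range(2)]
        power = mat_mul(power, X)
    return result

def solve_poly(coeffs, A, steps=200):
    """Iterate X <- (A - sum_{i>=2} a_i X^i) / a_1; requires a_1 != 0."""
    a1 = coeffs[0]
    X = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(steps):
        higher = poly_eval([0.0] + coeffs[1:], X)   # terms of degree >= 2
        X = [[(A[i][j] - higher[i][j]) / a1 for j in range(2)]
             for i in range(2)]
    return X

coeffs = [1.0, 0.5, -0.3]        # alpha_1 X + alpha_2 X^2 + alpha_3 X^3
A = [[0.05, -0.02], [0.01, 0.04]]
X = solve_poly(coeffs, A)
res = poly_eval(coeffs, X)       # should be close to A
```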