Prove that there is an $n \times n$ matrix $B$ such that $A=B^{2}+B .$


Given a positive integer $n,$ prove that there is $\varepsilon>0$ such that for every $n \times n$ matrix $A$ with $|A|<\varepsilon$ (Hilbert-Schmidt norm), there is an $n \times n$ matrix $B$ such that $A=B^{2}+B .$ Hint: Differentiate $f(B)=B^{2}+B$

I am not getting any clue here. As given in the hint, I can take $f(B)=B^2+B$. I also think I have to use the implicit function theorem by taking $g(A, B)=A-B^2-B$, but I am confused about how to apply the implicit function theorem or inverse function theorem to matrices.

Best Answer

The inverse and implicit function theorems are equivalent, so you can use either one to solve your question; it's a matter of taste (though sometimes one or the other may be simpler for the particular problem at hand). In this case, they require about the same amount of work.

Before we can even talk about how to use these theorems, let's recall what exactly they say. As you will see, these theorems are valid in a very general context, and I think seeing them in this generality (and working through some examples carefully) is the clearest approach, so here goes:

Inverse Function Theorem on Banach Spaces

Let $r\geq 1$ be an integer, $E,F$ be Banach spaces, $U\subset E$ open, and $f:U\to F$ a $C^r$ mapping. If $p\in U$ is some point such that $Df_p:E\to F$ is a linear isomorphism then there exist open neighbourhoods $\Omega,\Omega'$ with $p\in \Omega\subset U$ and $f(p)\in \Omega'\subset F$ such that the restriction $f:\Omega\to \Omega'$ is a $C^r$ diffeomorphism (i.e has a $C^r$ inverse).

Next, we state the implicit function theorem:

Implicit Function Theorem on Banach Spaces

Let $r\geq 1$ be an integer, $E,F, G$ be Banach spaces, $U\subset E\times F$ open, and $\phi:U\to G$ a $C^r$ mapping. If $(a,b)\in U$ is some point such that the partial derivative with respect to second variable, $\partial_2\phi_{(a,b)}:F\to G$, is a linear isomorphism then there is an open neighbourhood $\Omega_1\times \Omega_2$ of $(a,\phi(a,b))$ in $E\times G$ and a $C^r$ function $g:\Omega_1\times \Omega_2\to F$ such that for all $(x,w)\in \Omega_1\times \Omega_2$, we have $(x,g(x,w))\in U=\text{domain}(\phi)$ and $\phi(x,g(x,w))=w$.

These may sound abstract, but the point is this: the inverse function theorem tells us that if the derivative at a point is invertible (is an isomorphism), then locally the function is also invertible. The implicit function theorem (in essence) tells us that if the derivative with respect to some of the variables is invertible, then we can actually solve for those variables as functions of the remaining ones; explicitly, we can locally solve the equation $\phi(x,y)=w$ for $y$ as a smoothly behaving function of $x$ and $w$ (in most applications we take $w=0$).

The best example to keep in mind for the inverse function theorem is saying that you can solve the equation $y=Ax$ as $x=A^{-1}y$ (if $x,y\in\Bbb{R}^n$ and $A\in M_{n\times n}(\Bbb{R})$ is invertible). For the implicit function theorem, the canonical example to remember is again a linear one: the equation $w=Ax+By$ can be solved for $y$ as $y= B^{-1}(w-Ax)$ (if $w,y\in\Bbb{R}^n, x\in \Bbb{R}^m$, $A\in M_{n\times m}(\Bbb{R})$ and $B\in M_{n\times n}(\Bbb{R})$ is invertible). These two examples are the linear algebraic versions of these calculus theorems, and I found it greatly beneficial to understand this analogy.
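As a quick numerical sanity check of this linear model (a sketch of mine, not part of the original answer; all matrices and vectors below are arbitrary illustrative choices), solving $w=Ax+By$ for $y$ looks like this:

```python
import numpy as np

# Linear model of the implicit function theorem:
# solve w = A x + B y for y as y = B^{-1}(w - A x).
# All matrices/vectors here are arbitrary illustrative choices.
rng = np.random.default_rng(0)
n, m = 3, 2
A = rng.standard_normal((n, m))                     # n x m, need not be invertible
B = np.eye(n) + 0.1 * rng.standard_normal((n, n))   # close to I, hence invertible
x = rng.standard_normal(m)
w = rng.standard_normal(n)

y = np.linalg.solve(B, w - A @ x)                   # y = B^{-1}(w - A x)

assert np.allclose(A @ x + B @ y, w)                # plugging y back in recovers w
```

Here only the invertibility of $B$ matters; $A$ is not even square.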

You may have mostly encountered these theorems in the context of functions $\Bbb{R}^n\to \Bbb{R}^n$ or $\Bbb{R}^n\times \Bbb{R}^m\to \Bbb{R}^m$, but that formulation is not the most convenient one for your question. Also, before getting to the solution, one final comment: it doesn't really matter which norm you use on the space of matrices; it could be the Hilbert-Schmidt norm, the max norm, the operator norm, etc. This is because of the nice theorem that all norms on a finite-dimensional vector space are equivalent (i.e. they all induce the same topology).
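For instance (an illustration of mine, not part of the answer), the Hilbert-Schmidt and operator norms on $M_{n\times n}(\Bbb{R})$ satisfy $\|A\|_{op}\leq \|A\|_{HS}\leq \sqrt{n}\,\|A\|_{op}$ for every $A$, which can be checked numerically:

```python
import numpy as np

# Norm equivalence on M_n(R): for every A,
#   opnorm(A) <= hsnorm(A) <= sqrt(n) * opnorm(A),
# so the two norms induce the same topology.
rng = np.random.default_rng(3)
n = 5
for _ in range(100):
    A = rng.standard_normal((n, n))
    hs = np.linalg.norm(A, 'fro')   # Hilbert-Schmidt (Frobenius) norm
    op = np.linalg.norm(A, 2)       # operator (spectral) norm
    assert op <= hs + 1e-12
    assert hs <= np.sqrt(n) * op + 1e-12
```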


Solution Using Inverse Function Theorem.

Here, we take $U=E = F = M_{n\times n}(\Bbb{R})$ and consider $f(B):= B^2+B$. Then $f(0)=0$; moreover, $f$ is a polynomial in the entries of $B$, hence $C^{\infty}$ (indeed, its third derivative vanishes identically). The first derivative is given, for all $C\in E=M_{n\times n}(\Bbb{R})$, by \begin{align} Df_B(C)&= C B + BC + C. \end{align} This is a consequence of the product rule. The "quick way" of doing this calculation is to write $f(B)=B^2+B=B\cdot B +B$, so $df=dB\cdot B + B \cdot dB + dB$; now interpret this as a pointwise equality of linear transformations (i.e. regard $dB$ as the linear mapping $C\mapsto C$), in which case it says exactly what is written above in more explicit notation. Since $f(0)=0$, it is a good idea to consider $B=0$; doing so tells us $Df_0(C)=C$, i.e. $Df_0 = \text{id}_{M_{n\times n}(\Bbb{R})}$, which is clearly an isomorphism.
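The derivative formula can be sanity-checked with a finite difference (my own sketch; $B$ and $C$ below are arbitrary test matrices):

```python
import numpy as np

# Finite-difference check of Df_B(C) = CB + BC + C for f(B) = B^2 + B.
rng = np.random.default_rng(1)
n = 4
B = rng.standard_normal((n, n))
C = rng.standard_normal((n, n))

def f(X):
    return X @ X + X

h = 1e-6
numerical = (f(B + h * C) - f(B)) / h   # directional derivative at B along C
analytic = C @ B + B @ C + C            # the formula derived above

# The difference is exactly h * C^2, so it vanishes as h -> 0.
assert np.allclose(numerical, analytic, atol=1e-4)
```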

So, the inverse function theorem now tells us that there exist open neighbourhoods $\Omega$ of $0\in U= E = M_{n\times n}(\Bbb{R})$ and $\Omega'$ of $f(0)=0$ in $F=M_{n\times n}(\Bbb{R})$ such that the restriction $f:\Omega\to \Omega'$ is a $C^{\infty}$ diffeomorphism. Now, for any $A\in \Omega'$, define $B:= f^{-1}(A)$. Then $A=f(f^{-1}(A))=f(B)=B^2+B$; i.e. we have shown that for every $A\in \Omega'$, there exists a (unique) $B\in \Omega$ such that $A=B^2+B$, which completes the proof.
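To see the local inverse concretely, one can compute $B=f^{-1}(A)$ for a small $A$ by iterating $B_{k+1}=A-B_k^2$, the simplified Newton (Picard) iteration with $Df_0=\text{id}$ that underlies the usual contraction-mapping proof of the theorem (a sketch of mine; the size of $A$ and the iteration count are illustrative choices):

```python
import numpy as np

# Constructing B with B^2 + B = A for a small A via the fixed-point iteration
# B_{k+1} = A - B_k^2; for small A this map is a contraction near 0.
rng = np.random.default_rng(2)
n = 3
A = 0.01 * rng.standard_normal((n, n))   # small A, as in the hypothesis |A| < eps

B = np.zeros((n, n))                     # start at 0, the base point
for _ in range(100):
    B = A - B @ B

assert np.allclose(B @ B + B, A)         # B solves B^2 + B = A
```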

(And if you really want to connect things back to the $\epsilon$: since $\Omega'$ is an open neighbourhood of $0$ in $M_{n\times n}(\Bbb{R})$ and all norms are equivalent, there exists $\epsilon>0$ such that the ball $|A|<\epsilon$ is contained in $\Omega'$.)


Solution Using Implicit Function Theorem.

Here, $E=F=G=M_{n\times n}(\Bbb{R})$ and $U=E\times F$. We consider $\phi(A,B):= B^2+B-A$. Again, this is a polynomial function of its arguments, so it is $C^{\infty}$ (the third derivative vanishes identically). A similar calculation shows that the partial derivative with respect to the second variable (i.e. with respect to $B$) is the linear mapping $\partial_2\phi_{(A,B)}:F\to G$ (i.e. $M_{n\times n}(\Bbb{R})\to M_{n\times n}(\Bbb{R})$) such that for all $C\in F=M_{n\times n}(\Bbb{R})$, \begin{align} \partial_2\phi_{(A,B)}(C) &= (CB + BC) + C - 0 \end{align} (the $-A$ term contributes $0$ since it does not depend on $B$). Since $\phi(0,0)=0$, it is a good idea to take $(a,b)=(0,0)$, in which case we note that $\partial_2\phi_{(0,0)} = \text{id}_{M_{n\times n}(\Bbb{R})}$ is an isomorphism. Now the implicit function theorem tells us that there is an open neighbourhood $\Omega_1 \times \Omega_2$ of $(a,\phi(a,b)) = (0,0)$ in $E\times G = M_{n\times n}(\Bbb{R})\times M_{n\times n}(\Bbb{R})$ and a $C^{\infty}$ function $g:\Omega_1\times \Omega_2\to F=M_{n\times n}(\Bbb{R})$ such that for all $(A,w)\in \Omega_1\times \Omega_2$, \begin{align} \phi(A,g(A,w)) &= w. \end{align} Now plug in $w=0$ and see what the equation says: $[g(A,0)]^2 + g(A,0) - A = 0$; i.e. if you take $B=g(A,0)$, then $A=B^2+B$, so for any $A\in \Omega_1$ we have found a corresponding $B$ (namely $g(A,0)$) which works.
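The partial derivative $\partial_2\phi$ can also be realized as an explicit $n^2\times n^2$ matrix, which lets one run a genuine Newton iteration solving $\phi(A,B)=0$ for $B$. This is only my numerical illustration of the role of $\partial_2\phi$, not part of the proof; it uses the (standard) Kronecker-product identity $\text{vec}(CB+BC+C) = (I\otimes B^T + B\otimes I + I)\,\text{vec}(C)$, where $\text{vec}$ stacks rows:

```python
import numpy as np

# Newton's method for phi(A, B) = B^2 + B - A = 0 in the variable B, with the
# partial derivative d2_phi(C) = CB + BC + C encoded as an n^2 x n^2 matrix.
# With row-major vec (stacking rows): vec(CB) = (I kron B^T) vec(C) and
# vec(BC) = (B kron I) vec(C). Sizes here are illustrative.
rng = np.random.default_rng(4)
n = 3
A = 0.01 * rng.standard_normal((n, n))
I = np.eye(n)

B = np.zeros((n, n))                                      # base point b = 0
for _ in range(20):
    J = np.kron(I, B.T) + np.kron(B, I) + np.eye(n * n)   # matrix of d2_phi at B
    residual = (B @ B + B - A).reshape(-1)                # vec(phi(A, B))
    B = B - np.linalg.solve(J, residual).reshape(n, n)    # Newton step

assert np.allclose(B @ B + B, A)                          # B = g(A, 0)
```

At $B=0$ the matrix $J$ is the identity, matching $\partial_2\phi_{(0,0)}=\text{id}$ above.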