(Why) Does $f(x,y)=\{x,G(x,y)\}$ having an inverse imply $D_{1}G(x,y)=0$?


My questions are:

If the $\mathscr{C}^{1}$ mapping $f:\mathbb{R}^{m+n}\rightarrow\mathbb{R}^{m+n}$ defined by $f(\mathbf{x},\mathbf{y})=\{\mathbf{x},G(\mathbf{x},\mathbf{y})\}$ has an inverse such that $f^{-1}(\mathbf{x},G(\mathbf{x},\mathbf{y}))=\{\mathbf{x},\mathbf{y}\}$, does that imply

$\mathit{D}_{1}G(\mathbf{x},\mathbf{y})=\mathit{D}_{1}G(\mathbf{x},\mathbf{y})^{-1}=\{0\}_{n\times m}$?

Assuming this to be the case, how can that be explained geometrically?

The reader can skip down to the example at the bottom to find my demonstration of this proposition for the case $f:\mathbb{R}^{1+1}\rightarrow\mathbb{R}^{1+1}$.

The following regards the proof of the implicit mapping theorem in C.H. Edwards, Jr.'s Advanced Calculus of Several Variables. I refer to pages 190 through 192.

``Let the mapping $G:\mathbb{R}^{m+n}\rightarrow\mathbb{R}^{n}$ be $\mathscr{C}^{1}$ in a neighborhood of the point $(\mathbf{a},\mathbf{b})$ where $G(\mathbf{a},\mathbf{b})=\mathbf{0}$. If the partial derivative matrix $\mathit{D}_{2}G(\mathbf{a},\mathbf{b})$ is non-singular then there exists a neighborhood $\mathit{U}$ of $\mathbf{a}$ in $\mathbb{R}^{m}$, a neighborhood $\mathit{W}$ of $(\mathbf{a},\mathbf{b})$ in $\mathbb{R}^{m+n}$, and a $\mathscr{C}^{1}$ mapping $h:\mathit{U}\rightarrow\mathbb{R}^{n}$, such that $\mathbf{y}=h(\mathbf{x})$ solves the equation $G(\mathbf{x},\mathbf{y})=\mathbf{0}$ in $\mathit{W}$.

In particular, the implicitly defined mapping $h$ is the limit of the sequence of successive approximations defined inductively by

$h_{0}(\mathbf{x})=\mathbf{b}$,

$h_{k+1}(\mathbf{x})=h_{k}(\mathbf{x})-\mathit{D}_{2}G(\mathbf{a},\mathbf{b})^{-1}G(\mathbf{x},h_{k}(\mathbf{x}))$''
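To make the successive approximations concrete, here is a minimal sketch for the scalar case $m=n=1$. The choice $G(x,y)=x^{2}+y^{2}-1$ with $(a,b)=(0,1)$ is my own illustration, not an example taken from Edwards, and the function name is hypothetical.

```python
import math

def G(x, y):
    # Illustrative choice (an assumption, not from Edwards): the unit circle.
    return x * x + y * y - 1.0

def successive_approximations(x, b=1.0, steps=50):
    # D_2 G(a, b) = 2b is evaluated once at (a, b) = (0, 1) and then kept fixed.
    d2G_ab = 2.0 * b
    h = b  # h_0(x) = b
    for _ in range(steps):
        # h_{k+1}(x) = h_k(x) - D_2 G(a, b)^{-1} G(x, h_k(x))
        h = h - G(x, h) / d2G_ab
    return h

x = 0.3
approx = successive_approximations(x)
exact = math.sqrt(1.0 - x * x)  # the explicit solution y = h(x) near (0, 1)
print(approx, exact)
```

For $x$ near $a=0$ the iterates should settle on $\sqrt{1-x^{2}}$. Note that only $\mathit{D}_{2}G(\mathbf{a},\mathbf{b})$ enters the update; $\mathit{D}_{1}G$ never appears.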

In the proof the mapping $f(\mathbf{x},\mathbf{y})=\{\mathbf{x},G(\mathbf{x},\mathbf{y})\}$ is introduced. Its derivative at $(\mathbf{a},\mathbf{b})$ is

$f^{\prime}(\mathbf{a},\mathbf{b})=\left[\begin{array}{cc} \mathbf{I}_{m\times m} & \{\mathbf{0}\}_{m\times n}\\ \mathit{D}_{1}G(\mathbf{a},\mathbf{b}) & \mathit{D}_{2}G(\mathbf{a},\mathbf{b}) \end{array}\right]$.

After considerable demonstration, Edwards arrives at the inductive hypothesis

$\begin{bmatrix}\mathbf{x}\\ h_{k+1}(\mathbf{x}) \end{bmatrix}=\begin{bmatrix}\mathbf{x}\\ h_{k}(\mathbf{x}) \end{bmatrix}-\left[\begin{array}{cc} \mathbf{I}_{m\times m} & \{\mathbf{0}\}_{m\times n}\\ \mathit{D}_{1}G(\mathbf{a},\mathbf{b}) & \mathit{D}_{2}G(\mathbf{a},\mathbf{b}) \end{array}\right]^{-1}\begin{bmatrix}\mathbf{0}\\ G(\mathbf{x},h_{k}(\mathbf{x})) \end{bmatrix}$.

Then

``Taking the second components of this equation, we obtain

$h_{k+1}(\mathbf{x})=h_{k}(\mathbf{x})-\mathit{D}_{2}G(\mathbf{a},\mathbf{b})^{-1}G(\mathbf{x},h_{k}(\mathbf{x}))$

as desired.''

At which point I should be very happy to be that much closer to returning to physics. I observe, however, that writing out the steps indicated goes as follows:

Invert the matrix

$\left[\begin{array}{cc} \mathbf{I}_{m\times m} & \{\mathbf{0}\}_{m\times n}\\ \mathit{D}_{1}G(\mathbf{a},\mathbf{b}) & \mathit{D}_{2}G(\mathbf{a},\mathbf{b}) \end{array}\right]^{-1}=\left[\begin{array}{cc} \mathbf{I}_{m\times m} & \mathit{D}_{1}G(\mathbf{a},\mathbf{b})^{-1}\\ \{\mathbf{0}\}_{n\times m} & \mathit{D}_{2}G(\mathbf{a},\mathbf{b})^{-1} \end{array}\right]$.

Perform the indicated multiplication in the second term of the right-hand side

$\left[\begin{array}{cc} \mathbf{I}_{m\times m} & \mathit{D}_{1}G(\mathbf{a},\mathbf{b})^{-1}\\ \{\mathbf{0}\}_{n\times m} & \mathit{D}_{2}G(\mathbf{a},\mathbf{b})^{-1} \end{array}\right]\begin{bmatrix}\mathbf{0}\\ G(\mathbf{x},h_{k}(\mathbf{x})) \end{bmatrix}=\begin{bmatrix}\mathit{D}_{1}G(\mathbf{a},\mathbf{b})^{-1}G(\mathbf{x},h_{k}(\mathbf{x}))\\ \mathit{D}_{2}G(\mathbf{a},\mathbf{b})^{-1}G(\mathbf{x},h_{k}(\mathbf{x})) \end{bmatrix}$.

Write the equation in the resulting form

$\begin{bmatrix}\mathbf{x}\\ h_{k+1}(\mathbf{x}) \end{bmatrix}=\begin{bmatrix}\mathbf{x}-\mathit{D}_{1}G(\mathbf{a},\mathbf{b})^{-1}G(\mathbf{x},h_{k}(\mathbf{x}))\\ h_{k}(\mathbf{x})-\mathit{D}_{2}G(\mathbf{a},\mathbf{b})^{-1}G(\mathbf{x},h_{k}(\mathbf{x})) \end{bmatrix}.$

But that can only be true if

$\mathit{D}_{1}G(\mathbf{a},\mathbf{b})^{-1}G(\mathbf{x},h_{k}(\mathbf{x}))=\mathbf{0}$.

Clearly $G(\mathbf{x},h_{k}(\mathbf{x}))\ne\mathbf{0}$ in general.

The following example can be extended to multi-variable/multi-value equations.

$\mathbf{v}=\{x,y\}$, $G[\mathbf{v}]:\mathbb{R}^{2}\rightarrow\mathbb{R}$, $\mathbf{f}[\mathbf{v}]=\{x,G[\mathbf{v}]\}$, $\mathbf{g}\equiv\mathbf{f}^{-1}\implies\mathbf{g}[x,G[\mathbf{v}]]=\mathbf{v}$

$\frac{d\mathbf{g}[\mathbf{f}[\mathbf{v}]]}{d\mathbf{v}}=\begin{bmatrix}1 & 0\\ 0 & 1 \end{bmatrix}=\begin{bmatrix}\frac{\partial g^{1}}{\partial v^{1}} & \frac{\partial g^{1}}{\partial v^{2}}\\ \frac{\partial g^{2}}{\partial v^{1}} & \frac{\partial g^{2}}{\partial v^{2}} \end{bmatrix}=\begin{bmatrix}\frac{\partial x[\mathbf{f}[\mathbf{v}]]}{\partial x} & \frac{\partial x[\mathbf{f}[\mathbf{v}]]}{\partial y}\\ \frac{\partial y[\mathbf{f}[\mathbf{v}]]}{\partial x} & \frac{\partial y[\mathbf{f}[\mathbf{v}]]}{\partial y} \end{bmatrix}$

$\frac{\partial g^{i}[\mathbf{f}[\mathbf{v}]]}{\partial v^{j}}=\left\{ \frac{\partial g^{i}}{\partial f^{b}}\frac{\partial f^{b}}{\partial v^{j}}\right\} _{2\times2}=\left\{ \frac{\partial g^{i}}{\partial f^{1}}\frac{\partial f^{1}}{\partial v^{j}}+\frac{\partial g^{i}}{\partial f^{2}}\frac{\partial f^{2}}{\partial v^{j}}\right\} $

$=\left\{ \frac{\partial g^{i}}{\partial x}\frac{\partial x}{\partial v^{j}}+\frac{\partial g^{i}}{\partial G}\frac{\partial G}{\partial v^{j}}\right\} $

$\frac{\partial y}{\partial x}\frac{\partial x}{\partial x}+\frac{\partial y}{\partial G}\frac{\partial G}{\partial x}=\frac{\partial x}{\partial x}\frac{\partial x}{\partial y}+\frac{\partial x}{\partial G}\frac{\partial G}{\partial y}=0$

$\frac{\partial y}{\partial G}\frac{\partial G}{\partial x}=\frac{\partial x}{\partial G}\frac{\partial G}{\partial y}=0$

$\frac{\partial x}{\partial x}\frac{\partial x}{\partial x}+\frac{\partial x}{\partial G}\frac{\partial G}{\partial x}=\frac{\partial y}{\partial x}\frac{\partial x}{\partial y}+\frac{\partial y}{\partial G}\frac{\partial G}{\partial y}=1$

$1+\frac{\partial x}{\partial G}\frac{\partial G}{\partial x}=0+\frac{\partial y}{\partial G}\frac{\partial G}{\partial y}=1$

$\frac{\partial x}{\partial G}\frac{\partial G}{\partial x}=0$

$\frac{\partial y}{\partial G}\frac{\partial G}{\partial y}=1$

$\frac{\partial G}{\partial x}=\frac{\partial x}{\partial G}=0$


There are 2 best solutions below

Best answer

This HAD to work!

Here is an alternative way of finding the inverse. It is a corrected version of my first attempt. I am still a bit shaky regarding the details. I ``forced the answer'' in some places. But I believe it is correct.

The same letter is used to denote more than one mapping. The mappings are distinguished by their argument lists. Boldface letters denote variables. Letters with arrows over them denote mappings. The index ranges are intended to be apparent by context.

Assert the following:

$\mathbf{x}\in\mathbb{R}^{m}$; $\mathbf{y}\in\mathbb{R}^{n}$; $\mathbf{v}=\{\mathbf{x},\mathbf{y}\}$; $\vec{y}[\mathbf{x}]:\mathbb{R}^{m}\rightarrow\mathbb{R}^{n}$; $\mathbf{v}\in\mathbb{R}^{m+n}$; $\vec{G}[\mathbf{v}]:\mathbb{R}^{m+n}\rightarrow\mathbb{R}^{n}$; $\vec{f}[\mathbf{v}]:\mathbb{R}^{m+n}\rightarrow\mathbb{R}^{m+n}$;

$\vec{v}[\mathbf{x}]=\{\mathbf{x},\vec{y}[\mathbf{x}]\}$;

$\vec{f}[\mathbf{v}]=\{v^{1},\dots,v^{m},G^{1}[\mathbf{v}],\dots,G^{n}[\mathbf{v}]\}=\{\mathbf{x},\vec{G}[\mathbf{v}]\}$;

$\vec{v}[\mathbf{f}]=\{\mathbf{x},\mathbf{y}\}$;

$\vec{G}[\vec{v}[\mathbf{x}]]=\vec{0}$.

Then

$\vec{f}^{\prime}[\mathbf{v}]=\begin{bmatrix}\left\{ \frac{\partial f^{i}}{\partial v^{j}}\right\} _{m\times m} & \left\{ \frac{\partial f^{i}}{\partial v^{j}}\right\} _{m\times n}\\ \left\{ \frac{\partial f^{i}}{\partial v^{j}}\right\} _{n\times m} & \left\{ \frac{\partial f^{i}}{\partial v^{j}}\right\} _{n\times n} \end{bmatrix}=\begin{bmatrix}\left\{ \frac{\partial x^{i}}{\partial x^{j}}\right\} _{m\times m} & \left\{ \frac{\partial x^{i}}{\partial y^{j}}\right\} _{m\times n}\\ \left\{ \frac{\partial G^{i}[\mathbf{v}]}{\partial x^{j}}\right\} _{n\times m} & \left\{ \frac{\partial G^{i}[\mathbf{v}]}{\partial y^{j}}\right\} _{n\times n} \end{bmatrix}$

and

$\vec{v}^{\prime}[\mathbf{f}]=\left\{ \frac{\partial\vec{v}[\mathbf{f}]}{\partial\mathbf{x}},\frac{\partial\vec{v}[\mathbf{f}]}{\partial\mathbf{G}}\right\} $

$=\left\{ \left\{ \frac{d\mathbf{x}}{d\mathbf{x}},\frac{\partial\vec{y}[\mathbf{f}]}{\partial\mathbf{x}}\right\} ,\left\{ \frac{d\mathbf{x}}{d\mathbf{G}},\frac{\partial\vec{y}[\mathbf{f}]}{\partial\mathbf{G}}\right\} \right\} $

$=\left\{ \left\{ \mathbf{I},\frac{\partial\vec{y}[\{\mathbf{x},\mathbf{G}\}]}{\partial\mathbf{x}}\right\} ,\left\{ \mathbf{0},\frac{\partial\vec{y}[\{\mathbf{x},\mathbf{G}\}]}{\partial\mathbf{G}}\right\} \right\} $

$=\left\{ \left\{ \mathbf{I},\frac{d\vec{y}[\mathbf{x}]}{d\mathbf{x}}\right\} ,\left\{ \mathbf{0},\frac{d\vec{y}[\mathbf{G}]}{d\mathbf{G}}\right\} \right\} $

$=\begin{bmatrix}\left\{ \frac{\partial v^{i}[\mathbf{f}]}{\partial f^{j}}\right\} _{m\times m} & \left\{ \frac{\partial v^{i}[\mathbf{f}]}{\partial f^{j}}\right\} _{m\times n}\\ \left\{ \frac{\partial v^{i}[\mathbf{f}]}{\partial f^{j}}\right\} _{n\times m} & \left\{ \frac{\partial v^{i}[\mathbf{f}]}{\partial f^{j}}\right\} _{n\times n} \end{bmatrix}=\begin{bmatrix}\mathbf{I}_{m\times m} & \mathbf{0}_{m\times n}\\ \frac{d\vec{y}[\mathbf{x}]}{d\mathbf{x}} & \frac{d\vec{y}[\mathbf{G}]}{d\mathbf{G}} \end{bmatrix}$.

Using the condition $\vec{G}[\vec{v}[\mathbf{x}]]=\vec{0}$, obtain the following result:

$\frac{d\vec{G}[\vec{v}[\mathbf{x}]]}{d\mathbf{x}}=\frac{\partial\vec{G}[\mathbf{v}]}{\partial\mathbf{x}}+\frac{\partial\vec{G}[\mathbf{v}]}{\partial\mathbf{y}}\frac{d\vec{y}[\mathbf{x}]}{d\mathbf{x}}=\vec{0}$

$-\frac{\partial\vec{G}[\mathbf{v}]}{\partial\mathbf{x}}=\frac{\partial\vec{G}[\mathbf{v}]}{\partial\mathbf{y}}\frac{d\vec{y}[\mathbf{x}]}{d\mathbf{x}}$

$-\left[\frac{\partial\vec{G}[\mathbf{v}]}{\partial\mathbf{y}}\right]^{-1}\frac{\partial\vec{G}[\mathbf{v}]}{\partial\mathbf{x}}=\frac{d\vec{y}[\mathbf{x}]}{d\mathbf{x}}$

$-\frac{\partial\vec{y}[\mathbf{G}]}{\partial\mathbf{G}}\frac{\partial\vec{G}[\mathbf{v}]}{\partial\mathbf{x}}=\frac{d\vec{y}[\mathbf{x}]}{d\mathbf{x}}$.
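As a sanity check on the implicit-differentiation step, here is a small sketch comparing $-\left[\partial G/\partial y\right]^{-1}\partial G/\partial x$ with a finite-difference derivative of the explicit solution. The unit-circle $G(x,y)=x^{2}+y^{2}-1$ and all function names are my own illustrative assumptions.

```python
import math

def y_of_x(x):
    # Explicit solution of G(x, y) = x^2 + y^2 - 1 = 0 near (0, 1).
    return math.sqrt(1.0 - x * x)

def dy_dx_formula(x):
    # dy/dx = -(dG/dy)^{-1} (dG/dx), evaluated on the solution curve.
    y = y_of_x(x)
    d1G = 2.0 * x  # partial derivative of G with respect to x
    d2G = 2.0 * y  # partial derivative of G with respect to y
    return -d1G / d2G

def dy_dx_numeric(x, eps=1e-6):
    # Central finite difference of the explicit solution.
    return (y_of_x(x + eps) - y_of_x(x - eps)) / (2.0 * eps)

x = 0.3
print(dy_dx_formula(x), dy_dx_numeric(x))
```

The two printed values should agree to within the finite-difference error.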

Replace the corresponding sub-matrix in the previous result

$\vec{v}^{\prime}[\mathbf{f}]=\begin{bmatrix}\mathbf{I} & \mathbf{0}\\ -\frac{\partial\vec{y}[\mathbf{G}]}{\partial\mathbf{G}}\frac{\partial\vec{G}[\mathbf{v}]}{\partial\mathbf{x}} & \frac{d\vec{y}[\mathbf{G}]}{d\mathbf{G}} \end{bmatrix}$.

Using the notation of the original post (following Edwards):

$f^{\prime}(\mathbf{a},\mathbf{b})^{-1}=\left[\begin{array}{cc} \mathbf{I} & \mathbf{0}\\ -\mathit{D}_{2}G(\mathbf{a},\mathbf{b})^{-1}\mathit{D}_{1}G(\mathbf{a},\mathbf{b}) & \mathit{D}_{2}G(\mathbf{a},\mathbf{b})^{-1} \end{array}\right]$.
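This formula can be checked numerically on a map whose inverse is known in closed form. The sketch below assumes $G(x,y)=x^{2}+e^{y}$ (my own choice, with hypothetical function names), for which $f^{-1}(u,w)=\{u,\ln(w-u^{2})\}$ when $w>u^{2}$. It also shows that $f$ can be invertible while $\mathit{D}_{1}G=2x$ is nonzero.

```python
import math

def G(x, y):
    # Illustrative choice (an assumption): G(x, y) = x^2 + exp(y).
    return x * x + math.exp(y)

def f(x, y):
    return (x, G(x, y))

def g(u, w):
    # Closed-form inverse of f, valid for w > u^2.
    return (u, math.log(w - u * u))

def jacobian_of_g(u, w, eps=1e-6):
    # Central differences; rows are components of g, columns are (u, w).
    du = [(p - m) / (2 * eps) for p, m in zip(g(u + eps, w), g(u - eps, w))]
    dw = [(p - m) / (2 * eps) for p, m in zip(g(u, w + eps), g(u, w - eps))]
    return [[du[0], dw[0]], [du[1], dw[1]]]

x, y = 0.5, 0.2
d1G, d2G = 2.0 * x, math.exp(y)  # D_1 G and D_2 G at (x, y)

# Predicted inverse derivative: [[I, 0], [-D_2G^{-1} D_1G, D_2G^{-1}]]
predicted = [[1.0, 0.0], [-d1G / d2G, 1.0 / d2G]]
numeric = jacobian_of_g(*f(x, y))
print(predicted)
print(numeric)
```

Here $\mathit{D}_{1}G=1$ at $x=0.5$, yet the round trip `g(*f(x, y))` recovers `(x, y)`, so invertibility of $f$ does not force $\mathit{D}_{1}G$ to vanish.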


I inverted the matrices incorrectly. I tried a "trick" which is often used in physics, but fails in this situation. It is for this very reason that I am studying Edwards's book. I know I don't understand the subtleties of the mathematics.

Here's what I now believe is the correct way to invert the matrix in the proof. I start with a generic matrix of the same form and exploit some simplifying features to arrive at expressions for the components of the inverse.

$\left[\begin{array}{cc} \mathbf{I}_{m\times m} & \mathbf{0}_{m\times n}\\ \mathbf{A}_{n\times m} & \mathbf{B}_{n\times n} \end{array}\right]\left[\begin{array}{cc} \mathbf{I}_{m\times m} & \mathbf{0}_{m\times n}\\ \mathbf{C}_{n\times m} & \mathbf{D}_{n\times n} \end{array}\right]=\left[\begin{array}{cc} \mathbf{I} & \mathbf{0}\\ \mathbf{A}+\mathbf{B}\mathbf{C} & \mathbf{B}\mathbf{D} \end{array}\right]=\left[\begin{array}{cc} \mathbf{I} & \mathbf{0}\\ \mathbf{0} & \mathbf{I} \end{array}\right]$

$\mathbf{A}=-\mathbf{B}\mathbf{C}$

$\mathbf{B}^{-1}\mathbf{A}=-\mathbf{B}^{-1}\mathbf{B}\mathbf{C}=-\mathbf{C}$

$\mathbf{D}=\mathbf{B}^{-1}$

$\left[\begin{array}{cc} \mathbf{I} & \mathbf{0}\\ \mathbf{A} & \mathbf{B} \end{array}\right]^{-1}=\left[\begin{array}{cc} \mathbf{I} & \mathbf{0}\\ \mathbf{C} & \mathbf{D} \end{array}\right]=\left[\begin{array}{cc} \mathbf{I} & \mathbf{0}\\ -\mathbf{B}^{-1}\mathbf{A} & \mathbf{B}^{-1} \end{array}\right]$
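The block formula can be verified with exact rational arithmetic. This is a minimal sketch with $m=n=2$; the particular matrices $\mathbf{A}$ and $\mathbf{B}$ are arbitrary values I picked for illustration, and the helper names are hypothetical. It also tests the "swapped" matrix from the original post, which places $\mathbf{A}^{-1}$ in the upper-right block.

```python
from fractions import Fraction as F

def matmul(P, Q):
    # Plain matrix product on lists of rows.
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def inv2(B):
    # Closed-form inverse of a 2x2 matrix.
    (p, q), (r, s) = B
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

def block(TL, TR, BL, BR):
    # Assemble [[TL, TR], [BL, BR]] from four 2x2 blocks.
    return [TL[i] + TR[i] for i in range(2)] + [BL[i] + BR[i] for i in range(2)]

I2 = [[F(1), F(0)], [F(0), F(1)]]
Z2 = [[F(0), F(0)], [F(0), F(0)]]
A = [[F(1), F(2)], [F(3), F(4)]]  # arbitrary illustrative values
B = [[F(2), F(1)], [F(1), F(1)]]  # arbitrary, but non-singular

Binv = inv2(B)
C = [[-e for e in row] for row in matmul(Binv, A)]  # C = -B^{-1} A

M = block(I2, Z2, A, B)
M_inv = block(I2, Z2, C, Binv)           # [[I, 0], [-B^{-1}A, B^{-1}]]
swapped = block(I2, inv2(A), Z2, Binv)   # [[I, A^{-1}], [0, B^{-1}]]

identity = [[F(int(i == j)) for j in range(4)] for i in range(4)]
product = matmul(M, M_inv)
print(product == identity)
print(matmul(M, swapped) == identity)
```

The first comparison should come out true and the second false: the swapped matrix already fails in the upper-right block, where the product leaves $\mathbf{A}^{-1}$ instead of $\mathbf{0}$.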

Make the obvious substitutions to get

$\left[\begin{array}{cc} \mathbf{I}_{m\times m} & \mathbf{0}{}_{m\times n}\\ \mathit{D}_{1}G(\mathbf{a},\mathbf{b}) & \mathit{D}_{2}G(\mathbf{a},\mathbf{b}) \end{array}\right]^{-1}=\left[\begin{array}{cc} \mathbf{I}_{m\times m} & \mathbf{0}{}_{m\times n}\\ -(\mathit{D}_{2}G(\mathbf{a},\mathbf{b})^{-1}\mathit{D}_{1}G(\mathbf{a},\mathbf{b})) & \mathit{D}_{2}G(\mathbf{a},\mathbf{b})^{-1} \end{array}\right]$

$\begin{bmatrix}\mathbf{x}\\ h_{k+1}(\mathbf{x}) \end{bmatrix}=\begin{bmatrix}\mathbf{x}\\ h_{k}(\mathbf{x}) \end{bmatrix}-\left[\begin{array}{cc} \mathbf{I}_{m\times m} & \{\mathbf{0}\}_{m\times n}\\ \mathit{D}_{1}G(\mathbf{a},\mathbf{b}) & \mathit{D}_{2}G(\mathbf{a},\mathbf{b}) \end{array}\right]^{-1}\begin{bmatrix}\mathbf{0}\\ G(\mathbf{x},h_{k}(\mathbf{x})) \end{bmatrix}$

$\begin{bmatrix}\mathbf{x}\\ h_{k+1}(\mathbf{x}) \end{bmatrix}=\begin{bmatrix}\mathbf{x}\\ h_{k}(\mathbf{x}) \end{bmatrix}-\left[\begin{array}{cc} \mathbf{I}_{m\times m} & \mathbf{0}{}_{m\times n}\\ -(\mathit{D}_{2}G(\mathbf{a},\mathbf{b})^{-1}\mathit{D}_{1}G(\mathbf{a},\mathbf{b})) & \mathit{D}_{2}G(\mathbf{a},\mathbf{b})^{-1} \end{array}\right]\begin{bmatrix}\mathbf{0}\\ G(\mathbf{x},h_{k}(\mathbf{x})) \end{bmatrix}$

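$\begin{bmatrix}\mathbf{x}\\ h_{k+1}(\mathbf{x}) \end{bmatrix}=\begin{bmatrix}\mathbf{x}\\ h_{k}(\mathbf{x}) \end{bmatrix}-\left[\begin{array}{c} \mathbf{0}_{m}\\ \mathit{D}_{2}G(\mathbf{a},\mathbf{b})^{-1}G(\mathbf{x},h_{k}(\mathbf{x}))_{n} \end{array}\right]$

The first components now reduce to $\mathbf{x}=\mathbf{x}$, and the second components give $h_{k+1}(\mathbf{x})=h_{k}(\mathbf{x})-\mathit{D}_{2}G(\mathbf{a},\mathbf{b})^{-1}G(\mathbf{x},h_{k}(\mathbf{x}))$, which is Edwards's recursion. The block $\mathit{D}_{1}G(\mathbf{a},\mathbf{b})$ multiplies the zero vector and drops out, so no condition on $\mathit{D}_{1}G$ is needed.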