Implicit function theorem conclusion notation?


I am working through the implicit function theorem for the first time, and I have the following understanding. Given a system of $n$ equations, \begin{equation} f_i(x_1,\dots ,x_m,y_1,\dots , y_n)=0,\ \ \ \ \ \ \ i=1,\dots ,n \end{equation} a point $p_0$ with coordinates $(a_1,\dots ,a_m,b_1,\dots ,b_n)$, and the condition \begin{equation} \frac{\partial (f_1,\dots,f_n)}{\partial (y_1,\dots ,y_n)}\bigg|_{p_{0}}\neq 0, \end{equation} the system of equations can be solved for $y_1,\dots ,y_n$ as functions of $x_1,\dots ,x_m$ in a neighbourhood of the point. In this case the following equations hold: \begin{equation} f_i(x_1,\dots ,x_m,y_1(x_1,\dots ,x_m),\dots ,y_n(x_1,\dots ,x_m))=0,\ \ \ \ \ i=1,\dots ,n. \end{equation} However, I do not understand the following notation, which is the conclusion of the theorem! \begin{equation} \frac{\partial y_i}{\partial x_j}\bigg|_{i\neq j}=-\frac{\frac{\partial(f_1,f_2,\dots ,f_n)}{\partial (y_1,\dots ,x_j,\dots ,y_n)}}{\frac{\partial (f_1,f_2,\dots ,f_n)}{\partial (y_1,\dots ,y_i,\dots ,y_n)}} \end{equation} Could you please explain what these Jacobian determinants are, and why the $x_j$ and $y_i$ are in the denominators of each expression? Many thanks!

Best answer:

I happen to have some notes on this, perhaps they help:

Given $n$ equations in $m+n$ unknowns, when can we solve for the last $n$ variables as functions of the first $m$ variables? Given a continuously differentiable mapping $G=(G_1,G_2,\dots , G_n): \mathbb{R}^m \times \mathbb{R}^n \rightarrow \mathbb{R}^n$, study the level set: (here $k_1,k_2,\dots , k_n$ are constants) \begin{align} \notag G_1(x_1, \dots , x_m, y_1, \dots , y_n)&=k_1 \\ \notag G_2(x_1, \dots , x_m, y_1, \dots , y_n)&=k_2 \\ \notag & \vdots \\ \notag G_n(x_1, \dots , x_m, y_1, \dots , y_n)&=k_n \notag \end{align} We wish to locally solve for $y_1, \dots , y_n$ as functions of $x_1, \dots , x_m$. That is, we wish to find a mapping $h : \mathbb{R}^m \rightarrow \mathbb{R}^n$ such that $G(x,y)=k$ iff $y=h(x)$ near some point $(a,b) \in \mathbb{R}^m \times \mathbb{R}^n$ with $G(a,b)=k$. In this section we use the notation $x=(x_1,x_2,\dots ,x_m)$ and $y=(y_1,y_2,\dots , y_n)$.

Before we turn to the general problem, let's analyze the unit-circle problem in this notation. We are given $G(x,y)=x^2+y^2$ and we wish to find $f(x)$ such that $y=f(x)$ solves $G(x,y)=1$. Differentiate with respect to $x$ and use the chain rule: $$ \frac{\partial G}{\partial x}\frac{dx}{dx} + \frac{\partial G}{\partial y}\frac{dy}{dx} = 0 $$ We find that $\boxed{dy/dx = -G_x/G_y} = -x/y$. Given this analysis, we should suspect that if we are given some level curve $G(x,y)=k$, then we may be able to solve for $y$ as a function of $x$ near $p$ if $G(p)=k$ and $G_y(p) \neq 0$. This suspicion is valid, and it is one of the many consequences of the implicit function theorem.
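If you like, you can confirm this with a quick symbolic computation. Here is a minimal sympy sketch (my own check, not part of the theorem itself) that recovers $dy/dx = -x/y$ from the circle relation:

```python
# Minimal sympy check that implicit differentiation of x^2 + y(x)^2 = 1
# gives dy/dx = -G_x/G_y = -x/y.  (Illustrative only.)
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')

G = x**2 + y(x)**2 - 1                        # the level set G(x, y) = 1, written as G = 0
dGdx = sp.diff(G, x)                          # 2*x + 2*y(x)*y'(x)
dydx = sp.solve(dGdx, sp.Derivative(y(x), x))[0]
print(dydx)                                   # -x/y(x)
```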

We again turn to the linearization approximation. Suppose $G(x,y)=k$ where $x \in \mathbb{R}^m$ and $y \in \mathbb{R}^n$, and suppose $G: \mathbb{R}^m \times \mathbb{R}^n \rightarrow \mathbb{R}^n$ is continuously differentiable. Suppose $(a,b) \in \mathbb{R}^m \times \mathbb{R}^n$ has $G(a,b)=k$. Replace $G$ with its linearization based at $(a,b)$: $$ G(x,y) \approx k + G'(a,b)(x-a,y-b) $$ Here we have the matrix multiplication of the $n \times (m+n)$ matrix $G'(a,b)$ with the $(m+n) \times 1$ column vector $(x-a,y-b)$, which yields an $n$-component column vector. It is convenient to define partial derivatives with respect to a whole vector of variables, $$ \frac{\partial G}{\partial x} = \left[ \begin{array}{ccc} \tfrac{\partial G_1}{\partial x_1} & \cdots & \tfrac{\partial G_1}{\partial x_m} \\ \vdots & & \vdots \\ \tfrac{\partial G_n}{\partial x_1} & \cdots & \tfrac{\partial G_n}{\partial x_m} \end{array} \right] \qquad \frac{\partial G}{\partial y} = \left[ \begin{array}{ccc} \tfrac{\partial G_1}{\partial y_1} & \cdots & \tfrac{\partial G_1}{\partial y_n} \\ \vdots & & \vdots \\ \tfrac{\partial G_n}{\partial y_1} & \cdots & \tfrac{\partial G_n}{\partial y_n} \end{array} \right] $$ In this notation we can write the $n \times (m+n)$ matrix $G'(a,b)$ as the concatenation of the $n \times m$ matrix $\frac{\partial G}{\partial x}(a,b) $ and the $n \times n$ matrix $\frac{\partial G}{\partial y}(a,b)$: $$ G'(a,b) = \biggl[\frac{\partial G}{\partial x}(a,b) \bigg{|} \frac{\partial G}{\partial y}(a,b) \biggl] $$ Therefore, for points close to $(a,b)$ we have: $$ G(x,y) \approx k + \frac{\partial G}{\partial x}(a,b)(x-a)+\frac{\partial G}{\partial y}(a,b)(y-b) $$ The nonlinear problem $G(x,y)=k$ has been (locally) replaced by the linear problem of solving the following for $y$: $$ k \approx k + \frac{\partial G}{\partial x}(a,b)(x-a)+\frac{\partial G}{\partial y}(a,b)(y-b) $$ Suppose the square matrix $\frac{\partial G}{\partial y}(a,b)$ is invertible; then we find the following approximation for the implicit solution of $G(x,y)=k$ for $y$ as a function of $x$: $$ y \approx b - \biggl[\frac{\partial G}{\partial y}(a,b) \biggr]^{-1}\biggl[\frac{\partial G}{\partial x}(a,b)(x-a) \biggl]. $$ Of course this is not a formal proof, but it does suggest that $\det\bigl[\frac{\partial G}{\partial y}(a,b) \bigr] \neq 0$ is the natural condition to require if we wish to solve for the $y$ variables.
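To see the linearized solve in action, here is a small numerical sketch: my own toy instance with $m=n=1$, namely the unit circle at the arbitrarily chosen base point $(a,b)=(3/5,4/5)$, comparing $y \approx b - \bigl[\frac{\partial G}{\partial y}\bigr]^{-1}\bigl[\frac{\partial G}{\partial x}\bigr](x-a)$ with the exact branch $y=\sqrt{1-x^2}$:

```python
# Linearized implicit solve on the unit circle G(x, y) = x^2 + y^2 = 1,
# at the (arbitrarily chosen) base point (a, b) = (3/5, 4/5).  Illustrative sketch only.
import sympy as sp

x, y = sp.symbols('x y')
G = x**2 + y**2
a, b = sp.Rational(3, 5), sp.Rational(4, 5)

Gx, Gy = sp.diff(G, x), sp.diff(G, y)
slope = (Gx / Gy).subs({x: a, y: b})        # [dG/dy]^{-1} [dG/dx] at (a, b); a scalar here
y_lin = b - slope * (x - a)                 # linear approximation of the implicit solution
y_exact = sp.sqrt(1 - x**2)                 # the actual branch through (a, b)

for xv in [0.55, 0.60, 0.65]:
    print(xv, float(y_lin.subs(x, xv)), float(y_exact.subs(x, xv)))
```

Near $x=a$ the two columns of output agree closely, as the linearization argument predicts.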

As before, suppose $G: \mathbb{R}^m \times \mathbb{R}^n \rightarrow \mathbb{R}^n$. Suppose we have a continuously differentiable function $h: \mathbb{R}^m \rightarrow \mathbb{R}^n$ such that $h(a)=b$ and $G(x,h(x))=k$. We seek to find the derivative of $h$ in terms of the derivative of $G$. This is a generalization of the implicit differentiation calculation we perform in calculus I. I'm including this to help you understand the notation a bit more before I state the implicit function theorem. Differentiate with respect to $x_l$ for $l \in \{1,2,\dots ,m\}$: $$ \frac{\partial}{\partial x_{l}} \biggl[ G(x,h(x)) \biggr] = \sum_{i=1}^{m}\frac{\partial G}{\partial x_i } \frac{\partial x_i}{\partial x_l } + \sum_{j=1}^{n}\frac{\partial G}{\partial y_j}\frac{\partial h_j}{\partial x_l} = \frac{\partial G}{\partial x_l } + \sum_{j=1}^{n}\frac{\partial G}{\partial y_j}\frac{\partial h_j}{\partial x_l} = 0$$ Here we made use of the identity $\frac{\partial x_i}{\partial x_l } = \delta_{il}$ to collapse the sum over $i$ to the single nontrivial term, and the zero on the r.h.s. follows from the fact that $\frac{\partial}{\partial x_l} (k)=0$. Concatenate these derivatives from $l=1$ up to $l=m$: $$ \biggl[ \frac{\partial G}{\partial x_1 } + \sum_{j=1}^{n}\frac{\partial G}{\partial y_j}\frac{\partial h_j}{\partial x_1} \bigg{|} \frac{\partial G}{\partial x_2 } + \sum_{j=1}^{n}\frac{\partial G}{\partial y_j}\frac{\partial h_j}{\partial x_2} \bigg{|} \cdots \bigg{|} \frac{\partial G}{\partial x_m } + \sum_{j=1}^{n}\frac{\partial G}{\partial y_j}\frac{\partial h_j}{\partial x_m} \biggr] = [0|0| \cdots |0] $$ Properties of matrix addition allow us to parse the expression above as follows: $$ \biggl[ \frac{\partial G}{\partial x_1 } \bigg{|} \frac{\partial G}{\partial x_2 } \bigg{|} \cdots \bigg{|} \frac{\partial G}{\partial x_m } \biggr] + \biggl[ \sum_{j=1}^{n}\frac{\partial G}{\partial y_j}\frac{\partial h_j}{\partial x_1} \bigg{|} \sum_{j=1}^{n}\frac{\partial G}{\partial y_j}\frac{\partial h_j}{\partial x_2} \bigg{|} \cdots \bigg{|} \sum_{j=1}^{n}\frac{\partial G}{\partial y_j}\frac{\partial h_j}{\partial x_m} \biggr] = [0|0| \cdots |0] $$ But this reduces to $$ \frac{\partial G}{\partial x } + \biggl[ \frac{\partial G}{\partial y}\frac{\partial h}{\partial x_1} \bigg{|} \frac{\partial G}{\partial y}\frac{\partial h}{\partial x_2} \bigg{|} \cdots \bigg{|} \frac{\partial G}{\partial y}\frac{\partial h}{\partial x_m} \biggr] = 0 \in \mathbb{R}^{n \times m} $$ The concatenation property of matrix multiplication states $[Ab_1|Ab_2| \cdots | Ab_m] = A[b_1|b_2| \cdots | b_m]$; we use this to write the expression once more: $$ \frac{\partial G}{\partial x } + \frac{\partial G}{\partial y} \biggl[ \frac{\partial h}{\partial x_1} \bigg{|} \frac{\partial h}{\partial x_2} \bigg{|} \cdots \bigg{|} \frac{\partial h}{\partial x_m} \biggr] = 0 \ \ \Rightarrow \ \ \frac{\partial G}{\partial x } + \frac{\partial G}{\partial y} \frac{\partial h}{\partial x} = 0 \ \ \Rightarrow \ \ \boxed{\frac{\partial h}{\partial x} = -\biggl[\frac{\partial G}{\partial y}\biggr]^{-1}\frac{\partial G}{\partial x }} $$ where in the last implication we made use of the assumption that $\frac{\partial G}{\partial y}$ is invertible.
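To connect the boxed matrix formula with the determinant notation in your question: applying Cramer's rule to the linear system $\frac{\partial G}{\partial y}\frac{\partial h}{\partial x_j} = -\frac{\partial G}{\partial x_j}$, one column of $x$-derivatives at a time, gives $$ \frac{\partial h_i}{\partial x_j} = -\frac{\frac{\partial(G_1,\dots ,G_n)}{\partial(y_1,\dots ,x_j,\dots ,y_n)}}{\frac{\partial(G_1,\dots ,G_n)}{\partial(y_1,\dots ,y_n)}}, $$ where the numerator is the determinant of $\frac{\partial G}{\partial y}$ with its $i$-th column replaced by the column $\frac{\partial G}{\partial x_j}$; that is exactly the ratio of Jacobian determinants you quoted. Below is a small sympy sketch, on a toy system of my own choosing and purely for illustration, checking that the matrix formula and the determinant-ratio formula agree entry by entry:

```python
# Sketch: for one concrete G (an arbitrary example, not anything canonical), check that
# dh/dx = -(dG/dy)^{-1} dG/dx agrees entry-by-entry with the Cramer's-rule form, i.e.
# the ratio of Jacobian determinants with the i-th y-column replaced by the x_j-column.
import sympy as sp

x1, y1, y2 = sp.symbols('x1 y1 y2')
G = sp.Matrix([x1 + y1 + y2**2, x1*y1 - y2])   # an illustrative system G = k with n = 2, m = 1

dGdx = G.jacobian([x1])        # n x m matrix dG/dx  (here 2 x 1)
dGdy = G.jacobian([y1, y2])    # n x n matrix dG/dy  (here 2 x 2)

dhdx_matrix = -dGdy.inv() * dGdx

# Determinant-ratio (Cramer's rule) form, entry by entry
dhdx_cramer = sp.zeros(2, 1)
for i in range(2):             # which y_i
    for j in range(1):         # which x_j
        N = dGdy.copy()
        N[:, i] = dGdx[:, j]   # replace the y_i column by the x_j column
        dhdx_cramer[i, j] = -N.det() / dGdy.det()

print((dhdx_matrix - dhdx_cramer).applyfunc(sp.simplify))  # zero matrix
```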