Show that Newton update in two variables converges

122 Views Asked by At

I am having two update rules of a two variable function $f(x, y)$, i.e.,

$$x \mapsto h_1(x,y):=x - \frac{f(x, y)}{\partial_x f(x,y)} \tag{1}$$ $$y \mapsto h_2(x,y):=y - \frac{f(x, y)}{\partial_y f(x,y)} \tag {2}$$

We know that $(1)$ and $(2)$ converge when applied seperately. My goal is to prove that when these rules are applied alternatively they do find a root for $f(x,y)$ and converge to a stationary point.

In [1], it is shown that $(1)$ and $(2)$ are contractions mappings. I am trying to understand if

$$H(x,y) = \begin{bmatrix} h_1(x,y) \\ h_2(x,y) \end{bmatrix} = \begin{bmatrix} x \\ y\end{bmatrix} - \begin{bmatrix} \frac{f(x, y)}{\partial_x f(x,y)} \\ \frac{f(x, y)}{\partial_y f(x,y)} \end{bmatrix}$$ is a contraction mapping. Following the same procedure as in [1] we have

$$\begin{aligned}\|H(x,y) -H(x',y') \| = & \left\| \begin{bmatrix} x \\ y\end{bmatrix} - \begin{bmatrix} \frac{f(x, y)}{\partial_x f(x,y)} \\ \frac{f(x,y)}{\partial_y f(x,y)} \end{bmatrix} - \begin{bmatrix} x' \\ y'\end{bmatrix} + \begin{bmatrix} \frac{f(x', y')}{\partial_x f(x',y')} \\ \frac{f(x', y')}{\partial_y f(x',y')} \end{bmatrix}\right\| \\ = & \left\| \begin{bmatrix} x-x' \\ y-y' \end{bmatrix} - \begin{bmatrix} \frac{f(x, y)}{\partial_x f(x,y)} - \frac{f(x', y')}{\partial_x f(x',y')} \\ \frac{f(x,y)}{\partial_y f(x,y)} - \frac{f(x', y')}{\partial_y f(x',y')} \end{bmatrix} \right\|. \end{aligned}\tag{3}$$

Following the same procedure as in [1], I am trying to apply mean value theorem with two variables but I can not show that

$$\| H(x,y) - H(x',y')\| \leq k~ \| \begin{bmatrix} x-x' \\ y-y' \end{bmatrix} \|\tag{4}$$ for some appropriate $k$. Could you please someone help to show that $H(x, y)$ is (or it is not) a contraction mapping?

EDIT1: If we use the modified Newton method we have

$$x \mapsto c_1(x,y):=x - \frac{f(x, y)}{\partial_x f(x^0,y^0)} \tag{5}$$ $$y \mapsto c_2(x,y):=y - \frac{f(x, y)}{\partial_y f(x^0,y^0)} \tag {6}$$

By mean value theorem we have

$$\| C(x,y) - C(x',y')\| \leq \| \nabla C(x,y)^T \|\| \begin{bmatrix} x-x' \\ y-y' \end{bmatrix} \|\tag{7}$$ where

$$C(x,y) = \begin{bmatrix} c_1(x,y) \\ c_2(x,y) \end{bmatrix} = \begin{bmatrix} x \\ y\end{bmatrix} - \begin{bmatrix} \frac{f(x, y)}{\partial_x f(x^0,y^0)} \\ \frac{f(x, y)}{\partial_y f(x^0,y^0)} \end{bmatrix}\tag{8}$$ Also, I have that $\partial_x f(x, y)$ and $\partial_y f(x, y)$ are decreasing and positive. Thus, there exists $m1,m_2$ and $M_1, M_2$ such that

$$0 < m_1 < \partial_x f(x,y) < \frac{M_1}{2} \tag{9}$$ and $$0 < m_2 < \partial_x f(x,y) < \frac{M_2}{2}, \tag{10}$$ where $M_1 = \max_x \{ \partial_x f(x,y) \}$ and $M_2 = \max_y \{ \partial_x f(x,y) \}$. Thus, we need to show that$ \| \nabla C(x,y)^T \|<1$ in order $C(x,y)$ to be a contraction mapping. We have

$$ \| \nabla C(x,y)^T \| = \| I - \begin{bmatrix} \partial_x f(x,y)/M_1 & \partial_y f(x,y)/M_1 \\ \partial_x f(x,y)/M_2 & \partial_y f(x,y)/M_2 \end{bmatrix} \| \leq 1 + \| \underbrace{\begin{bmatrix} \partial_x f(x,y)/M_1 & \partial_y f(x,y)/M_1 \\ \partial_x f(x,y)/M_2 & \partial_y f(x,y)/M_2 \end{bmatrix}}_{A} \|.\tag{11}$$ The eigenvalues of $A$ are the roots of

$$det(A -\lambda I) = 0.\tag{12}$$ Doing computations, we conclude that

$$\lambda (\lambda - \partial_x f(x,y) / M_1 - \partial_y f(x,y) / M_2 ) = 0 \tag{13}$$ which gives

$$\lambda = 0 \tag{14} \Rightarrow \| \nabla C(x,y)^T \|<1 $$ and $$\lambda = \partial_x f(x,y) / M_1 + \partial_y f(x,y) / M_2 < 1 \Rightarrow \| \nabla C(x,y)^T \|< 2 \tag{15}$$

where $(9)$ and $(10)$ are used in the last inequality. If we keep $\lambda = 0 $, $C(x,y)$ is a contraction by not in the case where $\lambda < 1$. Not a satisfactory result. Can we do better? Any comments are highly appreciated.

EDIT2: Please let me add some more information of the problem I am facing. Let $t \colon \mathbb{R}^2 \to \mathbb{R}$. I am trying to find a way to get the roots of the nonlinear equation

$$f (x,y )= 1/ \| t (x, y)\| - 1/ y \tag{16}$$ subject to $x \geq x_0$ and $y \geq y_0$. As far as I understand this is a single equation with two unknowns. Maybe this is the reason the mapping is not a contraction. Any information to understand the general behavior of $(16)$ along with the aforementioned discussion is highly appreciated.