Why does discrete Newton converge faster than ordinary Newton?


I am trying to solve this system: $$ \begin{cases} \cosh(y) + 2x = 45\\ \dfrac{x^2}{5} - y^2 + 10x = 500 \end{cases} $$ I use two Newton methods to solve it. In the first I evaluate the derivative analytically (i.e. I substitute the point into the symbolically computed derivative), and in the second (discrete Newton) I approximate the derivative by a difference quotient (for example $f_x' = \dfrac{f(x, y) - f(x (1 - \varepsilon), y)}{x \varepsilon}$).
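As a quick sanity check of the relative-step difference quotient above (a sketch of my own, using the system's first equation, for which the analytic partial derivative with respect to $y$ is $\sinh y$):

```python
import numpy as np

eps = 1e-6
x, y = -83.0, 6.0  # one of the graphical initial approximations

def f1(x, y):
    return np.cosh(y) + 2 * x - 45

# Relative-step backward difference in y: (f(x, y) - f(x, y*(1 - eps))) / (y*eps)
fd = (f1(x, y) - f1(x, y * (1 - eps))) / (y * eps)
exact = np.sinh(y)  # analytic partial derivative of f1 with respect to y
```

For a smooth function and a well-scaled point like this, `fd` agrees with `exact` to several digits, so the difference quotient itself is not the source of any large discrepancy between the two methods.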

I give both algorithms the same initial approximations, obtained graphically. Ordinary Newton takes about 120-150 iterations to converge, while the discrete version converges in 3-4 iterations.

Why?

This is my code.

import sympy as sp
import numpy as np


def f1():
    x, y = sp.symbols('x y')
    return sp.cosh(y) + 2 * x - 45


def f2():
    x, y = sp.symbols('x y')
    return x**2 / 5 - y**2 + 10 * x - 500


def f(cur_x):
    x, y = sp.symbols('x y')
    return f1().subs([(x, cur_x[0]), (y, cur_x[1])]), \
           f2().subs([(x, cur_x[0]), (y, cur_x[1])])


def newton(cur_x):
    x, y = sp.symbols('x y')
    next_x = (cur_x[0] + 1, cur_x[1] + 1)
    W = np.zeros(shape=(2, 2))
    cnt = 0
    while max(np.abs(next_x[0] - cur_x[0]), np.abs(next_x[1] - cur_x[1])) >= 1e-6:
        if cnt != 0:
            cur_x = next_x
        cnt += 1

        W[0][0] = sp.diff(f1(), x).subs([(x, cur_x[0]), (y, cur_x[1])])
        W[0][1] = sp.diff(f1(), y).subs([(x, cur_x[0]), (y, cur_x[1])])
        W[1][0] = sp.diff(f2(), x).subs([(x, cur_x[0]), (y, cur_x[1])])
        W[1][0] = sp.diff(f2(), y).subs([(x, cur_x[0]), (y, cur_x[1])])
        inv = np.linalg.inv(W)

        f_in_point = f(cur_x)
        next_x = cur_x - np.dot(inv, f_in_point)
        next_x[0] = next_x[0].evalf()
        next_x[1] = next_x[1].evalf()

    return next_x, cnt


def get_der(f, xy, ind=0):
    eps = 1e-6
    x, y = sp.symbols('x y')
    res = f.subs([(x, xy[0]), (y, xy[1])]).evalf()
    new = 0
    if ind == 0:
        new = f.subs([(x, xy[0] * (1 - eps)), (y, xy[1])]).evalf()
    else:
        new = f.subs([(x, xy[0]), (y, xy[1] * (1 - eps))]).evalf()
    return (res - new) / (xy[ind] * eps)


def discrete_newton(cur_x):
    next_x = (cur_x[0] + 1, cur_x[1] + 1)
    W = np.zeros(shape=(2, 2))
    cnt = 0
    while max(np.abs(next_x[0] - cur_x[0]), np.abs(next_x[1] - cur_x[1])) >= 1e-6:
        if cnt != 0:
            cur_x = next_x
        cnt += 1

        lst = [f1(), f2()]
        for i in range(2):
            for j in range(2):
                W[i][j] = get_der(lst[i], cur_x, j)

        inv = np.linalg.inv(W)

        f_in_point = f(cur_x)
        next_x = cur_x - np.dot(inv, f_in_point)
        next_x[0] = next_x[0].evalf()
        next_x[1] = next_x[1].evalf()

    return next_x, cnt


def main():
    print('Newton start:')
    for x in [(-83, 6), (-82, -6)]:
        val, cnt = newton(x)
        print(val, cnt)
        print('f(x, y) = {}'.format(f(val)[0]))
    print('Newton end.')
    print('Discrete Newton start:')
    for x in [(-83, 6), (-82, -6)]:
        val, cnt = discrete_newton(x)
        print(val, cnt)
        print('f(x, y) = {}'.format(f(val)[1]))
    print('Discrete Newton end.')


main()

Answer:

Your code contains bugs. For example, the second W[1][0] assignment should be W[1][1], so the Jacobian in newton has one wrong entry and one missing entry. I'm also unclear why you add $1$ to each component of cur_x to create next_x, since the algorithm should only increment the iteration index. Any conclusions you've drawn from your experiment regarding the convergence rate are therefore spurious.
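With the Jacobian assembled correctly, the analytic variant should converge just as fast as the discrete one. Here is a minimal sketch (plain NumPy, hand-written analytic Jacobian; the names F, J, and newton are my own, not from your code) showing ordinary Newton on this system converging in a handful of iterations from one of your starting points:

```python
import numpy as np

def F(v):
    """Residuals of the system, written as F(v) = 0."""
    x, y = v
    return np.array([np.cosh(y) + 2 * x - 45,
                     x**2 / 5 - y**2 + 10 * x - 500])

def J(v):
    """Analytic Jacobian of F."""
    x, y = v
    return np.array([[2.0,            np.sinh(y)],
                     [2 * x / 5 + 10, -2 * y    ]])

def newton(v, tol=1e-10, max_iter=50):
    for i in range(1, max_iter + 1):
        step = np.linalg.solve(J(v), F(v))  # solve, don't invert
        v = v - step
        if np.max(np.abs(step)) < tol:
            return v, i
    raise RuntimeError("Newton did not converge")

root, iters = newton(np.array([-83.0, 6.0]))
```

Starting this close to the root, Newton's quadratic convergence brings the step below 1e-10 in well under ten iterations, which is the behaviour your "120-150 iterations" run should also show once the Jacobian bug is fixed.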