Analytic interpretation of Newton's iterative method for calculating a root of a function


From an analytic point of view, we can express the function $f(x)$ using its Taylor series around a point $x_n$:

$$ \begin{array}{lcl} f(x) & = & f(x_n + (x-x_n)) \\ & = & f(x_n) + (x - x_n)f'(x_n) + \frac{(x-x_n)^2}{2}f''(x_n) + \ldots \end{array} $$

and after breaking off at first order,

$$\hat f(x) = f(x_n) + (x-x_n)f'(x_n)$$

we find the zero of this linear approximation, which solves the linearized problem: $$\hat f(x) = 0 \iff x = x_n - \frac{f(x_n)}{f'(x_n)}$$

This is an approximation of the solution of the original problem $f(x) = 0$, and it will be refined by applying the same procedure again.
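In code, the repeated refinement amounts to iterating the update $x \mapsto x - f(x)/f'(x)$. A minimal sketch (the names `newton`, `fprime`, and the tolerance are illustrative choices, not part of the question):

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    """Iterate the first-order Newton step until f(x) is near zero."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:      # close enough to a root
            return x
        x = x - fx / fprime(x)  # zero of the first-order Taylor truncation
    return x

# Example: a root of f(x) = x^2 - 2, i.e. sqrt(2)
root = newton(lambda x: x * x - 2, lambda x: 2 * x, x0=1.0)
```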


Here I don't understand three things:

  1. Why take $x_n$ as the initial point, and not $x_0$?
  2. Why $f(x) = f(x_n + (x-x_n))$ ?
    For this point, I have thought this: if $x$ is the next solution and $x_n$ is the current estimate, the distance between them is $x-x_n$. Therefore, to arrive at the next solution, I have to add the missing distance: $x_n + (x-x_n)$. But I don't think this is the correct reasoning.
  3. Why break off at first order?

Please, can you give me any suggestion? Many thanks!


There are 3 best solutions below

  1. $x_0$ is a rather crude guess at the root. Hopefully, $x_n$ is much closer to the root, so the Taylor expansion around it is a much better approximation to the function near the root.

  2. $x = x_n + (x - x_n)$.

  3. You could use a second-order approximation: then you have Halley's method.
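For reference, keeping the second-order term and solving for the update leads to Halley's formula $x_{n+1} = x_n - \frac{2 f(x_n) f'(x_n)}{2 f'(x_n)^2 - f(x_n) f''(x_n)}$. A minimal sketch, with illustrative names and tolerances:

```python
def halley(f, fp, fpp, x0, tol=1e-12, max_iter=50):
    """Halley's method: the second-order analogue of Newton's method."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x
        fpx, fppx = fp(x), fpp(x)
        # update derived from the second-order Taylor truncation
        x = x - 2 * fx * fpx / (2 * fpx * fpx - fx * fppx)
    return x

# Example: a root of f(x) = x^2 - 2
root = halley(lambda x: x * x - 2, lambda x: 2 * x, lambda x: 2.0, x0=1.0)
```

Note that each step now requires the second derivative as well, which is part of the trade-off discussed in the other answers.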


In order:

  1. You do start at $x_0$, which we hope is close to a root. Then we compute $$x_1=x_0-\frac{f(x_0)}{f'(x_0)}$$ Assuming our function has a nonzero first derivative at the root, and $x_0$ was reasonably chosen at distance $\epsilon$ from the root, $x_1$ will be at distance about $\epsilon^2$ from the root. We then repeat to get $x_2$ from $x_1$, $x_3$ from $x_2$, and so on. When you write this with $x_n$, that is simply the $n$th step of this procedure ($n$ can be anything, including $0$).

  2. This identity is obviously true, as $x_n+ (x-x_n)=x$. The reason we want to use it is so that we can Taylor expand around $x_n$, which is the closest point to the root of $f$ that we currently have.

  3. You don't have to; we can expand to higher order, assuming the function has sufficiently many derivatives, and we will get faster convergence (fewer iterations are needed). But the gain is at most a constant factor, and the total amount of calculation may even increase, since we need fewer iterations but more calculation per iteration. Also, this way we only need to calculate a single derivative of $f$.
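The "distance $\epsilon^2$" claim above can be observed numerically: for a well-behaved function, the error roughly squares at each step, so the number of correct digits roughly doubles. A small illustration (the setup, using $f(x) = x^2 - 2$ with root $\sqrt 2$, is my own choice):

```python
import math

# Track the error |x_n - sqrt(2)| over a few Newton steps for f(x) = x^2 - 2
x = 1.5
errors = []
for _ in range(5):
    x = x - (x * x - 2) / (2 * x)   # Newton update
    errors.append(abs(x - math.sqrt(2)))
# each error is roughly the square of the previous one (quadratic convergence)
```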

  1. Since you run the procedure repeatedly, "and it will be refined applying again the same procedure", you are producing a sequence of points. At each step of the procedure, you are asking a what-if question: "If we imagine the root were here (at $x_n$), where does the method say the root actually is (at $x_n - \frac{f(x_n)}{f'(x_n)}$)?" If you ever actually get that the output equals the input, then $f(x_n)$ is necessarily $0$ and you are done. In all other cases, you start with an $x_n$ and get an $x_{n+1}$. You should not imagine that this sequence of values is just repetitions of the one value $x_0$.

  2. Taylor series converge on an open disk centered at the center of expansion. Truncations of the series exactly agree with the function at the center and gradually (with caveats) disagree with the function as the distance from the center increases. If you want to use a truncation as a high quality model of a function, the distance from the center must be small. (For more on this, read up on the remainder term in Taylor's theorem, which is a bound on the magnitude of the disagreement between a truncation and the function.)

    We expect that $x_n$ is close to a root of $f$, so truncating a series centered at $x_n$ should give a small disagreement between the function and the truncated series far enough to approximate the root. This is really more of a hope than a fact. If a function is sufficiently violent in a neighborhood of a root, the linear truncation may be a terrible approximation.

    So why write $x_n + (x - x_n)$? Because $x_n$ is the center and we want to find a root $x$ that is close to $x_n$. So we write the argument as "$x_n + {}$ something presumably small". Now the power series expands in powers of $x - x_n$, so in powers of "something small". The hope, once again, is that $x - x_n$ is very small, so the successive powers of $x - x_n$ crush the contributions from the successive terms of the series. Once again, the plan is that, since $x_n$ is very close to a root, the first (few) term(s) of the series tell you everything you need to know about the behaviour of $f$ between $x_n$ and the actual root.

  3. If you break off at second order, you have to solve a quadratic equation and then you have two potential roots to investigate further. If we do this at every step, we end up with a bag of points, doubling in number at each step, each approaching a zero of $f$, but not necessarily all approaching the same zero of $f$. This is a serious bookkeeping problem compared to just following a sequence of points that converge to a zero of $f$. If we cut off at degree $3$ or $4$, we have to solve a cubic or quartic, which is not so easy, and the number of points in our bag grows even faster. Generically, cutting off at degree $5$ or above, we can't even write down explicit expressions for the roots, so our progress is even further hampered.

    Notice that if we chop off at degree $1$, $f$ and $f'$ are polynomials, and $x_n$ is rational, then $x_{n+1}$ is rational. Chopping off at degree $2$ risks introducing a new radical at every step, so the resulting expressions can become unwieldy. You may be thinking of this calculation using a bunch of decimal approximations from a calculator or a computer, but Newton was using this method by hand and was not passing through a sequence of decimal approximations.
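The "two potential roots per step" point can be made concrete: the second-order truncation is the quadratic $f(x_n) + h\,f'(x_n) + \frac{h^2}{2} f''(x_n) = 0$ in the step $h = x - x_n$, and the quadratic formula generally returns two candidates. A sketch with illustrative names:

```python
import math

def second_order_candidates(f, fp, fpp, xn):
    """Both roots of the degree-2 Taylor truncation of f around xn."""
    a, b, c = fpp(xn) / 2, fp(xn), f(xn)   # a*h^2 + b*h + c = 0
    disc = b * b - 4 * a * c
    if disc < 0:
        return []                           # no real candidate step
    s = math.sqrt(disc)
    return [xn + (-b + s) / (2 * a), xn + (-b - s) / (2 * a)]

# For f(x) = x^2 - 2 the truncation is exact, so from x = 1.5 the two
# candidates are the two distinct zeros +sqrt(2) and -sqrt(2)
cands = second_order_candidates(lambda x: x * x - 2,
                                lambda x: 2 * x,
                                lambda x: 2.0, 1.5)
```

This illustrates the bookkeeping issue: even in the simplest case, each step hands you candidates near *different* zeros of $f$.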