Understanding convergence of fixed point iteration


I was reading some slides explaining the convergence of fixed-point iteration, but honestly I don't have an intuitive idea of how fixed-point iteration methods converge.

Assuming $p < 1$, as in the fixed-point theorem, this yields

$$|x_{k+1} - x^*| = |g(x_k) - g(x^*)| \leq p |x_k - x^*|$$

This is a contraction by a factor $p$. So

$$|x_{k+1} - x^*| \leq p|x_k - x^*| \leq p^2|x_{k-1} - x^*| \leq \cdots \leq p^{k+1}|x_{0} - x^*| \rightarrow 0$$

The smaller the $p$, the faster the convergence.
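For concreteness (plugging in numbers of my own, not from the slides): with $p = \tfrac12$ and $|x_0 - x^*| = 1$ the bound gives $|x_{10} - x^*| \leq 2^{-10} \approx 10^{-3}$, while $p = 0.1$ would give $|x_{10} - x^*| \leq 10^{-10}$. I can follow that arithmetic, but not the bigger picture.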

Could someone explain this to me?

I also have another problem. There's then a statement saying:

For the root $x_1^*$ the conditions of the fixed-point theorem hold with $|g'(x)| < 0.4$, and we expect faster convergence than with the bisection method.

Regarding this last statement, I would have a few questions.

  1. What's the relation between the definition above and the derivative of $g$ being less than $0.4$?

  2. Why would it be faster than the bisection method? I know that the bisection method converges linearly, but honestly I still don't have an intuitive feel for these convergence rates.


There are 3 answers below.

  1. By the MVT, $|g(x)-g(x^*)|=|g'(\xi)|\cdot |x-x^*|$ for some intermediate $\xi$. Hence we can take $p=0.4$ here.

  2. The bisection method halves the interval in each step (exactly and always), whereas the fixed-point iteration for our $g$ makes the distance to the fixed point smaller by a factor $\le p=0.4<\frac12$; the sketch below compares the two numerically.
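A small numerical sketch of this comparison (my own example, not from the question): take $g(x) = 1 + 0.4\sin x$, so $|g'(x)| \le 0.4$, and apply bisection to $f(x) = x - g(x)$ on $[0,2]$.

```python
import math

# Illustrative example (not from the question): a g with |g'(x)| = |0.4*cos(x)| <= 0.4,
# so the fixed-point theorem applies with p = 0.4.
def g(x):
    return 1.0 + 0.4 * math.sin(x)

def f(x):                        # f(x) = 0  <=>  x is a fixed point of g
    return x - g(x)

# Reference value of the fixed point (many iterations), used only to measure errors.
x_star = 1.0
for _ in range(200):
    x_star = g(x_star)

a, b = 0.0, 2.0                  # bisection bracket: f(0) < 0 < f(2)
x = 0.0                          # fixed-point iteration starting point
print(f"{'step':>4} {'bisection error':>18} {'fixed-point error':>19}")
for k in range(1, 11):
    m = 0.5 * (a + b)            # bisection: halve the bracket every step
    if f(a) * f(m) <= 0:
        b = m
    else:
        a = m
    x = g(x)                     # fixed point: contract toward x* every step
    print(f"{k:>4} {abs(m - x_star):>18.2e} {abs(x - x_star):>19.2e}")
```

After ten steps the bisection error is of the order $2^{-10} \approx 10^{-3}$, while the fixed-point error is already far smaller, as the factor-$\le 0.4$ contraction (in fact much better near this particular fixed point, since $|g'(x^*)| \approx 0.07$) predicts.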


From your slides you have a contraction mapping $g$, i.e. a function with the following property: $|g(x)-g(y)| \leq p\cdot|x-y|$, where $p < 1$, and this holds for all $x$ and $y$ in the domain of $g$. For a fixed point $x^*$ we must have $g(x^*) = x^*$ by the definition of a fixed point, and by the construction of the iterative process we have $g(x_k) = x_{k+1}$ for all $k$. From this, the first line of your slide follows:

$$\left|x_{k+1} - x^* \right| = \left| g(x_k) - g(x^*)\right| \leq p \cdot |x_k - x^*|$$

What this is saying, intuitively, is that each time we apply $g$ to $x_k$ we move a little closer to $x^*$: the distance between the current iterate and the fixed point shrinks because $g$ is a contraction.
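To make the contraction property itself concrete, here is a small empirical check with an example of my own (not from the slides): $g(x) = \tfrac12\left(x + \frac{2}{x}\right)$ on $[1,2]$, where $|g'(x)| = |\tfrac12 - \tfrac{1}{x^2}| \le \tfrac12$, so $p = \tfrac12$ works.

```python
import random

# Empirical check of |g(x) - g(y)| <= p * |x - y| on [1, 2]
# for g(x) = (x + 2/x) / 2 (my own example; its fixed point is sqrt(2)).
def g(x):
    return 0.5 * (x + 2.0 / x)

p = 0.5
random.seed(0)
worst = 0.0
for _ in range(10_000):
    x, y = random.uniform(1, 2), random.uniform(1, 2)
    if x != y:
        worst = max(worst, abs(g(x) - g(y)) / abs(x - y))
print(f"largest observed |g(x)-g(y)| / |x-y| on [1,2]: {worst:.4f}  (bound p = {p})")
```

Every sampled pair stays below the bound $p = \tfrac12$, which is exactly the Lipschitz-type condition the fixed-point theorem needs.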

The size of $p$ matters for the speed of convergence because the smaller $p$ is, the faster $p^n \rightarrow 0$ as $n\rightarrow \infty$. If you compare $p=0.01$ and $p=10^{-6}$, it should be obvious that $10^{-6n}$ shrinks faster than $10^{-2n}$.
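As a quick check of how strongly $p$ matters, one can count how many steps the bound $p^k\,|x_0 - x^*|$ needs before it drops below a given tolerance (the values of $p$, the tolerance, and the assumption $|x_0 - x^*| = 1$ are arbitrary choices for illustration):

```python
import math

# Smallest k with p**k <= 1e-10, i.e. k >= log(1e-10) / log(p),
# assuming |x0 - x*| = 1. Both p and the tolerance are illustrative choices.
for p in (0.9, 0.5, 0.1, 0.01):
    k = math.ceil(math.log(1e-10) / math.log(p))
    print(f"p = {p:<5} needs about {k} iterations to reach 1e-10")
```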

For the rest, Hagen's answer is elegantly clear.


Other answers cover most of the question. I added this answer because there seems to be a lack of explanation concerning the relation of $p$ and $g'(x)$.

Notice that the ratio of consecutive signed errors is given by

$$\frac{x_{k+1}-x^\star}{x_k-x^\star}=\frac{g(x_k)-g(x^\star)}{x_k-x^\star}\underset{x_k~\to~x^\star}\longrightarrow g'(x^\star)$$

Without the limit, the mean value theorem states that this ratio must equal $g'(x)$ for some $x$ between $x_k$ and $x^\star$. These two facts allow you to

  • bound the ratio of the errors (how fast fixed-point iteration is guaranteed to converge).
  • find the asymptotic ratio of the errors (how fast fixed-point iteration eventually converges); a small numerical sketch below illustrates this.
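A quick numerical illustration of this asymptotic ratio, using an example of my own ($g(x) = \cos x$) rather than anything from the question:

```python
import math

# The ratio of consecutive signed errors tends to g'(x*).
# Here g(x) = cos(x), whose fixed point is the Dottie number x* ~ 0.739085,
# and g'(x*) = -sin(x*) ~ -0.6736.
def g(x):
    return math.cos(x)

x_star = 0.739085133215161       # fixed point of cos, precomputed
x = 1.0
prev_err = x - x_star
for k in range(1, 9):
    x = g(x)
    err = x - x_star
    print(f"k={k}  error ratio = {err / prev_err:+.6f}")
    prev_err = err
```

The printed ratios settle toward $g'(x^\star) = -\sin(x^\star) \approx -0.67$, so eventually each step shrinks the error by roughly that factor (and the sign flip shows the iterates alternating around $x^\star$).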

It's worth noting that this only works for one-dimensional differentiable functions. More generally you will want to look into Lipschitz continuous functions, where $p$ is the Lipschitz constant.