Fixed point iteration means that we repeatedly apply a function $f$ to its own output, starting from an initial guess, and see if we can find an $x_n$ such that $f(x_n) = x_n$, or at least close enough.
Let's say we want to use fixed point iteration to find the square root of some number $A$. We know we are looking for an $x$ such that $x^2 = A$ or equivalently $x = \frac{A}{x}$.
Then define our function to be:
$$f(x) = \frac{A}{x}$$
Let $A=2$, so we are looking for the square root of $2$. Start with an initial guess $x_0=1$. Then:
$$f(x_0)=\frac{2}{1}=2$$
We continue with $x_1=2$, applying the function again:
$$f(x_1)=\frac{2}{2}=1$$
And we are back at $x_2=1$ again. So the fixed point iteration will be stuck in a loop, oscillating around the number we want to find.
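To make this concrete, here is a quick Python sketch (my own toy code, not from any reference) that reproduces the 2-cycle:

```python
# Naive fixed point iteration x_{n+1} = f(x_n) with f(x) = A / x.
A = 2.0
x = 1.0  # initial guess x_0
for n in range(6):
    x = A / x
    print(n + 1, x)
# Output alternates forever: 2.0, 1.0, 2.0, 1.0, 2.0, 1.0
```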
Now I read that if we add $x$ to both sides of $x=\frac{A}{x}$ and then divide by $2$, we get:
$$f(x) = \frac{1}{2}(\frac{A}{x} + x)$$
Using this function, the iteration converges. This technique is called average damping.
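The same loop with the damped map (again just my own sketch) converges in a handful of steps:

```python
# Fixed point iteration with the averaged map f(x) = (A/x + x) / 2.
A = 2.0
x = 1.0  # same initial guess x_0
for n in range(5):
    x = 0.5 * (A / x + x)
    print(n + 1, x)
# 1.5, 1.4166..., 1.4142156..., 1.4142135... -> sqrt(2)
```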
What I wonder:
- Is there any special reason why we actually want to take the average of $x$ and $\frac{A}{x}$? I also tried adding $2x$ to both sides (and dividing by $3$), which gives $f(x)=\frac{1}{3}\left(\frac{A}{x}+2x\right)$; this also converges, but more slowly.
- Why does changing the function to return the average actually stop the oscillation and get the iteration "unstuck"? I mean, I can't see a reason why just taking the average results in the correct answer at all. What's the reason this damping makes the iteration converge?
The equation $x=f(x)$ is equivalent to $x=(1-\alpha)f(x)+\alpha\,x\equiv g_\alpha(x)$ for any $\alpha\in\Bbb R$ with $\alpha\neq1$. Near a solution $\bar x$ the error roughly satisfies $x_{n+1}-\bar x\approx g_\alpha'(\bar x)\,(x_n-\bar x)$, so the iteration converges locally exactly when $|g_\alpha'(\bar x)|<1$, and the smaller this value, the faster the convergence. We therefore try to choose $\alpha$ such that $|g_\alpha'(\bar x)|$ is as small as possible. In general, this needs some knowledge about $\bar x$.
When $f(x)=A/x$, we have $\bar x=\sqrt A$ and $$ g_\alpha'(\bar x)=-(1-\alpha)\frac{A}{\bar x^2}+\alpha=-(1-\alpha)+\alpha=-1+2\,\alpha. $$ For the undamped iteration ($\alpha=0$) this is $-1$, which explains the oscillation you observed. $|g_\alpha'(\bar x)|$ is minimized when $\alpha=1/2$, where it is $0$. Your variant $\frac{1}{3}\left(\frac{A}{x}+2x\right)$ corresponds to $\alpha=2/3$, giving $g_\alpha'(\bar x)=1/3$: still convergent, but slower.
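A small numerical check (my own sketch, not part of the derivation) makes the effect of $\alpha$ visible, including $\alpha=2/3$ from above:

```python
import math

# Error |x_n - sqrt(A)| under g_alpha(x) = (1 - alpha) * A/x + alpha * x.
A = 2.0
root = math.sqrt(A)
for alpha in (0.5, 2 / 3, 0.9):  # g'(root) = -1 + 2*alpha = 0, 1/3, 0.8
    x = 1.0
    errors = []
    for _ in range(5):
        x = (1 - alpha) * A / x + alpha * x
        errors.append(abs(x - root))
    print(alpha, ["%.1e" % e for e in errors])
# alpha = 1/2: the error roughly squares each step (quadratic convergence);
# alpha = 2/3: the error shrinks by about a factor 1/3 per step;
# alpha = 0.9: the error shrinks by only about a factor 0.8 per step.
```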