Why is computing $\sin(x)$ not backward stable?


Trefethen's Numerical Linear Algebra says that computing $\sin(x)$ should not be expected to be backward stable, because "the function has derivative equal to zero at certain points", for example at $x=\pi/2$. However, I am not convinced by the argument given there. In particular, suppose $x=\pi/2-\delta$. When $\delta$ is sufficiently small, we will have $\tilde{f}(x)=1=f(\tilde{x})$ with $\tilde{x}=\pi/2$. Thus $\|\tilde{x}-x\|/\|x\|=\delta/\|x\|\rightarrow0$ as $\delta\rightarrow0$. Why, then, is computing $\sin(x)$ not backward stable (at $x=\pi/2$)?

I would also like to know whether this is a general rule: if a function $f(x)$ has zero derivative at certain points, is its computation necessarily not backward stable?

Thanks for any comment.

Note: Here is the definition of backward stability in Trefethen's book. We say that an algorithm $\tilde f$ for a problem $f:X\rightarrow Y$ is backward stable if for each $x\in X$, there exists some $\tilde x$ with $$\frac{\|\tilde x-x\|}{\|x\|}=O(\epsilon_{\mathrm{machine}})$$ such that $\tilde f(x)=f(\tilde x)$.

Best answer

The issue is not at $x=\frac\pi2$ 'exactly' (where, yes, 1.0 can be correctly evaluated) but rather in its vicinity. Since $\sin(\frac\pi2+d)=\cos d=1-\frac{d^2}2+O(d^4)$, the actual $\tilde{x}$ just above $\frac\pi2$ for which $\sin(\tilde{x})=1-\epsilon$ is $\frac\pi2+\sqrt{2\epsilon} + O(\epsilon)$. Now look at values of $x$ around $x=\frac\pi2+\frac{\sqrt{2}}2\sqrt{\epsilon}$, the midpoint of the range between $\frac\pi2$ and that $\tilde{x}$. For such $x$, either $\tilde{f}(x)=1$, so $\tilde{x}=\frac\pi2$ and $|x-\frac\pi2|\approx C\sqrt{\epsilon}$, which is not $O(\epsilon)$; or $\tilde{f}(x)=1-\epsilon$, so $\tilde{x}=\frac\pi2+\sqrt{2}\sqrt{\epsilon}+O(\epsilon)$, and again $|x-\tilde{x}|\approx C\sqrt{\epsilon}$, which is not $O(\epsilon)$.
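To make this concrete, here is a minimal Python sketch that estimates the relative backward error of the library `math.sin` in IEEE double precision. The helper name `backward_error` is hypothetical, and the sketch assumes the exact preimages of the computed value $y$ near $\frac\pi2$ are $\frac\pi2\pm\arccos(y)$, evaluated here in floating point, so the estimate itself is only trustworthy down to roughly $10^{-16}$. That is enough to expose a gap of size $\sqrt{\epsilon}$.

```python
import math
import sys

def backward_error(x):
    """Estimate the relative backward error of math.sin at x (x near pi/2).

    Hypothetical helper for illustration only. The points t near pi/2 with
    sin(t) == y are pi/2 - acos(y) and pi/2 + acos(y); we take the one
    closest to x. Everything is computed in double precision, so the
    estimate is only meaningful down to about 1e-16.
    """
    y = math.sin(x)                 # computed value, a double
    d = math.acos(y)                # distance of the preimages from pi/2
    half_pi = math.pi / 2
    candidates = (half_pi - d, half_pi + d)
    x_tilde = min(candidates, key=lambda t: abs(t - x))
    return abs(x_tilde - x) / abs(x)

eps = sys.float_info.epsilon        # 2**-52 for IEEE doubles

# A point about halfway between pi/2 and the first preimage of 1 - 2**-53.
x_bad = math.pi / 2 + 2.0 ** -27
# A point well away from any zero of the derivative.
x_good = 1.0

print("near pi/2:", backward_error(x_bad) / eps, "machine epsilons")
print("at x = 1 :", backward_error(x_good) / eps, "machine epsilons")
```

Near $\frac\pi2$ the ratio should come out many orders of magnitude above 1, while at $x=1$ it should stay of order 1.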

And yes, you're exactly right that computation won't be backward stable for any 'sufficiently nice' function $f(x)$ in the vicinity of zeros of $f'(x)$. This is because of the Taylor expansion: near one of these zeros $x_0$, we have $f(x_0+\delta)=f(x_0)+\delta f'(x_0)+O(\delta^2) =f(x_0)+O(\delta^2)$, so a change of $\epsilon$ in the value corresponds to a change in $x$ of order $\sqrt{\epsilon}$ (assuming $f''(x_0)\neq0$). The 'area of confusion' for values differing by $\epsilon$ is therefore of size $\Theta(\sqrt{\epsilon})$.
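The $\Theta(\sqrt{\epsilon})$ size of this region can be checked numerically for $\sin$ at $x_0=\frac\pi2$. The sketch below bisects for the largest offset $d$ at which the library sine still returns exactly 1.0. The function name `plateau_half_width` and its bracket values are assumptions chosen for illustration, and the bisection only finds one transition point; it does not prove that the computed sine is monotone.

```python
import math
import sys

def plateau_half_width(lo=1e-10, hi=1e-6, steps=60):
    """Bisect for an offset d where math.sin(pi/2 + d) stops being 1.0.

    Hypothetical helper. Assumes sin(pi/2 + lo) == 1.0 and
    sin(pi/2 + hi) != 1.0 on entry; that invariant is kept at every step.
    """
    half_pi = math.pi / 2
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if math.sin(half_pi + mid) == 1.0:
            lo = mid            # still on the plateau
        else:
            hi = mid            # already off the plateau
    return lo

u = sys.float_info.epsilon / 2  # spacing of doubles just below 1.0 (2**-53)
width = plateau_half_width()

print("measured half-width :", width)
print("sqrt(u)             :", math.sqrt(u))
print("u itself            :", u)
```

The measured half-width should be on the scale of $\sqrt{u}$ rather than $u$, which is the square-root loss described above.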