Representation of rounding error in floating point arithmetic.


It is well known that, in a floating-point number system

$$ \mathbb{F}:=\left\{\pm \beta^{e}\left(\frac{d_1}{\beta}+\dots +\frac{d_t}{\beta^t}\right): d_i \in \{0,\dots,\beta-1\},\ d_1\neq 0,\ e_{\min}\leq e \leq e_{\max}\right\} \cup \{0\}, $$

the rounding function $\operatorname{rd}: A \rightarrow\mathbb{F}$ (for $A$ chosen appropriately) satisfies the following property:

$\forall x \in A \ \exists \delta \in \mathbb{R}: \quad \operatorname{rd}(x)=x(1+\delta), \quad |\delta|\leq u$

where $u:= \frac{1}{2}\beta^{1-t}$ is the unit roundoff.

I've seen in many texts that the following also holds:

$\forall x \in A \ \exists \delta' \in \mathbb{R}: \quad \operatorname{rd}(x)=x(1+\delta')^{-1}, \quad |\delta'|\leq u$

But I've never seen a proof. If I naively set $1+\delta=\frac{1}{1+\delta'}$, I get $\delta'=\frac{-\delta}{1+\delta}$, but for negative $\delta$ this gives:

$$ |\delta'|=\frac{|\delta|}{1-|\delta|}>|\delta| $$

So potentially $|\delta'| > u$. I'd be glad if someone could explain why the second bound still holds.

Best answer

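If I understand the question correctly, for all nonzero $x \in A$ there exists a unique integer $e \in [e_{\min}, e_{\max}]$ such that

$$ \beta^{e-1} \leqslant |x| < \beta^e. $$

The elements of $\mathbb{F}$ in $[\beta^{e-1}, \beta^e]$ are spaced $\beta^{e-t}$ apart and $\operatorname{rd}(x)$ is a nearest one, so

$$ \beta^{e-1} \leqslant |\operatorname{rd}(x)| \leqslant \beta^e \ \text{ and }\ |\operatorname{rd}(x) - x| \leqslant \frac{\beta^{e-t}}2. $$

Dividing the error bound by $|x| \geqslant \beta^{e-1}$ and by $|\operatorname{rd}(x)| \geqslant \beta^{e-1}$, respectively, gives

$$ \left\lvert\frac{\operatorname{rd}(x)}x - 1\right\rvert \leqslant \frac{\beta^{1-t}}2 \ \text{ and }\ \left\lvert\frac{x}{\operatorname{rd}(x)} - 1\right\rvert \leqslant \frac{\beta^{1-t}}2, $$

i.e.,

$$ \operatorname{rd}(x) = x(1 + \delta) = x(1 + \delta')^{-1} \ \text{ where }\ |\delta| \leqslant \frac{\beta^{1-t}}2 \ \text{ and }\ |\delta'| \leqslant \frac{\beta^{1-t}}2. $$

The bound on $\delta'$ therefore comes directly from the lower bound on $|\operatorname{rd}(x)|$, not from the bound on $\delta$; the naive substitution $\delta' = -\delta/(1+\delta)$ only uses $|\delta| \leqslant u$, which is why it appears to fail.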
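
Both bounds can also be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes Python floats are IEEE 754 binary64 ($\beta = 2$, $t = 53$, round to nearest with ties to even) and uses exact rational arithmetic, so that $\delta$ and $\delta'$ are computed without further rounding. The helper names `relative_errors` and `bounds_hold` are made up for this illustration.

```python
from fractions import Fraction
import random
import sys

# Assumption: Python floats are IEEE 754 binary64, i.e. beta = 2, t = 53.
BETA = sys.float_info.radix
T = sys.float_info.mant_dig
U = Fraction(1, 2) * Fraction(BETA) ** (1 - T)  # unit roundoff u = beta^(1-t) / 2


def relative_errors(x):
    """Return (delta, delta') with rd(x) = x (1 + delta) = x / (1 + delta')."""
    rd = Fraction(float(x))  # float(x) rounds x; converting back is exact
    delta = rd / x - 1
    delta_prime = x / rd - 1
    return delta, delta_prime


def bounds_hold(x):
    """Check |delta| <= u and |delta'| <= u for one exact rational x."""
    delta, delta_prime = relative_errors(x)
    return abs(delta) <= U and abs(delta_prime) <= U


# Random positive rationals well inside the normal range (no overflow/underflow).
rng = random.Random(0)
samples = [
    Fraction(rng.randint(1, 10**30), rng.randint(1, 10**30))
    for _ in range(10_000)
]
print(all(bounds_hold(x) for x in samples))

# A tie: x = 1 + 2^-53 lies exactly halfway between 1 and the next float.
tie = 1 + Fraction(1, 2**53)
print(relative_errors(tie))
```

For the tie, ties-to-even rounding gives $\operatorname{rd}(x) = 1$, so $\delta = -u/(1+u)$ and $\delta' = u$. This is the situation the question worries about: $\delta$ is negative and $|\delta'| > |\delta|$, yet $|\delta'|$ does not exceed $u$, because $|\delta|$ is itself strictly smaller than $u$ here.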