Consider the set of machine numbers $M(10, 2, 0)$. (The "zero-length" for the exponent is to be understood such that there is only the sign ± and 0 available for the exponent. We interpret "+" as "+1" and "−" as "−1". The available exponential factors are thus $10^{+1}, 10^0, 10^{−1}$.)
Perform the addition \begin{equation} 1+\frac{1}{2}+\frac{1}{3}+...+\frac{1}{12}\end{equation} first from left to right \begin{equation} (...((1+\frac{1}{2})+\frac{1}{3})+...)+\frac{1}{12}\end{equation} and then from right to left \begin{equation} 1+(...+(\frac{1}{10}+(\frac{1}{11}+\frac{1}{12}))...)\end{equation} in the set $M(10,2,0).$
Start by first mapping the set of real numbers $1,\frac{1}{2},\frac{1}{3},...,\frac{1}{12}\in \mathbb{R}$ onto $M(10,2,0)$ via the "float operator" $fl$. For simplicity, we assume that $fl$ works by chopping off the digits for which there is not enough storage in $M(10, 2, 0).$
Which summation order gives the more accurate result, when compared to the results of the same calculation performed in $\mathbb{R}$?
So I'm studying ODEs and this exercise just came up. I can't figure out how to do it correctly. My idea is that we convert all the fractions into exponential notation, i.e. $fl(\frac{1}{6})=1.66\cdot 10^{-1}$. but I'm not sure how to compute the following:
first from left to right \begin{equation} (...((1+\frac{1}{2})+\frac{1}{3})+...)+\frac{1}{12}\end{equation} and then from right to left \begin{equation} 1+(...+(\frac{1}{10}+(\frac{1}{11}+\frac{1}{12}))...)\end{equation} in the set $M(10,2,0).$
Should I convert all the fractions and then add them up or should I do it like this:
\begin{equation} fl(fl(...fl(1+\frac{1}{2})...)) \end{equation}
EDIT: I just tried it again. Let's consider from right to left, then I have \begin{equation} fl(\frac{1}{11}+\frac{1}{12}) \end{equation} Now I continue to \begin{equation} fl(fl(\frac{1}{11}+\frac{1}{12})+\frac{1}{10}) \end{equation} And \begin{equation} fl(fl(fl(\frac{1}{11}+\frac{1}{12})+\frac{1}{10})+\frac{1}{9}) \end{equation} If I keep going I end with 3.101
If I do from left to right by the same method, I get 3.1. Is it the correct way to do it?
The point of this exercise is to show how the inherent finiteness of floating-point representations affects calculations. Therefore the second approach is correct - i.e convert all fractions to floating point, then perform the indicated calculations -in floating point also-.
I leave it to you to figure out why the second approach is more accurate.