I really have no idea how to do these questions; in fact, I have no idea how to do any question in the paper. I have tried to work out what's going on in the course, which is called Computational Mathematics, but the lecturer's notes are honestly useless to someone who doesn't have a strong maths background.
The course also has a high failure rate.
I'm trying to find materials online, but the course isn't focused on just one topic. I even asked the lecturer for a recommended book, but he said there isn't one book that covers the whole module, so I'm really stuck. Here's a link to the exam paper: Link
Here's the first question from last year's paper:
Question 1.
(i) How many non-unique, non-normalised, numbers can be represented in a floating-point system defined by parameters $\beta, s, m, M$? $\tag*{ [5 Marks]}$
(ii) How many unique, normalised, numbers can be represented in a floating-point system defined by parameters above? Hint: it is proportional in some way to $\beta^{s-1}$ because no number other than zero itself can start with zero. $\tag*{[8 Marks]}$
(iii) Enumerate all the non-negative, non-unique, non-normalised, numbers in the floating-point system defined by parameters $\beta=4, s=2, m=-1, M=1$ $\tag*{[8 Marks] }$
(iv) Convert the numbers enumerated above into a floating-point system with $\beta=10, s=3, m=-1, M=1 .$ Comment on their distribution and some consequences for computation. $ \tag*{[4 Marks ]}$
Please note that I'm not asking for just the solutions but an explanation and probably a link, so that I can have a background knowledge and so that I'll be able to answer similar questions myself. This is not an assignment, I'm just preparing for an exam.
Thank you. :)
Edit:
2 $\quad$ Finite-precision floating point system - FPS
Let $F(\beta, s, m, M)$ be a system where
$\beta$ is the base, e.g. $2,4,10,$ or $16$
$s$ is the number of significant digits of the mantissa in base $\beta$.
$e \in \mathbb{Z}$ is an exponent, $m \leq e \leq M$
Each number $x \in F$ has the structure $$ \pm \, \underbrace{d_{1} d_{2} \ldots d_{s}}_{\text{mantissa}} \times \underbrace{\beta}_{\text{basis}}{}^{\overbrace{\pm e}^{\text{exponent}}} $$ If $x \neq 0$ then $x$ is normalised if $1 \leq d_{1} \leq \beta-1$ and $0 \leq d_{i} \leq \beta-1, i=2 \ldots s .$ If $x=0$ then $d_{1}=d_{2}=\ldots=d_{s}=0$
What I can offer is an analogy using base-10 floating-point numbers.
If a number is not normalized, it has infinitely many equivalent representations. For example:
$6.25 = 0.625 * 10^1 \quad (1)$
$6.25 = 0.0625 * 10^2 \quad (2)$
$6.25 = 625 * 10^{-2} \quad (3)$
This is not a 'healthy' situation, because one number has many representations that look different but are in fact equivalent. To make the representation of a number unique, normalization is necessary.
Normalization requires:
I. Every number should be written as $0.d_1d_2d_3 \ldots d_s$, where the $d_i$ are its digits.
II. The leading digit (i.e. $d_1$) must not be zero; the other digits have no such restriction. This is formally stated as $1 \le d_1 \le 10 - 1$ and $0 \le d_i \le 10 - 1$ for $i = 2, \ldots, s.$ Of the three forms above, only (1) meets this requirement.
III. To keep the representation numerically equal to the original number, the mantissa must be multiplied by a suitable power of the base, that is, by $10^e$ for some integer $e$, which can be zero, positive or negative.
IV. If the number is 0, then ……
Thus, the normalized representation of $6.25$ is $0.6250000000...00 * 10^1$; the number of 0s appended depends on the size of the ‘container’ or ‘WORD’.
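The normalization steps above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name `normalise` is made up, the input is a non-negative decimal string, digits that do not fit into the $s$ places are simply cut off, and zero is given the exponent 0 (the all-zero mantissa follows the notes quoted in the question; the exponent choice is arbitrary).

```python
from decimal import Decimal

def normalise(x: str, s: int) -> tuple[str, int]:
    """Write a non-negative decimal x as 0.d1 d2 ... ds * 10**e; return (digits, e)."""
    d = Decimal(x)
    if d == 0:
        # All digits zero, as in the notes; exponent 0 is an arbitrary assumption.
        return "0" * s, 0
    # Significant digits of x, without leading zeros.
    digits = "".join(str(k) for k in d.as_tuple().digits)
    # adjusted() is the exponent in d1.d2d3... form; 0.d1d2... form needs one more.
    e = d.adjusted() + 1
    # Pad with zeros (or cut off extra digits) to fill exactly s places.
    mantissa = (digits + "0" * s)[:s]
    return mantissa, e

for text in ("6.25", "0.0625", "703.94"):
    print(text, "->", normalise(text, 5))
```

With $s = 5$, the input 6.25 comes out as the digits 62500 with $e = 1$, i.e. $0.62500 * 10^1$.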
If the size of the WORD and the m and M (as in $m \le e \le M$) are given, one can find the smallest and largest number that this system can hold.
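To see what "smallest and largest" means in practice, here is a brute-force sketch that lists every positive normalized number of a small system. It assumes the convention used in this answer, value $= 0.d_1 d_2 \ldots d_s * \beta^e$; the course notes write the mantissa without a radix point, so the actual values in the course may be scaled differently. The function name `positive_normalised` is hypothetical.

```python
from fractions import Fraction
from itertools import product

def positive_normalised(beta: int, s: int, m: int, M: int) -> list[Fraction]:
    """All positive values 0.d1...ds * beta**e with d1 != 0 and m <= e <= M."""
    values = set()
    for e in range(m, M + 1):
        for d1 in range(1, beta):                      # leading digit is never zero
            for rest in product(range(beta), repeat=s - 1):
                digits = (d1, *rest)
                # Digit d in position i contributes d / beta**(i + 1).
                mantissa = sum(Fraction(d, beta ** (i + 1)) for i, d in enumerate(digits))
                values.add(mantissa * Fraction(beta) ** e)
    return sorted(values)

vals = positive_normalised(4, 2, -1, 1)
print(len(vals), "positive values, from", vals[0], "to", vals[-1])
```

Under this convention, each choice of leading digit, remaining digits and exponent gives a different value, so the number of positive normalized values is $(\beta - 1)\,\beta^{s-1}\,(M - m + 1)$, which agrees with the hint about $\beta^{s-1}$. Printing the sorted list also shows that the gaps between neighbouring numbers grow with the exponent.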
Example: addition of two floating-point numbers using a simplified representation
$6.25 + 703.94 = 0.625 * 10^1 + 0.70394 *10^3$
$= 0.00625 *10^3 + 0.70394 *10^3$
$= 0.71019 *10^3$
$= 710.19$
Note-1: Addition/subtraction can only be done after the two operands have been converted to the 'same level', i.e. the same exponent.
Note-2: Some digits may be lost in this conversion.
Note-3: The result might exceed the upper/lower limit (i.e. an overflow or underflow).
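Notes 1 and 2 can be made concrete with a small sketch. I am assuming positive operands stored as an $s$-digit integer mantissa plus an exponent, meaning $0.\text{mantissa} * 10^e$, and truncation (not rounding) whenever digits fall off the end; `add_aligned` is a name I made up, and overflow checking against $M$ is left out.

```python
def add_aligned(ma: int, ea: int, mb: int, eb: int, s: int) -> tuple[int, int]:
    """Add 0.ma * 10**ea and 0.mb * 10**eb, where ma and mb are s-digit integers."""
    # Make (ma, ea) the operand with the larger exponent.
    if ea < eb:
        ma, ea, mb, eb = mb, eb, ma, ea
    # Bring the smaller operand to the 'same level'; shifted-out digits are lost.
    mb //= 10 ** (ea - eb)
    total, e = ma + mb, ea
    # A carry produces s + 1 digits: renormalise and drop the last digit.
    if total >= 10 ** s:
        total //= 10
        e += 1
    return total, e

print(add_aligned(62500, 1, 70394, 3, 5))  # five digits: nothing is lost
print(add_aligned(6250, 1, 7039, 3, 4))    # four digits: low-order digits are lost
```

With five digits the result is the mantissa 71019 with exponent 3, i.e. 710.19 as in the example. With only four digits, 703.94 is already stored as 703.9 and the shifted 6.25 loses its last digits, so the sum comes out as 710.1.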
Example: multiplication of two floating-point numbers using a simplified representation
$6.25 * 703.94 = (0.625 * 10^1) * (0.70394 *10^3)$
$= (0.625 * 0.70394) *10^{1 + 3}$
$= 0.4399625 *10^4$
Note-4: The comment in Note-3 applies here as well.
Note-5: Truncation may occur.
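The truncation in Note-5 can be illustrated in the same style. Again this is only a sketch under my own assumptions: positive operands, an $s$-digit integer mantissa meaning $0.\text{mantissa} * 10^e$, truncation instead of rounding, and a made-up function name `mul_truncated`.

```python
def mul_truncated(ma: int, ea: int, mb: int, eb: int, s: int) -> tuple[int, int]:
    """Multiply 0.ma * 10**ea by 0.mb * 10**eb, keeping only s mantissa digits."""
    prod = ma * mb          # exact product of the mantissas, up to 2*s digits
    e = ea + eb             # exponents add
    # Two mantissas in [0.1, 1) give a product in [0.01, 1); if it fell
    # below 0.1, shift left by one digit to restore a non-zero leading digit.
    if prod < 10 ** (2 * s - 1):
        prod *= 10
        e -= 1
    # Keep the leading s digits; the rest is truncated.
    return prod // 10 ** s, e

print(mul_truncated(62500, 1, 70394, 3, 5))
```

For the example above with $s = 5$, the exact mantissa $0.4399625$ is cut to $0.43996$, so the stored product is 4399.6 rather than 4399.625.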
Further note: when an expression is evaluated in more than one step, different orders of operations may yield different results. For example, computing the average of $a$ and $b$ by (1) $(a + b)/2$ and by (2) $a + (b - a)/2$ may yield different results due to errors such as truncation.
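The two averaging formulas can be compared with Python's `decimal` module, which lets one limit the number of significant digits. The choice of three digits, truncation (`ROUND_DOWN`), and the inputs 5.01 and 5.02 are my own assumptions for illustration; `two_midpoints` is a hypothetical name.

```python
from decimal import Decimal, localcontext, ROUND_DOWN

def two_midpoints(a: str, b: str, digits: int = 3) -> tuple[Decimal, Decimal]:
    """Average a and b in 'digits'-digit truncating decimal arithmetic, two ways."""
    with localcontext() as ctx:
        ctx.prec = digits            # every operation keeps this many digits
        ctx.rounding = ROUND_DOWN    # truncate instead of rounding
        x, y = Decimal(a), Decimal(b)
        m1 = (x + y) / 2             # formula (1)
        m2 = x + (y - x) / 2         # formula (2)
    return m1, m2

m1, m2 = two_midpoints("5.01", "5.02")
print("formula (1):", m1, " formula (2):", m2)
```

In this setting $a + b = 10.03$ is truncated to $10.0$, so formula (1) returns 5.0, which is smaller than both inputs, while formula (2) returns 5.01.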