I am trying to look at how different floating-point types are stored in memory.
First I looked at System.Double (accessible via the keyword Double in VB.NET), which I think I understand. It is stored as follows:
$\pm (I+a)\times 2^b$
where
$\pm$ requires one bit.
$a \in [0,\sum_{k=1}^{52}2^{-k}]$ and consumes $52$ bits.
$b \in [-2^{10}+2, 2^{10}-1]$ and consumes the remaining $11$ bits.
So in total, this representation requires $64$ bits.
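The field layout above can be checked directly by reinterpreting a double's 64 bits as an integer. Here is a small sketch in Python (assuming the platform `float` is an IEEE 754 binary64, which is true on essentially all modern platforms); `double_fields` is a hypothetical helper name, not a standard function:

```python
import struct

def double_fields(x):
    # Reinterpret the 64-bit double as a big-endian unsigned integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                    # 1 bit: the +/- sign
    exponent = (bits >> 52) & 0x7FF      # 11 bits: b stored with bias 1023
    fraction = bits & ((1 << 52) - 1)    # 52 bits: a as an integer, a = fraction / 2^52
    return sign, exponent, fraction

# 6.5 = +1.625 * 2^2, so b = 2 is stored as 2 + 1023 = 1025,
# and a = 0.625 = 2^-1 + 2^-3 gives fraction bits 1010 followed by 48 zeros.
sign, exponent, fraction = double_fields(6.5)
print(sign, exponent - 1023, fraction / 2 ** 52)  # prints "0 2 0.625"
```

Note that the stored exponent field is biased: the mathematical exponent $b$ is recovered as the 11-bit field value minus $1023$.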
Note that $11$ bits can represent $2^{11}=2048$ distinct values, while the set of possible values of $b$ has size $|[-2^{10}+2, 2^{10}-1]|=2046$. The remaining two bit patterns, $All[0]:="00000000000"$ and $All[1]:="11111111111"$, are reserved to cover the special cases of $\pm 0$, $\pm \infty$ and the various kinds of not-a-number, $NaN$.
The implicit leading bit $I$ is $1$, except in the special case where $b$'s field is $All[0]$, in which case it is $0$; either way it consumes no additional bit.
For example, if $b$'s field is $All[0]$ and $a$ also has all its bits $0$, then the resulting number represents $\pm 0$.
Similarly, if $b$'s field is $All[1]$ and $a$ has all its bits $0$, then the resulting number represents $\pm\infty$.
Similarly, if $b$'s field is $All[1]$ and $a$ has at least one bit set to $1$, then the resulting numbers represent the various kinds of $NaN$.
Similarly, if $b$'s field is $All[0]$ and $a$ has at least one bit set to $1$, then $b$ is taken to be its minimum value $-2^{10}+2$ and, since $I=0$ in this case, numbers even closer to $0$ can be represented (subnormal numbers).
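These special cases can also be verified from the raw bit patterns. A quick sanity check in Python (again assuming the platform `float` is an IEEE 754 binary64; `bits_of` is just an illustrative helper):

```python
import math
import struct

def bits_of(x):
    # The 64 raw bits of a double, as an unsigned integer.
    (b,) = struct.unpack(">Q", struct.pack(">d", x))
    return b

# +0.0: all 64 bits are zero (exponent field All[0], fraction all zeros).
assert bits_of(0.0) == 0
# +infinity: exponent field All[1], fraction all zeros.
assert bits_of(math.inf) == 0x7FF << 52
# NaN: exponent field All[1], fraction nonzero.
nan_bits = bits_of(math.nan)
assert (nan_bits >> 52) & 0x7FF == 0x7FF and nan_bits & ((1 << 52) - 1) != 0
# Smallest positive subnormal: exponent field All[0], fraction = 1;
# its value is (1/2^52) * 2^(-1022) = 2^(-1074).
assert bits_of(2.0 ** -1074) == 1
```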
This is how I understand the entire Double (64-bit floating point). Now, trying to understand System.Decimal in the same spirit, I am unable to decipher all the details.
I understand that decimal has the following structure:
$\pm a \times {10}^{-b}$
where $a$ is a $96$-bit number, so $a \in [0,2^{96}-1]$, while $b \in [0,28]$ is a scaling factor that decides where to put the decimal point. Thus it can store up to $28$ decimal places.
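The value formula itself is easy to model exactly. A minimal sketch in Python, using exact rational arithmetic (`decimal_value` is a hypothetical helper modelling the formula $\pm a \times 10^{-b}$, not the .NET implementation):

```python
from fractions import Fraction

def decimal_value(sign, a, b):
    # sign is 0 or 1, a is the 96-bit integer coefficient, b the scale in [0, 28].
    assert sign in (0, 1) and 0 <= a < 2 ** 96 and 0 <= b <= 28
    return (-1) ** sign * Fraction(a, 10 ** b)

# 123.45 can be stored as a = 12345 with scale b = 2.
assert decimal_value(0, 12345, 2) == Fraction(12345, 100)
# The same value with a different scale: a = 123450, b = 3.
# (The scale is what lets Decimal distinguish e.g. "1.0" from "1.00".)
assert decimal_value(0, 123450, 3) == decimal_value(0, 12345, 2)
```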
The documentation states that this number requires $128$ bits, which means that, with $96$ bits for $a$ and one bit for the sign, the scaling factor uses the remaining $128-96-1=31$ bits. Can somebody explain how this scaling factor uses the $31$ bits? Presumably it holds a base-two representation of the scaling factor or something. Can someone who knows explain how the $31$ bits are used in this Decimal data type?
https://docs.microsoft.com/en-us/dotnet/visual-basic/language-reference/data-types/decimal-data-type