I'm trying to understand how arithmetic coding (AC) works. What bugs me most is how the probabilities are chosen or calculated. In the given example, the message abcde produced the following table:
Symbol | Probability | Range
---------------------------------
a | 30% | 0.00, 0.30
b | 15% | 0.30, 0.45
c | 25% | 0.45, 0.70
d | 10% | 0.70, 0.80
e | 20% | 0.80, 1.00
How come a has a probability of 30% and d only 10%? Is this chosen completely at random? What if the message were longer and included, say, another d: would the probability for d then be greater?
The calculation of the probabilities is not part of arithmetic encoding itself. That is the task of the probability model. The encoder only consumes the model's output: messages the model considers more likely get a more efficient (shorter) encoding.
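To make "probability model" concrete, here is a minimal sketch of one very simple model: count how often each symbol occurs in the message and use the relative frequencies. This is only an assumption about what a model could look like, not the model behind the table in the question (for abcde it would give every symbol 20%). The name `frequency_model` is made up for this illustration.

```python
from collections import Counter
from fractions import Fraction


def frequency_model(message):
    """Hypothetical static model: probability = relative frequency in the message."""
    counts = Counter(message)
    total = len(message)
    # Fractions keep the probabilities exact, so they always sum to 1.
    return {sym: Fraction(n, total) for sym, n in sorted(counts.items())}


if __name__ == "__main__":
    for msg in ("abcde", "abcded"):
        model = frequency_model(msg)
        print(msg, {sym: str(p) for sym, p in model.items()})
```

With a model like this, adding another d to the message does raise the probability of d (from 1/5 to 1/3), and every other symbol's share shrinks to compensate. A real codec also has to make the same model available to the decoder, for example by transmitting the counts or by adapting the counts identically on both sides.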
In the examples, the probabilities were likely picked at random (or picked to make a good example).
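Whatever their origin, once the probabilities are fixed the rest is mechanical. The sketch below takes the probabilities from the table in the question as given, rebuilds the Range column as cumulative sums, and narrows the interval symbol by symbol. The function names `build_ranges` and `encode_interval` are made up for this illustration, and it uses exact fractions rather than the fixed-precision integer arithmetic a practical coder would use.

```python
from fractions import Fraction

# Probabilities copied from the table in the question (taken as given).
PROBS = {
    "a": Fraction(30, 100),
    "b": Fraction(15, 100),
    "c": Fraction(25, 100),
    "d": Fraction(10, 100),
    "e": Fraction(20, 100),
}


def build_ranges(probs):
    """Give each symbol the half-open range [low, high) of cumulative probability."""
    ranges = {}
    low = Fraction(0)
    for sym, p in probs.items():
        ranges[sym] = (low, low + p)
        low += p
    return ranges


def encode_interval(message, ranges):
    """Narrow [0, 1) once per symbol; any number in the result identifies the message."""
    low, high = Fraction(0), Fraction(1)
    for sym in message:
        width = high - low
        sym_low, sym_high = ranges[sym]
        # Both new bounds are computed from the old low before assignment.
        low, high = low + width * sym_low, low + width * sym_high
    return low, high


if __name__ == "__main__":
    ranges = build_ranges(PROBS)
    for msg in ("abcde", "aaaaa", "ddddd"):
        low, high = encode_interval(msg, ranges)
        print(msg, "width =", float(high - low))
```

The width of the final interval equals the product of the symbol probabilities, so a message made of likely symbols (aaaaa) ends in a wider interval than one made of unlikely symbols (ddddd). A wider interval needs fewer digits to pin down a number inside it, which is why more likely messages encode more efficiently.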