I am trying to construct a regular expression describing the Roman numerals less than 2,000 (excluding $0$) using the alphabet $\{ M,D,C,L,X,V,I\}$.
Here are my thoughts: we can divide the task into describing the Roman numerals in separate ranges:
(i) Numerals from $1-9$: $I|II|III|IV|V|VI|VII|VIII|IX$
(ii) from $100-900$: $C|CC|CCC|CD|D|DC|DCC|DCCC|CM$
(iii) $1000$: $M$
The challenge is merging these in a convenient way. My thinking is that we connect these ranges of numbers in this way: $M?(C|CC|CCC|CD|D|DC|DCC|DCCC|CM)(I|II|III|IV|V|VI|VII|VIII|IX)?$. However, that excludes $CMXCIX$.
Any thoughts?
Your approach will work.
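Note that you skipped the range $10-90$: $X|XX|\dots|XC$. That missing group is why $CMXCIX$ is excluded. Each group other than $M$ also needs its own $?$, since a numeral such as $V$ has no hundreds part.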
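
To check the idea, here is a minimal Python sketch of the combined expression, with one optional group per decimal place. The names `ONES`, `TENS`, `HUNDREDS`, `is_roman_under_2000` and `to_roman` are made up for illustration. The lookahead `(?=.)` is a convenience of Python's `re` module used here to reject the empty string; it is not part of the formal regular-expression syntax, where one would instead take the union of the cases according to which group is the first non-empty one.

```python
import re

# One alternation per decimal place (hypothetical names, for illustration).
ONES = "(I|II|III|IV|V|VI|VII|VIII|IX)"
TENS = "(X|XX|XXX|XL|L|LX|LXX|LXXX|XC)"
HUNDREDS = "(C|CC|CCC|CD|D|DC|DCC|DCCC|CM)"

# Every place is optional; the lookahead (?=.) rules out the empty string,
# which would otherwise match and stand for 0.
ROMAN_UNDER_2000 = re.compile(rf"(?=.)M?{HUNDREDS}?{TENS}?{ONES}?")


def is_roman_under_2000(s):
    """Return True if the whole string s matches the expression."""
    return ROMAN_UNDER_2000.fullmatch(s) is not None


def to_roman(n):
    """Standard greedy conversion, used only to generate test inputs."""
    pairs = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
             (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
             (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    out = []
    for value, symbol in pairs:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)


if __name__ == "__main__":
    # Every numeral from 1 to 1999 should be accepted.
    print(all(is_roman_under_2000(to_roman(n)) for n in range(1, 2000)))
    # The empty string, MM (2000) and the malformed IIII should be rejected.
    print([is_roman_under_2000(s) for s in ["", "MM", "IIII"]])
```

Since the expression allows $2 \cdot 10 \cdot 10 \cdot 10 = 2000$ combinations of groups and only the empty one is rejected, it accepts exactly $1999$ strings, one for each number from $1$ to $1999$.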