I'd like to know about work on the partitioning of sequences:
Given a sequence $S$ with length $k$. Partition it into $n : n \leq k$ subsequences in order to maximize the information entropy $H$ of the frequency distribution $P$ of the subsequences.
You want to partition your sequence in such a way as to having a uniform frequency distribution of the subsequences.
Example given the sequence $S=<a,a,a,a,a,b>$ and $n=2$ We can partition it any way we like and have two unique subsequence such as $<a,a,a>$ and $<a,a,b>$
If we instead let $n=3$ two examples of partitions with different properties are $<a,a>$, $<a,a>$, $<a,b>$ and $<a,a,a>$, $<a,a>$, $<b>$. The first suggestion of partition have a lower entropy than the second, since the subsequence $<a,a>$ appears twice.
Are there any work done in this area and on problems like this?