I am looking at a compression technique that finds a permutation of the bits of a 64-bit block of data that collects all 1 bits at the beginning of the block, and sends all 0 bits to the end of the block.
The output of this operation is then a count value of the number of leading 1 bits in the block, and a permutation that sends all the bits back to their correct positions.
I have factored the permutation into a product of disjoint cycles and removed any fixed points.
Currently I find that the input 0x00000000ffffffff produces the most (32 transpositions) number of operations to restore the bits after encoding. This is far too many to store efficiently in the original space, as it takes 6-bits per value to store each permutation.
What can I do to compress the permutation? Obviously there are only 64-bits within which to store both the number of leading 1's and the permutation itself.
I suppose ideally I would like to find some kind of group generator that could reduce the entire permutation to some power of a single element.