File Compression Statistics


Statistically, how is the compression of a file possible? A file $n$ bytes long can take on $2^{8n}$ distinct values (the factor of 8 is because there are 8 bits in a byte). So if you decrease the size, it should not be statistically possible to compress a file and still be able to perfectly reassemble the original, because the total number of possible inputs is greater than the total number of possible outputs, so you can't always recover the input from the output.
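To make the count explicit: the number of distinct files strictly shorter than $n$ bytes is

$$\sum_{k=0}^{n-1} 2^{8k} \;=\; \frac{2^{8n}-1}{2^{8}-1} \;<\; 2^{8n},$$

so there are strictly fewer possible shorter files than there are $n$-byte files, and no one-to-one mapping into them can exist.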

So, what is going on (in statistical terms)?



Please note that I am not trying to disprove compression (I ♥ compression, because I am a programmer); I just want to know the statistics behind it.


BEST ANSWER

You have essentially rediscovered the theorem that any lossless (faithful) compression algorithm must increase the size of some input files in order to shrink others - one hopes, most others.
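To see the pigeonhole count in miniature, here is a small Python sketch (an illustration I am adding, not taken from the linked articles): it enumerates every 12-bit input and every strictly shorter bitstring, and checks that there are more inputs than possible shorter outputs, so no lossless scheme can shrink them all.

```python
from itertools import product

def bitstrings(length):
    """All bitstrings of exactly `length` bits, as tuples of 0s and 1s."""
    return list(product((0, 1), repeat=length))

L = 12
inputs = bitstrings(L)                                  # 2**12 = 4096 possible inputs
shorter = [s for k in range(L) for s in bitstrings(k)]  # 2**12 - 1 = 4095 strictly shorter bitstrings

print(len(inputs), len(shorter))  # prints: 4096 4095

# Pigeonhole: any scheme that maps every length-L input to a strictly shorter
# bitstring must map two different inputs to the same output, so it cannot be
# inverted; i.e. it cannot be lossless.
assert len(inputs) > len(shorter)
```

The same arithmetic at byte granularity is exactly the inequality in the question: every $n$-byte file cannot be compressed, only most of the ones you actually care about.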

Google "lossless compression pigeonhole principle" for several links.

https://en.wikipedia.org/wiki/Pigeonhole_principle#Uses_and_applications

http://matt.might.net/articles/why-infinite-or-guaranteed-file-compression-is-impossible/

...