How can I increase the compression of an RGB image from a typical smartphone camera from about 2:1 to perhaps 3:1?


I have been experimenting with a simple (fixed-resolution) image file compressor that I designed and wrote from scratch, for fun and to keep my programming skills sharp. So far I am only getting about 2:1 compression on a typical 24-bit RGB image (1024 x 576 pixels). I am trying to keep the algorithm simple. I know I can only get mild compression, but that is okay, since most of the byte savings come at the low ratios (going from no compression to 2:1 saves more bytes than going from 2:1 to 3:1). For example, a 24-bit BMP (bitmap) image at 1024 x 576 is about 1.68 MB, and the output file from my algorithm is about half of that. While that is decent, I would like an "easy" way to get better compression, without turning this into a programming nightmare.

So far I am only using 2 steps to get compression:

Step 1: Reduce the 24-bit color space to 4 bits for blue, 6 bits for green, and 5 bits for red (15 bits total). This alone reduces the input file (1.68 MB) to 5/8 of its size, about 1.05 MB. The bit allocation was chosen experimentally, trying more and less reduction per channel on several test images, until a good "balance" between visual quality and compression was found.
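For reference, Step 1 can be sketched in a few lines of Python (my own illustration, not the asker's code; the function names `quantize_pixel`/`dequantize_pixel` are made up). The quantizer drops low-order bits, and the dequantizer replicates the high bits downward so that pure black and pure white round-trip exactly:

```python
def quantize_pixel(b, g, r):
    """Reduce 24-bit BGR to 4/6/5 bits (blue/green/red) by dropping low bits."""
    return b >> 4, g >> 2, r >> 3   # keep top 4, 6, 5 bits respectively

def dequantize_pixel(b4, g6, r5):
    """Expand back to 8 bits per channel by replicating the high bits."""
    b = (b4 << 4) | b4          # 4 -> 8 bits
    g = (g6 << 2) | (g6 >> 4)   # 6 -> 8 bits
    r = (r5 << 3) | (r5 >> 2)   # 5 -> 8 bits
    return b, g, r
```

Bit replication (rather than padding with zeros) keeps the full 0-255 dynamic range after reconstruction.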

Step 2: I look for repeating exact patterns in the blue, green, and red pixel groups (using a separate lookup table per channel). Specifically, I take a group of 16 blue pixels, pack them down to 8 bytes (since they are 4 bits per pixel), enter that 8-byte pattern into a lookup table, and if I see it again (exact match), I encode it as a 2- or 3-byte reference (depending on where it is in the lookup table). Green has a group size of 8 pixels (which packs down to 6 bytes before entering the green lookup table), and red, like blue, has a group size of 16 pixels (but packs down to 10 bytes, since red uses 5 bits per pixel).
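The essence of Step 2 for the blue channel could look something like this (again my own sketch, with invented names; the on-disk token format is simplified to tuples rather than the 2-3 byte encoding described above):

```python
def pack_blue_group(pixels16):
    """Pack 16 blue values (already reduced to 4 bits each) into 8 bytes."""
    assert len(pixels16) == 16
    out = bytearray()
    for i in range(0, 16, 2):
        out.append((pixels16[i] << 4) | pixels16[i + 1])  # two nibbles per byte
    return bytes(out)

def encode_groups(groups):
    """Encode packed groups as ('literal', bytes) or ('match', index) tokens.

    A repeated group becomes a table index (2-3 bytes on disk in the
    scheme above); a new group is stored literally and added to the table.
    """
    table = {}   # pattern -> index of first occurrence
    tokens = []
    for g in groups:
        if g in table:
            tokens.append(("match", table[g]))
        else:
            table[g] = len(table)
            tokens.append(("literal", g))
    return tokens
```

The decoder rebuilds the same table in the same order, so no table needs to be stored in the file.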

Two things I might try are the following:

  1. After I create this output file, try to bit-pack it 8:7 by scanning each input block for at most 128 unique byte values, then compressing each 8 bytes into 7 bytes (of course I would have to tell the decoder which 128 unique values occur in the block, so there is 32 bytes of overhead there for a 256-bit bitmap). I would first check whether this method actually shrinks each block before using it; otherwise I would just store the block as-is in the output file.

  2. I might try a block size of 16 pixels for green (as I use for both blue and red), but because green has the least color reduction (from 8 bits to 6), a 16-pixel pattern is less likely to repeat exactly elsewhere in the image. It is a tradeoff: more compression per repeated pattern, but fewer of those repeats.
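The 8:7 packing in idea 1 could be sketched as follows (my illustration under the stated assumptions: at most 128 distinct byte values per block, a 32-byte presence bitmap, 7-bit codes packed MSB-first; `try_pack_8to7` is a hypothetical helper name):

```python
def try_pack_8to7(block):
    """Remap each byte of `block` to a 7-bit code if the block uses at most
    128 distinct byte values, so every 8 bytes fit in 7.  Returns
    (alphabet_bitmap, packed_bytes), or None if the block does not qualify.
    """
    alphabet = sorted(set(block))
    if len(alphabet) > 128:
        return None
    code = {b: i for i, b in enumerate(alphabet)}
    # 32-byte (256-bit) bitmap telling the decoder which byte values occur
    bitmap = bytearray(32)
    for b in alphabet:
        bitmap[b >> 3] |= 1 << (b & 7)
    # pack 7-bit codes MSB-first into a byte stream
    acc, nbits, packed = 0, 0, bytearray()
    for b in block:
        acc = (acc << 7) | code[b]
        nbits += 7
        while nbits >= 8:
            nbits -= 8
            packed.append((acc >> nbits) & 0xFF)
    if nbits:                                   # flush any trailing bits
        packed.append((acc << (8 - nbits)) & 0xFF)
    return bytes(bitmap), bytes(packed)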

Another thing I am wondering about is whether there are similar patterns across the 3 color "channels", so that I could have some "generic" lookup table shared by all colors. That is made more difficult by the fact that the 3 color "planes" have different packed group sizes (blue = 8 bytes, green = 6 bytes, red = 10 bytes). I process all of the blue pixels first, then all of the green, then lastly all of the red (that is the order they are stored in a BMP file, so I preserved it). So after processing the blue pixels, I will have a full blue lookup table of patterns; when processing green, perhaps an identical (or very similar) pattern will already have been seen in the blue table, and maybe I could use that to encode.

Another thing I am considering is having multiple lookup tables per primary color. Take blue, for example: instead of a single lookup table of fixed-length 8-byte patterns, I could also keep a table of 12- or even 16-byte patterns, check that table first for a match, and only fall back to the shorter-pattern table if nothing is found. The reason is that longer matches encode more efficiently (they compress better because there is less per-match overhead).
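A greedy two-tier encoder along those lines might look like this (my sketch, not the asker's code; the token format is again simplified, and `encode_two_tier` with its default 16/8-byte lengths is a hypothetical helper):

```python
def encode_two_tier(data, long_len=16, short_len=8):
    """Greedy encoder: at each position, first try a previously seen long
    pattern, then a short one; otherwise emit a literal short group and
    register both the short and the long pattern starting here.
    """
    long_table, short_table = {}, {}
    tokens = []
    i = 0
    while i + short_len <= len(data):
        long_pat = data[i:i + long_len]
        if len(long_pat) == long_len and long_pat in long_table:
            tokens.append(("long", long_table[long_pat]))
            i += long_len
            continue
        if len(long_pat) == long_len:          # remember for future matches
            long_table[long_pat] = len(long_table)
        short_pat = data[i:i + short_len]
        if short_pat in short_table:
            tokens.append(("short", short_table[short_pat]))
        else:
            short_table[short_pat] = len(short_table)
            tokens.append(("literal", short_pat))
        i += short_len
    return tokens
```

One design caveat: after a long match the encoder advances by `long_len`, which can shift the alignment of later groups relative to a fixed-grid scheme, so the decoder must replay the same greedy decisions.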

So what else can I do that is not so complicated, but can improve compression? Should I think about partial matches, near matches, longer matches, variable length matches... instead of just fixed length exact matches for each of the 3 primary color patterns?

By complicated, I mean I don't want to use Huffman codes or the equivalent, because they are not so easy to implement. I can use lookup tables, because those are easy for me.

Attached is my main test image, on which I am getting about 2:1 compression (note that the attachment itself is in JPG format, compressed about 8:1).


Some of you may be wondering why I am even bothering with something like this, when JPG (from 1992) can compress about 10:1 with very little visual difference from the original 24-bit BMP. The answer is that JPG is flawed (unless the minimal-compression setting is selected): it tends to create a "haze" (ringing) around high-contrast areas, such as thin tree branches against a bright sky. JPG also has a "generation" problem: if it is applied multiple times to the same image, the quality gets worse each time. My algorithm should, in theory, not get worse if it is "recompressed" multiple times. If I can get maybe 3:1 overall compression I would be happy, but I am currently only at about 2:1, which is the reason for this post... to improve it some.

Someone commented about trying PNG, which uses lossless compression similar to that of a zip file. Using the original BMP image (1.68 MB), PNG compressed it down to 1.27 MB (about 75% of the original file size). Note that the PNG image was not color reduced like mine was. My format reduces the colors (24-bit RGB down to 15-bit RGB), but as I already stated, for typical images (like those from a camera), that is hardly noticeable in most cases.

Someone else commented that "small" files do not compress well. As I mentioned, JPG can compress this same "small" image about 10:1 with not much visual difference (unless you zoom in), so that is not the problem. The problem is finding a better method of lookback/lookup encoding to compress repeating patterns.