Need help coming up with (or finding) an image metric for $N\times M$ images.

Say you have the set of all unsigned $8$-bit grayscale $N\times M$ images, so there are $256^{NM}$ images in this space. If the images were binary, you could represent each one as an $NM$-bit binary number, but since the intensity values range from $0$ to $255$, one bit per pixel is not enough. I've been trying to come up with a way to order these images so that I can calculate a distance metric between two images. Intuitively this seems possible: given an ordering, you could just measure each image's distance from the origin (which would likely be the all-$0$ image). I've looked around online but have found nothing helpful so far.

There is 1 best solution below

Trying to impose a total ordering on the set of images sounds like a really strange idea.

A slightly better idea would be to treat the images as vectors in $\mathbb{R}^{NM}$ (for grayscale images), or in $\mathbb{R}^{3NM}$ (if you treat the RGB channels separately). Now you can simply take the Euclidean distance, or the distance induced by the supremum norm, or any other metric already defined on $\mathbb{R}^d$. Depending on the application, you might consider transforming the colors into some other color space, like HSV, and only then treating the resulting image as a vector in $\mathbb{R}^d$.
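
To make the vector view concrete, here is a minimal sketch in Python with NumPy. The function name `image_distances` and the two sample images are made up for illustration. It flattens two grayscale images into vectors in $\mathbb{R}^{NM}$ and computes the Euclidean and supremum distances between them.

```python
import numpy as np

def image_distances(a, b):
    """Return (Euclidean, supremum) distances between two equal-shape images."""
    # Cast to float first: subtracting uint8 arrays directly would wrap around modulo 256.
    diff = a.astype(np.float64).ravel() - b.astype(np.float64).ravel()
    euclidean = float(np.sqrt(np.sum(diff ** 2)))
    supremum = float(np.max(np.abs(diff)))
    return euclidean, supremum

# Two tiny 2x2 grayscale images (values chosen arbitrarily for illustration).
img_a = np.array([[0, 10], [20, 30]], dtype=np.uint8)
img_b = np.array([[3, 10], [20, 34]], dtype=np.uint8)

euclid, sup = image_distances(img_a, img_b)
print(euclid, sup)  # 5.0 4.0
```

The pixelwise differences here are $-3, 0, 0, -4$, so the Euclidean distance is $\sqrt{9+16}=5$ and the supremum distance is $4$. For RGB images the same code should work unchanged on an array of shape $N\times M\times 3$, since `ravel()` flattens all channels into one vector.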

An even better idea would be to look into more advanced feature extraction methods and base your metric on the extracted features. You might start with a simple PCA (apparently, it was good enough for simple face recognition), and end with an arbitrarily complex machine learning algorithm that can actually understand what the image shows.
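
As a rough illustration of the PCA suggestion, the sketch below fits principal components on a stack of training images and measures distance between the projected feature vectors. Everything here is an assumption made for the example: the helper names `fit_pca` and `pca_distance`, the choice of $k=5$ components, and the random training data, which stands in for a real image collection.

```python
import numpy as np

def fit_pca(images, k):
    """Fit PCA on a stack of images with shape (n_images, N, M)."""
    X = images.reshape(len(images), -1).astype(np.float64)
    mean = X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k]

def pca_distance(a, b, mean, components):
    """Euclidean distance between two images after projecting onto the components."""
    fa = components @ (a.astype(np.float64).ravel() - mean)
    fb = components @ (b.astype(np.float64).ravel() - mean)
    return float(np.linalg.norm(fa - fb))

# Random 8x8 images as placeholder training data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
train = rng.integers(0, 256, size=(20, 8, 8), dtype=np.uint8)

mean, comps = fit_pca(train, k=5)
d = pca_distance(train[0], train[1], mean, comps)
print(d)
```

Note that a distance defined this way is only a pseudometric on the original image space: two different images can project to the same feature vector and so be at distance $0$. Because the components are orthonormal, it is also never larger than the plain Euclidean distance between the two images.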