Looking for a simple way to compare tiny images

category: code [glöplog]
I'm working on a thing that needs to do some basic image comparison of tiny images, around 16^2 pixels. I don't want to compare the things pixel by pixel as they might be 1 or 2 pixels offset or something. Any idea on how to do it?
I'm currently doing something like blurring both images and calculating the error (RMS), and it kind of works, but I would like to know if there's something "standard" for this sort of thing.
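For reference, a minimal sketch of the blur-then-RMS idea described above, assuming grayscale images stored as 2-D lists of numbers. The 3x3 box filter and the names box_blur, rms_error and blurred_rms are illustrative choices, not necessarily what is actually being used here.

```python
import math

def box_blur(img, radius=1):
    """Box-blur a 2-D list; border pixels average only the neighbours that exist."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    total += img[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out

def rms_error(a, b):
    """Root-mean-square difference between two same-sized images."""
    h, w = len(a), len(a[0])
    squared = sum((a[y][x] - b[y][x]) ** 2 for y in range(h) for x in range(w))
    return math.sqrt(squared / (h * w))

def blurred_rms(a, b, radius=1):
    """Blur both images first so small offsets are penalised less."""
    return rms_error(box_blur(a, radius), box_blur(b, radius))
```

With a single bright pixel moved by one position, blurred_rms comes out noticeably lower than the plain rms_error, which is the tolerance to small offsets being asked for.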
added on the 2018-08-16 17:19:25 by xernobyl xernobyl
By "comparing" do you mean you want a "distance" or just test for equality?
added on the 2018-08-16 17:28:52 by Gargaj Gargaj
If it's for block equality, I would try a "block digest" function which creates an approximation of the block. Sort of like a hash. Candidate replacement or "similar enough" blocks are those that get the same digest. Make a few different digest functions so it doesn't depend on exact rounding. So, each block goes to several groupings. When you want to find a replacement for a block, get all digest groups it belongs to, and do the final selection more accurately. Or if you want to find the best overall combination of replacements, look at how many groups a block belongs in, and select the "most median" blocks. Or try to construct new blocks that aren't necessarily exactly the originals but could replace many of them. One digest function might look at brightness values only, others also at color hues, in whatever color spaces you think are important for visual similarity. (These are ideas I thought about when making graphics converters for MSX, where the number of blocks/characters available is limited.)
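A rough sketch of how such digests could look, assuming 16x16 grayscale blocks with values 0..255. The 4x4 cell grid, the four brightness levels and the two rounding offsets are made-up parameters for illustration; digest, group_blocks and candidates are hypothetical names.

```python
from collections import defaultdict

def digest(block, cells=4, levels=4, offset=0):
    """Coarse signature: mean brightness of each cell, quantised to a few levels.
    `offset` shifts the rounding boundaries so different digests disagree
    about where the bucket edges fall."""
    step = len(block) // cells
    bucket = 256 // levels
    signature = []
    for cy in range(cells):
        for cx in range(cells):
            total = sum(block[y][x]
                        for y in range(cy * step, (cy + 1) * step)
                        for x in range(cx * step, (cx + 1) * step))
            mean = total / (step * step)
            signature.append(int(mean + offset) // bucket)
    return tuple(signature)

def group_blocks(blocks, offsets=(0, 32)):
    """Put every block into one group per digest function."""
    groups = defaultdict(list)
    for index, block in enumerate(blocks):
        for which, offset in enumerate(offsets):
            groups[(which, digest(block, offset=offset))].append(index)
    return groups

def candidates(groups, blocks, index, offsets=(0, 32)):
    """Indices of other blocks sharing at least one digest with blocks[index]."""
    found = set()
    for which, offset in enumerate(offsets):
        key = (which, digest(blocks[index], offset=offset))
        found.update(groups.get(key, []))
    found.discard(index)
    return found
```

The candidates set is only a shortlist; the final selection would still use a more accurate comparison, as described above.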
added on the 2018-08-16 17:56:50 by yzi yzi
equality, as in "probability of being this".
added on the 2018-08-16 18:46:01 by xernobyl xernobyl
xernobyl: yeah, you didn't specify the color bit-depth in your first post, but as you talk about "color hues" I guess it's something like 24 or 32 bits per pixel.

However, one standard way of doing things is to think of each bit as one "frame" in the overall comparison scheme. Then apply the same technique/method or whatever you're using on all the frames. One might use a distance metric (not sure if that is what Gargaj meant to say).

For a 24-bit color image, there are 6144 bits in your image (16 x 16 x 24), so the total range is 1 out of 2^6144, or about 3.38e+1849 combinations, which is a very big number and hard to guess for a pseudo-random generator that has no presumption of what it should recognize in your pattern. A simple method would be to use Euclidean distance as a metric in R^2, and use these weights as error measurements. Hope this can give you some directions.
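The arithmetic above is easy to double-check, assuming a 16x16 image at 24 bits per pixel. The variable names are only for illustration.

```python
import math

width = height = 16
bits_per_pixel = 24

# Total number of bits in one image.
total_bits = width * height * bits_per_pixel

# Number of distinct images that many bits can encode.
combinations = 2 ** total_bits

# Scientific notation: combinations ~= mantissa * 10**exponent.
exponent = math.floor(math.log10(combinations))
mantissa = 10 ** (total_bits * math.log10(2) - exponent)
```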
added on the 2018-08-16 21:09:47 by rudi rudi
First, you'll need to decide what you want to be invariant to. Translation? Rotation? Brightness changes? This will have an impact on your algorithm.

Assuming the images are anywhere near natural images (i.e., not just random noise), what you want is a way to align them. There are plenty of ways of doing this (e.g. gradient descent), but if you're only at 16x16 and the possible offsets are only 1–2 pixels each way (possibly in half- or quarter-pixels?), brute force is likely to work best.
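As a sketch of what brute force could mean here, assuming whole-pixel offsets only and grayscale images as 2-D lists: try every shift in a small window and keep the one with the lowest mean squared error over the overlapping area. The function names and the choice of MSE as the score are assumptions.

```python
def mse_at_offset(a, b, dx, dy):
    """Mean squared error between a[y][x] and b[y + dy][x + dx],
    computed only where both pixels exist."""
    h, w = len(a), len(a[0])
    total, count = 0.0, 0
    for y in range(max(0, -dy), min(h, h - dy)):
        for x in range(max(0, -dx), min(w, w - dx)):
            diff = a[y][x] - b[y + dy][x + dx]
            total += diff * diff
            count += 1
    return total / count

def best_alignment(a, b, max_shift=2):
    """Try all (dx, dy) within +/- max_shift; return (error, dx, dy) of the best."""
    best = None
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            err = mse_at_offset(a, b, dx, dy)
            if best is None or err < best[0]:
                best = (err, dx, dy)
    return best
```

For 16x16 images and shifts up to 2 pixels this is only 25 comparisons of at most 256 pixels each. The error at the best offset can then serve as the similarity score.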
added on the 2018-08-17 18:54:32 by Sesse Sesse
You are looking for SSIM or a variant (SSIM2, CW-SSIM, etc.); these were specifically designed for your purpose.

Start here: SSIM
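To give an idea of what the measure computes, here is a simplified sketch that evaluates the SSIM formula once over the whole image. Standard SSIM averages the same expression over small local windows, so this single-window version is an assumption made to keep the example short; global_ssim is a hypothetical name, and the constants use the commonly quoted defaults K1 = 0.01 and K2 = 0.03.

```python
def global_ssim(a, b, dynamic_range=255.0):
    """SSIM formula applied to two whole grayscale images (single window)."""
    xs = [float(v) for row in a for v in row]
    ys = [float(v) for row in b for v in row]
    n = len(xs)
    mu_x = sum(xs) / n
    mu_y = sum(ys) / n
    var_x = sum((x - mu_x) ** 2 for x in xs) / n
    var_y = sum((y - mu_y) ** 2 for y in ys) / n
    cov = sum((x - mu_x) * (y - mu_y) for x, y in zip(xs, ys)) / n
    # Small constants that stabilise the division for flat or dark images.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

Identical images score 1, and the score drops as means, contrast or structure diverge.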
added on the 2018-08-17 19:24:32 by HellMood HellMood
SSIM still wants alignment, though, if you can. But it's better than just directly subtracting, if and only if you talk about human perception. Again, it's important to figure out what the question is before looking for answers. :-)
added on the 2018-08-17 22:08:24 by Sesse Sesse
SSIM is a good starting point for research. He also could train an autoencoder and measure the difference in latent space, but I agree he should specify the case before delving deeper into possible answers ;)
added on the 2018-08-17 22:17:21 by HellMood HellMood
SSIM seems to be a great fit for my particular problem, thank you.
added on the 2018-08-20 17:11:22 by xernobyl xernobyl
since I know it works great for audio, why not try something in the frequency domain. FFT plus some kind of preprocessing (esp. with regard to normalization). really depends on the desired invariants though. working in a particular color representation, esp. separating color and brightness, might help. I'd take a look at OpenCV too, it provides some interesting tools.
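One possible reading of the frequency-domain idea, as a sketch: normalise each image to zero mean and unit variance, take the magnitude of its 2-D DFT, and compare the magnitude spectra. The magnitude is unchanged by circular shifts, which only approximates a real offset where pixels fall off the edge. A naive DFT is used to stay dependency-free (fine at 16x16), and all names are illustrative.

```python
import cmath
import math

def dft_1d(values):
    """Naive discrete Fourier transform of a short sequence."""
    n = len(values)
    return [sum(values[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def dft_magnitude(img):
    """Magnitude of the 2-D DFT, computed as row transforms then column transforms."""
    h, w = len(img), len(img[0])
    rows = [dft_1d(row) for row in img]
    cols = [dft_1d(list(col)) for col in zip(*rows)]  # indexed as cols[kx][ky]
    return [[abs(cols[kx][ky]) for kx in range(w)] for ky in range(h)]

def normalise(img):
    """Zero mean, unit variance; flat images are left at zero."""
    values = [v for row in img for v in row]
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0:
        std = 1.0
    return [[(v - mean) / std for v in row] for row in img]

def spectrum_distance(a, b):
    """RMS difference between the magnitude spectra of two normalised images."""
    ma = dft_magnitude(normalise(a))
    mb = dft_magnitude(normalise(b))
    h, w = len(ma), len(ma[0])
    squared = sum((ma[y][x] - mb[y][x]) ** 2 for y in range(h) for x in range(w))
    return math.sqrt(squared / (h * w))
```

The normalisation step also makes the distance insensitive to overall brightness and contrast changes, which may or may not be a wanted invariant.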
added on the 2018-08-20 19:56:32 by jco jco
Subtract image from its negative and count all non-zeros?
added on the 2018-08-21 13:09:56 by rutra80 rutra80
added on the 2018-08-21 13:31:38 by g0blinish g0blinish
@rutra80 that would work for 100% similar images, mine might be a few pixels off.
added on the 2018-08-21 18:14:59 by xernobyl xernobyl
if we're going into third-party software, then XnView has such a function with a definable threshold
added on the 2018-08-21 18:20:13 by rutra80 rutra80
we're not.
added on the 2018-08-21 19:29:24 by xernobyl xernobyl