Compressing large arrays of floating point data?

category: code [glöplog]
dunno about storage.

theory: i would do it packed. probably more of a normal or displacement map than a simple heightmap. i'd do dxt and 3d. maybe 2 or 3 mipmaps. transform and encode the fragment's gradient direction with a static lookup table that fits nicely into the cache, add some random noise and combine the mips on the graphics card for the final.

might be not cool tho. :/
added on the 2013-04-19 00:34:07 by yumeji yumeji
Random semi-offtopic fact about delta encoding: It helps to know what it actually _does_.

What it does is apply a special (and numerically stable with integers) high-pass filter to the signal (e.g. a sample in an XM/IT file, or your height map). With proper deltas, it has a frequency response of -infinity dB at DC and +6 dB at the Nyquist frequency (and, for what it's worth, a linear slope in between).

As in most signals the low-frequency components have a way higher amplitude than the high-frequency ones, this has two effects: a) equalizing the spectrum and b) reducing overall signal amplitude significantly. While a) is actually bad for LZ-style compression, b) directly results in less entropy, so if you're going for delta compression you NEED to have an entropy coder stage afterwards, like Huffman, arithmetic or even simple gamma coding. This is why the delta-encoded samples in XM/IT compress better with zip et al. It actually makes things worse with high-frequency content such as cymbals, but luckily those samples are quite short and nobody ever notices :)

Random fact ends, please continue.
added on the 2013-04-19 01:49:35 by kb_ kb_
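kb_'s delta-then-entropy-code pipeline can be sketched in a few lines of Python. This is a made-up illustration, not code from the thread: zlib stands in for the entropy-coder stage, and the ramp signal is a stand-in for a low-frequency sample.

```python
import zlib

def delta_encode(data: bytes) -> bytes:
    # wrap-around byte deltas (numerically stable with integers)
    return bytes((b - a) & 0xFF for a, b in zip(b"\0" + data, data))

def delta_decode(deltas: bytes) -> bytes:
    out, prev = bytearray(), 0
    for d in deltas:
        prev = (prev + d) & 0xFF
        out.append(prev)
    return bytes(out)

# a smooth 8-bit ramp, standing in for a low-frequency signal
smooth = bytes(min(i // 4, 255) for i in range(1024))
assert delta_decode(delta_encode(smooth)) == smooth  # lossless round trip

# compare compressed sizes with and without the delta stage
print(len(zlib.compress(smooth, 9)), len(zlib.compress(delta_encode(smooth), 9)))
```

On smooth data the deltas are mostly zeros and ones, which is exactly the reduced-amplitude signal the entropy coder then exploits.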
xtrim: I'm doing a similar erosion algorithm in mine (particles moving downhill), combined with a couple of other modification steps (random mini-avalanches on steep terrain, flowing of loose material above a certain slope). One thing it struggles with is the multi-resolution/fractal nature of erosion that you see in the real world. Maybe I'll try larger particles...
added on the 2013-04-19 01:58:13 by bloodnok bloodnok
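The "particles moving downhill" scheme bloodnok describes could look something like this minimal sketch. Everything here — the carry amount, the path-length cap, the 4-neighbourhood — is an assumption for illustration, not his actual code.

```python
import random

def erode(height, steps, carry=0.1, rng=None):
    # hypothetical minimal droplet erosion: each particle picks up
    # `carry` units of material per move and rolls to its lowest
    # 4-neighbour until stuck in a local minimum, then deposits.
    rng = rng or random.Random(1)
    n = len(height)
    for _ in range(steps):
        x, y = rng.randrange(n), rng.randrange(n)
        sediment = 0.0
        for _ in range(64):  # cap the path length
            nbrs = [(x + dx, y + dy)
                    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                    if 0 <= x + dx < n and 0 <= y + dy < n]
            lx, ly = min(nbrs, key=lambda p: height[p[0]][p[1]])
            if height[lx][ly] >= height[x][y]:
                break  # local minimum: stop here
            height[x][y] -= carry  # pick up material on the way down
            sediment += carry
            x, y = lx, ly
        height[x][y] += sediment  # deposit what the particle carried

# a little pyramid terrain to erode
terrain = [[float(min(i, j, 15 - i, 15 - j)) for j in range(16)]
           for i in range(16)]
erode(terrain, steps=200)
```

Note that material is only moved, never created or destroyed, so total height is conserved — a useful sanity check when tweaking the modification steps.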
bloodnok: Well there's a lot of different kinds of erosion that apply in mother nature... shores where the water waves hit the earth and implicitly take away some tiny stuff from it... wind that waves tiny little particles away... Quakes that shake everything around... fluvial erosion (caused by rain) where rain drops pick up particles and drops them in local minima... You could spend a LOT of time simulating realistically aged terrain.

Or you could go with an approximation and handpaint the rest ;)

kb_: interesting. When I've got the time i'll definitely try the delta code/entropy code part myself.

gasman: didn't know pngout. Will try that out too. thanks!
added on the 2013-04-19 09:34:59 by xTr1m xTr1m
gasman: Just tried out pngout. It shows no change on my high-frequency 11mb png. It only generates a 75% smaller version of my low-frequency 990kb png, which is negligible. Thanks for the tip anyway :) I really appreciate all this collective knowledge!
added on the 2013-04-19 09:44:31 by xTr1m xTr1m
If you go for a 4k² heightmap, you can use a 2k² LSB image and introduce a bit of noise when expanding and merging the MSB and LSB at 4k². This brings your LSB png8 down to 4mb.
added on the 2013-04-19 10:18:45 by p01 p01
s/noise/clever midpoint prediction thingamajig
added on the 2013-04-19 10:24:11 by p01 p01
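For reference, the full-resolution MSB/LSB split that p01's half-resolution trick builds on is lossless and trivial. A Python sketch (the 2k² downsampling and the midpoint-prediction expansion he suggests are left out):

```python
def split_planes(samples):
    # split 16-bit heights into an 8-bit MSB plane and an 8-bit LSB plane
    msb = bytes(s >> 8 for s in samples)
    lsb = bytes(s & 0xFF for s in samples)
    return msb, lsb

def merge_planes(msb, lsb):
    # exact reconstruction when both planes are kept at full resolution
    return [(m << 8) | l for m, l in zip(msb, lsb)]

heights = [0, 255, 256, 4095, 65535]
msb, lsb = split_planes(heights)
assert merge_planes(msb, lsb) == heights  # lossless at full resolution
```

Storing the LSB plane at half resolution makes the reconstruction lossy in the low byte only, which is where the noise/prediction step comes in.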
kb: but that was kind of my (and rrrola's, i suppose) point - terrain heightmaps are usually low-frequency data, and if you're lucky, you can probably even reduce your amplitude from 16bit to 8bit, and THEN do the LSB/MSB split, because...

Gargaj: if you look at the MSB map http://www.xtr1m.com/temp/heightMap2.png then you'll see that this is very good for png compression. I'm thinking of deltacoding only the LSB map.

...the point is that the LSB map is high-frequency, so your deltas will have considerable jumps in them: if the original values are 255, 256, 257, your LSB will be 255, 0, 1, and the delta will pretty much result in the same thing instead of the 1, 1, 1 you're looking for.
added on the 2013-04-19 13:01:15 by Gargaj Gargaj
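Gargaj's example in code. Worth noting, for anyone implementing this: plain signed deltas of the LSB plane do jump at every 256 boundary as he says, while wrap-around (mod-256) deltas happen to commute with taking the LSB and so reproduce the small raw deltas.

```python
values = [255, 256, 257]
lsb = [v & 0xFF for v in values]                          # [255, 0, 1]

# plain signed deltas of the LSB plane jump at the 256 boundary...
signed_deltas = [b - a for a, b in zip(lsb, lsb[1:])]     # [-255, 1]

# ...while wrap-around (mod 256) deltas match the raw deltas, since
# subtraction mod 256 commutes with taking the low byte
wrapped = [(b - a) & 0xFF for a, b in zip(lsb, lsb[1:])]  # [1, 1]
raw_deltas = [b - a for a, b in zip(values, values[1:])]  # [1, 1]
```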
I see. Good point Gargaj, will test that tomorrow
added on the 2013-04-19 13:04:06 by xTr1m xTr1m
While you're at it, try running the delta step twice. Delta is ideal for perfect linear slopes, 2 x delta is ideal for perfect parabolic slopes, 3 x delta for cubic slopes, etc. It's an easy parameter to tweak just to see if it makes a difference.

Of course in the end you might be better off with just a high-quality JPEG. But where's the fun in that.
added on the 2013-04-19 13:29:46 by doomdoom doomdoom
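doomdoom's repeated-delta suggestion as a quick sketch: a first-order delta flattens a linear slope to a constant, a second-order delta does the same for a parabola.

```python
def delta(xs):
    # one delta pass over a list of integers (no wrap-around)
    return [b - a for a, b in zip(xs, xs[1:])]

parabola = [i * i for i in range(10)]
first = delta(parabola)   # 1, 3, 5, ... -- a linear ramp
second = delta(first)     # 2, 2, 2, ... -- constant, ideal for RLE/entropy coding
```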
:) I <3 coding threads @ pouet!
added on the 2013-04-19 13:32:09 by raer raer
Speaking of pngout, there is also PNGGauntlet, which is a frontend for several png compression tools, not just pngout. Usually the output between the optimizers only differs by a few bytes for small gfx, but who knows if it helps with this kind of data...
One idea that I had long ago that makes perfect sense to me, but which I haven't tried out, is to transpose the bits of 32 dwords of similar data (float data, 32bit image data, ARM instructions, etc). That should give you plenty of bytes that are just 0xff and 0x00, which compress better than the original data.

Should do a trial run of that with various data for the heck of it.
added on the 2013-04-19 13:53:18 by sol_hsa sol_hsa
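sol_hsa's bit transpose, as a plain Python sketch (a real implementation would likely use bit tricks, but the structure is the same). Note the operation is its own inverse, so the same function also undoes the transform.

```python
def transpose32(dwords):
    # bit-plane transpose of 32 x 32-bit words: bit j of input word i
    # becomes bit i of output word j. Applying it twice restores the input.
    assert len(dwords) == 32
    out = []
    for j in range(32):
        w = 0
        for i in range(32):
            w |= ((dwords[i] >> j) & 1) << i
        out.append(w)
    return out

# "similar" data: small values leave the high bit planes all zero,
# which shows up as long runs of zero bytes after the transpose
small = list(range(32))
planes = transpose32(small)
assert all(p == 0 for p in planes[5:])  # bits 5..31 of 0..31 are all zero
assert transpose32(planes) == small     # involution: round trip restores input
```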
for some reason i like the frequency idea. how about splitting 16 into 3? low med high. shift nibbles in a +-128 range then delta hack around that. it's without big jumps. does that compress better? i dunno. later you just need to add it back together.
added on the 2013-04-19 15:03:21 by yumeji yumeji
Okay, tested the bit transpose. Doesn't work.

Ran it against the Canterbury corpus, as well as psd, tga, png, mp3 and some raw floating point vector data.

Result: in all cases except the tga, zlib compresses the result worse than the non-transposed version. In the tga case, transposing 8 bits was best, followed by 32 bits - and this was a 32 bit tga. So much for that idea..
added on the 2013-04-19 15:20:35 by sol_hsa sol_hsa
@sol_hsa okay. i guess it's just that random bits stay random bits. barely compressible.
added on the 2013-04-19 15:37:00 by yumeji yumeji
question: would it work to combine 3 4-bit bc1 greyscales to 12-bit precision if rendered out onto a rendertarget? it's lossy but it's within the bit range. doing that gradient thing i wrote, one could need 2 bytes for 16 pixels. 4bit x 4bit direction code and 2 greys. the whole heightmap with the low and med mips would end up 2.7MB. i just calced a lil. theory tho. :D
added on the 2013-04-19 17:43:29 by yumeji yumeji
sol: I actually did that for shader bytecode in 4k's, it doesn't ALWAYS work.
added on the 2013-04-19 20:05:10 by Gargaj Gargaj