Compressing large arrays of floating point data?

category: code [glöplog]

On my next demo I want to have a terrain, very detailed and all. Then I noticed that my height map was 64MB big for 4096² and 16MB for 2048² using 32-bit floats. That's a bit huge for my self-imposed size limit of 64MB for the whole demo.
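For reference, the raw sizes follow directly from side × side × bytes per sample. A small sketch of that arithmetic (the helper name `heightmap_bytes` is made up for illustration):

```python
def heightmap_bytes(side, bits_per_sample):
    """Raw size in bytes of a square, single-channel height map."""
    return side * side * bits_per_sample // 8

MIB = 1024 * 1024

# Size in MiB for each (side, bits per sample) combination.
sizes = {(side, bits): heightmap_bytes(side, bits) / MIB
         for side in (2048, 4096)
         for bits in (32, 16, 8)}

for (side, bits), mib in sorted(sizes.items()):
    print(f"{side}x{side} @ {bits} bit: {mib:g} MiB")
```

So simply dropping from 32-bit floats to 16-bit samples halves the raw size before any real compression is applied.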

I already tried using unsigned bytes, but that's not precise enough. I also tried zipping/RARing the raw heightmap stored as a 1D array of floats, but the compressed file is not much smaller.

Does anyone know a quick and already usable way of efficiently compressing (2D) floating point data? I'm talking about data consisting of power-of-two sized dimensions, two dimensions, one component per pixel, 32-bit floating point value per component. The target platform is Windows.

I've been playing around with the following thoughts:
- BC4_UNORM texture compression. I'm using DirectX 10.1 in my demo so this should pretty much work out of the box. Does anyone have experience with this and know the typical compression ratio?
- JPEG2000 lossy compression of unsigned 16-bit integers. Here I'm just guessing that 16 bits of uncompressed data is precise enough, and that the introduced compression artifacts don't matter for terrains. From what I've read, 0.25 bits/pixel still gives good image quality. I didn't try it though, so I can't tell how the compression artifacts manifest on the mesh.
- Take the array of floats, interpret it as an audio stream, and pipe it through an MP3 or OGG/Vorbis codec. Here I fear that I'll get discontinuities because this is a 1D compression: the locality of the data in the 2nd dimension is not considered.

I'd very much like to hear your honest opinion :) even if it's just a "WTF?! He seriously wants to save a heightmap as an MP3?!".

added on the 2013-04-17 23:40:42 by xTr1m

Some useful extra information would be the range of the data and the precision needed. Even if the height values themselves wouldn't fit in a byte, perhaps the deltas between consecutive samples would.
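A quick way to check this on real data is to quantize a row and look at the deltas. The sketch below assumes heights already normalized to [0, 1] and uses a made-up sine-shaped test row rather than an actual terrain; `quantize16` and `delta_stats` are hypothetical helper names.

```python
import math

def quantize16(heights):
    """Map floats in [0, 1] to integers in 0..65535."""
    return [min(65535, max(0, round(h * 65535))) for h in heights]

def delta_stats(samples):
    """Return (largest absolute delta, fraction of deltas fitting a signed byte)."""
    deltas = [b - a for a, b in zip(samples, samples[1:])]
    fits = sum(1 for d in deltas if -128 <= d <= 127)
    return max(abs(d) for d in deltas), fits / len(deltas)

# Hypothetical smooth test row: one gentle sine hill, 4096 samples wide.
row = [0.5 + 0.4 * math.sin(2 * math.pi * x / 4096) for x in range(4096)]
q = quantize16(row)
worst, fraction = delta_stats(q)
print("largest delta:", worst, "fraction fitting a byte:", fraction)
```

On an eroded terrain with sharp ridges the fraction will likely be below 1, in which case the outliers need an escape code or a wider delta type.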

added on the 2013-04-17 23:51:52 by Marq

Have a look at this - I found it was a very nice overview, with some good insights: http://www.farbrausch.de/~fg/seminars/workcompression.html

added on the 2013-04-17 23:56:26 by hornet

All values are between 0 and 1, and as I said, I believe that 16 bits (half-floats) should suffice.

added on the 2013-04-17 23:56:33 by xTr1m

Out of curiosity, where does the heightmap come from?

I would definitely try a quadtree of quantized values: store only the min and max float of each quad and quantize the samples to one byte. You might gain a few bytes by storing the min-max values once and using a 16-bit index of the min and max per quad.
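As a rough illustration of the per-quad quantization, here is a sketch that uses fixed-size tiles (a single quadtree level) instead of a full tree. The function names, the tile size and the test ramp are assumptions made for the example.

```python
def encode_block(values):
    """Store a block as (min, max, codes): one byte per sample within [min, max]."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0.0:
        return lo, hi, bytes(len(values))
    return lo, hi, bytes(round((v - lo) / span * 255) for v in values)

def decode_block(lo, hi, codes):
    """Rebuild approximate floats from the byte codes."""
    span = hi - lo
    return [lo + c / 255 * span for c in codes]

def encode_map(h, size, block):
    """Split a size x size row-major list into block x block tiles and encode each."""
    tiles = []
    for by in range(0, size, block):
        for bx in range(0, size, block):
            vals = [h[(by + y) * size + bx + x]
                    for y in range(block) for x in range(block)]
            tiles.append(encode_block(vals))
    return tiles

# Hypothetical 16x16 diagonal ramp, encoded as four 8x8 tiles.
size, block = 16, 8
h = [(x + y) / (2 * (size - 1)) for y in range(size) for x in range(size)]
tiles = encode_map(h, size, block)
print(len(tiles), "tiles,", sum(len(t[2]) for t in tiles), "code bytes")
```

The rounding error per sample is at most (max - min) / 510 of the tile, so flat tiles are stored almost exactly while steep tiles lose the most precision.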

added on the 2013-04-18 00:00:14 by p01

how big is it if you just store it as 16bit ints with deltas?

added on the 2013-04-18 00:03:34 by Gargaj

Of course, the quadtree is just a first step. You can take all the quantized values, and store them as a grayscale JPEG. The error should be fairly low.

added on the 2013-04-18 00:04:07 by p01

Is the quadtree used for anything, or is it actually just walking some consecutive 8x8 (or similar) blocks...? Unless there's some more logic glue that detects that a bigger quad will fit into a byte. Lossy compression might introduce seams between the blocks, too. Better just try :)

added on the 2013-04-18 00:16:03 by Marq

Do you really need such big height maps? You could try a lower res map plus a detail height map or two.

I got really nice results with a big landscape with 3x 1024^2 8bit height maps - in fact using the same height map 3 times at different scales still looks good. You need repeating height maps of course.
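A minimal sketch of that idea, assuming one small tileable 8-bit map sampled at three scales with nearest-neighbour lookups. The tile contents, the frequencies and the amplitudes are invented for the example and are not the values used in the landscape described above.

```python
import math

def make_tile(n):
    """Hypothetical tileable 8-bit height tile (n x n): a sine product that wraps cleanly."""
    return [[round(127.5 + 127.5 * math.sin(2 * math.pi * x / n)
                   * math.sin(2 * math.pi * y / n))
             for x in range(n)] for y in range(n)]

def height(tile, x, y, layers=((1, 1.0), (4, 0.25), (16, 0.0625))):
    """Sum the same tile at several (frequency, amplitude) pairs; indices wrap around."""
    n = len(tile)
    total = 0.0
    weight = 0.0
    for freq, amp in layers:
        total += amp * tile[(y * freq) % n][(x * freq) % n] / 255.0
        weight += amp
    return total / weight  # normalized back to [0, 1]

tile = make_tile(32)
print(height(tile, 5, 7), height(tile, 20, 3))
```

A real implementation would sample with filtering on the GPU; the point here is only that the stored data is one small tile rather than the full-resolution map.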

added on the 2013-04-18 00:16:37 by psonice

...that the contents of a bigger quad can be quantized to bytes, without the need to go to the next level...
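The "logic glue" could look roughly like this: keep a quad whole when byte quantization over its own min/max range stays within a tolerance, otherwise split it into four. This is only a sketch; `build_quadtree`, the tolerance and the test map are assumptions made for the example.

```python
def build_quadtree(h, size, x0, y0, n, tol, min_n=2):
    """Return leaves (x0, y0, n, lo, hi). A quad stays whole when one byte per
    sample over [lo, hi] keeps the rounding error within tol."""
    vals = [h[(y0 + y) * size + x0 + x] for y in range(n) for x in range(n)]
    lo, hi = min(vals), max(vals)
    # Worst-case rounding error of byte quantization over [lo, hi] is (hi - lo) / 510.
    if (hi - lo) / 510 <= tol or n <= min_n:
        return [(x0, y0, n, lo, hi)]
    half = n // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += build_quadtree(h, size, x0 + dx, y0 + dy, half, tol, min_n)
    return leaves

# Hypothetical 16x16 map: flat on the left half, a steep ramp on the right half.
size = 16
h = [0.0 if x < 8 else (x - 8) / 7 for y in range(size) for x in range(size)]
leaves = build_quadtree(h, size, 0, 0, size, tol=1e-4)
print(len(leaves), "leaves, sizes:", sorted({leaf[2] for leaf in leaves}))
```

Flat regions end up as large leaves and steep regions as many small ones, which is where the seams mentioned above would need watching if a lossy codec is applied per leaf.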

added on the 2013-04-18 00:17:35 by Marq

Your JPEG2000 idea seems pretty reasonable.
Note: if all your values are between 0 and 1, use a custom float format rather than half-floats, otherwise you waste bits.
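To see the wasted bits, compare a round trip through IEEE 754 half-floats with a round trip through plain 16-bit fixed point. Half-floats spend one bit on the sign and five on the exponent, so in [0.5, 1) their step is 2^-11, while 16-bit fixed point has a uniform step of 1/65535. The sketch uses 16-bit fixed point as a simple stand-in for the custom format suggested above; the sample set is arbitrary.

```python
import struct

def to_half_and_back(x):
    """Round-trip a float through IEEE 754 binary16 (struct format 'e')."""
    return struct.unpack('<e', struct.pack('<e', x))[0]

def to_u16_and_back(x):
    """Round-trip a float in [0, 1] through 16-bit unsigned fixed point."""
    return round(x * 65535) / 65535

# Evenly spaced test values covering [0, 1].
samples = [i / 9999 for i in range(10000)]
half_err = max(abs(x - to_half_and_back(x)) for x in samples)
u16_err = max(abs(x - to_u16_and_back(x)) for x in samples)
print("worst half-float error: ", half_err)
print("worst fixed-point error:", u16_err)
```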

added on the 2013-04-18 00:22:31 by ponce

The height map is just a bitmap, and there are many ways of compressing a bitmap. Which one is best suited depends on how the terrain actually looks. If the terrain is pretty smooth, you might try storing bytes or even nibbles and then use filtering to restore the smoothness. If it has many plateaus, even run-length encoding is worth considering.
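For the plateau case, a bare-bones run-length coder is only a few lines. This is a sketch on a made-up row of quantized samples; a real format would also need to pack the runs into bytes.

```python
def rle_encode(samples):
    """Collapse runs of equal samples into [value, run_length] pairs."""
    runs = []
    for v in samples:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs

def rle_decode(runs):
    """Expand [value, run_length] pairs back into the original samples."""
    out = []
    for v, n in runs:
        out.extend([v] * n)
    return out

# Hypothetical row: two plateaus joined by short slopes.
row = [10] * 50 + [11, 12, 13] + [40] * 200 + [39, 38]
runs = rle_encode(row)
print(len(row), "samples ->", len(runs), "runs")
```

It only pays off when long runs of identical values really exist after quantization; on slopes every sample becomes its own run.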

Since all the values are between 0 and 1, floats don't make sense either way. You're just wasting space.

A thing that definitely needs trying would be arithmetic coding on delta values. This would also make it easy to quantise to 11 bits per pixel or whatever.
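Before writing an arithmetic coder, it is cheap to estimate what it could achieve: the zeroth-order entropy of the symbols is roughly the bits per sample an ideal coder with a static model approaches. The sketch below quantizes to 11 bits and compares raw values with deltas; the test row is synthetic and the helper names are made up.

```python
import math
from collections import Counter

def quantize(heights, bits):
    """Map floats in [0, 1] to integers in 0..2^bits - 1."""
    top = (1 << bits) - 1
    return [round(h * top) for h in heights]

def entropy_bits_per_symbol(symbols):
    """Zeroth-order Shannon entropy in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

# Hypothetical smooth test row, quantized to 11 bits.
row = [0.5 + 0.4 * math.sin(2 * math.pi * x / 4096) for x in range(4096)]
q = quantize(row, 11)
deltas = [b - a for a, b in zip(q, q[1:])]
raw_bits = entropy_bits_per_symbol(q)
delta_bits = entropy_bits_per_symbol(deltas)
print("raw:", raw_bits, "bits/sample, deltas:", delta_bits, "bits/sample")
```

Running the same measurement on the real heightmap at several bit depths shows quickly whether the delta step is worth it.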

added on the 2013-04-18 00:31:56 by doomdoom

I think floats don't make much sense for height maps – you don't need the detail around 0. I would do this:
- Map heights to integers: the highest value goes to (2^B)-1 and the lowest value to 0. Pick a B that's precise enough for you: 10-16 bits should be ok. Store the integers in two bytes.
- Do a 2D delta filter. Predict h[x][y] to be (h[x-1][y] + h[x][y-1] - h[x-1][y-1])*K + (h[x-1][y] + h[x][y-1])/2*(1-K) and store only the differences between the predictions and h[x][y]. Pick a K between 0 and 1 that gives the best compression.
- Add a constant to the differences (so that they don't jump between 0x000x and 0xFFFx). 0x8080 works well.
- Store the high bytes, then the low bytes.
- Use a standard compressor on the result (such as zip or lzma).

- If the heights don't need to be exact, store only 6-8 bits and do the lowest bits randomly or procedurally.
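The steps above can be sketched as follows. This is one possible reading of the recipe, not a reference implementation: the border handling, the rounding of the prediction, the default K = 0.75 and the choice of lzma are assumptions, and the test surface is synthetic.

```python
import lzma
import math

def predict(q, w, x, y, k):
    """Blend of the plane predictor (a + b - c) and the average (a + b) / 2."""
    if x == 0 and y == 0:
        return 0
    if y == 0:
        return q[x - 1]            # first row: left neighbour only (assumption)
    if x == 0:
        return q[(y - 1) * w]      # first column: upper neighbour only (assumption)
    a = q[y * w + x - 1]           # left
    b = q[(y - 1) * w + x]         # up
    c = q[(y - 1) * w + x - 1]     # up-left
    return math.floor((a + b - c) * k + (a + b) / 2 * (1 - k) + 0.5)

def encode(heights, w, h, bits=16, k=0.75):
    """Quantize, predict, bias by 0x8080, split into byte planes, compress."""
    top = (1 << bits) - 1
    lo, hi = min(heights), max(heights)
    scale = top / (hi - lo) if hi > lo else 0.0
    q = [round((v - lo) * scale) for v in heights]
    res = [(q[y * w + x] - predict(q, w, x, y, k) + 0x8080) & 0xFFFF
           for y in range(h) for x in range(w)]
    planes = bytes(r >> 8 for r in res) + bytes(r & 0xFF for r in res)
    return q, lzma.compress(planes)

def decode(blob, w, h, k=0.75):
    """Invert encode(): rebuild the quantized integers in scan order."""
    planes = lzma.decompress(blob)
    n = w * h
    q = [0] * n
    for y in range(h):
        for x in range(w):
            i = y * w + x
            r = (planes[i] << 8) | planes[n + i]
            q[i] = (r - 0x8080 + predict(q, w, x, y, k)) & 0xFFFF
    return q

# Hypothetical smooth 64x64 surface.
w = h = 64
surface = [0.5 + 0.25 * math.sin(x / 40) + 0.25 * math.cos(y / 50)
           for y in range(h) for x in range(w)]
q, blob = encode(surface, w, h)
restored = decode(blob, w, h)
print(len(blob), "bytes compressed vs", 2 * w * h, "bytes of raw 16-bit data")
```

The decoder only reads neighbours it has already rebuilt, so the integer round trip is lossless. To map back to floats, lo and hi would have to be stored alongside the blob.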

added on the 2013-04-18 00:59:32 by rrrola

Thing is, anything that is a delta-based approach (+ Huffman encoding) has already been invented: it's called PNG (and yes, it can store 16-bit grayscale images).

If that's not enough, I'd totally go with wavelet-based compression (JPEG2000).
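For the PNG route, a 16-bit grayscale file can be produced with nothing but zlib. The sketch below always uses the Sub filter, whereas a real encoder would normally pick a filter per row; `write_gray16_png` is a made-up name. Note that PNG filters work on bytes, so with 16-bit samples the "delta" is taken between bytes two positions apart, not between whole 16-bit values.

```python
import struct
import zlib

PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'

def chunk(kind, data):
    """One PNG chunk: length, type, data, CRC-32 over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack('>I', len(data)) + kind + data + struct.pack('>I', crc)

def write_gray16_png(samples, width, height):
    """Encode row-major 16-bit samples (0..65535) as a grayscale PNG."""
    raw = bytearray()
    for y in range(height):
        # PNG stores 16-bit samples big-endian.
        row = struct.pack('>%dH' % width, *samples[y * width:(y + 1) * width])
        raw.append(1)  # filter type 1 (Sub)
        # Sub filter: subtract the byte one pixel (2 bytes) to the left, modulo 256.
        raw.extend((row[i] - (row[i - 2] if i >= 2 else 0)) & 0xFF
                   for i in range(len(row)))
    # IHDR: width, height, bit depth 16, colour type 0 (grayscale),
    # compression 0, filter method 0, no interlacing.
    ihdr = struct.pack('>IIBBBBB', width, height, 16, 0, 0, 0, 0)
    return (PNG_SIGNATURE + chunk(b'IHDR', ihdr)
            + chunk(b'IDAT', zlib.compress(bytes(raw), 9))
            + chunk(b'IEND', b''))

# Hypothetical 8x4 test image.
w, h = 8, 4
samples = [(x * 1000 + y * 37) % 65536 for y in range(h) for x in range(w)]
png = write_gray16_png(samples, w, h)
print(len(png), "bytes")
```

Writing `png` to a file with a .png extension should give something any image viewer with 16-bit support can open.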

added on the 2013-04-18 08:49:29 by iq

PROCEDURAL.

added on the 2013-04-18 08:53:05 by maytz

Thanks everyone for those great answers! To answer some of your questions:

- my heightmap's "seed" should be procedural, I like to tweak erosion/plateaux parameters. But I won't precalc it in the demo, because:
- My artist will handpaint details on the heightmap, therefore it has to be stored in a file.

This is a screenshot of one of my procedural heightmaps:
[image: screenshot]

Here's one of the heightmaps: 2048x2048 png
The screenshot and the heightmap don't belong together; I uploaded them at different stages of development.

iq: didn't know that png already did that, gotta check that out as well!
psonice: I considered detail maps, but that would also distort the hand-painted details... so I guess that's not an option.
maytz: procedural generation already takes 45 seconds for that, due to the bruteforce and highly serial erosion algorithm I'm using. Precalc is not an option anymore :)

I'll definitely try the 16-bit grayscale PNG/JPEG2000/rrrola_suggestion first. I'll reply here when I get results :)

added on the 2013-04-18 10:15:42 by xTr1m

xtrim: one method i played with is having a multi-channel height map. 1st channel is height, 2nd channel can be used to select from various detail textures - this way you can select a 'forest' detail texture for your trees, and a 'rocky' texture for bare rock and so on. You could keep the 'distortion' minimal that way, or just mask out areas you don't want detailing.

Of course that's not going to help reduce the file size if you're adding more data to the texture ;) But you probably only need 2-4 bits for the 2nd channel. Maybe it could be combined with the height data, although keeping it fast is important too.

added on the 2013-04-18 10:45:57 by psonice

xTr1m: Would it make things easier/smaller if only the erosion + hand-painted details were stored, possibly at a lower floating point resolution, and the initial heightmap was generated?

Anyways, I think the quadtree + full-size 8-bit grayscale JPEG approach would yield decent results. Not sure about the total compression ratio though :p

added on the 2013-04-18 11:09:21 by p01

p01: Regarding your question: not really. I initially start with an 8x8 seed heightmap and perform a series of erosion, diamond-square upscaling and plateaux steps, in any order I want, until I reach my desired size of either 2k² or 4k². I see no way of splitting that info into two textures.

added on the 2013-04-18 11:39:10 by xTr1m

don't compress, p-generate. problem solved?

added on the 2013-04-18 11:39:18 by rudi

iq: I even found some working code for your suggestion: http://stackoverflow.com/questions/8818206/16-bit-grayscale-png. That's the first thing I'll try tonight

added on the 2013-04-18 11:43:48 by xTr1m

xtrim: cheers for creating this thread - might come in handy when I want to compress my 4096^2 float32 terrain :)

What sort of process are you using for erosion? You may think 45s of brute forcing it is a long time, but I leave mine for minutes before it looks half-decent.

added on the 2013-04-18 11:45:14 by bloodnok

xtrim: I think what p01 means is that you generate your texture, then you give it to the artist to hand-paint the details on the top. If you then subtract the generated image from the final one, you're left with the hand-painted parts only. If large parts are untouched, the hand-painted bit is highly compressible since it's mostly 0s.

So generate the terrain, then apply the hand-painted part which you're storing in a much smaller format.
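A small sketch of the subtract-and-store idea, assuming both maps are already quantized to 16 bits and the generator is deterministic. The "generated" map here is seeded noise chosen to be hard to compress, and the painted patch is invented, so the sizes only illustrate the effect.

```python
import random
import zlib

def to_bytes16(q):
    """Pack 16-bit samples as a high-byte plane followed by a low-byte plane."""
    return bytes(v >> 8 for v in q) + bytes(v & 0xFF for v in q)

def residual(painted, generated):
    """Per-sample difference modulo 65536; zero wherever nothing was painted."""
    return [(p - g) & 0xFFFF for p, g in zip(painted, generated)]

def apply_residual(generated, res):
    """Rebuild the painted map from the regenerated map plus the stored residual."""
    return [(g + r) & 0xFFFF for g, r in zip(generated, res)]

# Hypothetical data: a noisy 64x64 "generated" map and a small hand-painted patch.
size = 64
rng = random.Random(1)
generated = [rng.randrange(65536) for _ in range(size * size)]
painted = list(generated)
for y in range(10, 20):
    for x in range(10, 20):
        painted[y * size + x] = min(65535, painted[y * size + x] + 500)

res = residual(painted, generated)
full_size = len(zlib.compress(to_bytes16(painted), 9))
res_size = len(zlib.compress(to_bytes16(res), 9))
print("full map:", full_size, "bytes, residual only:", res_size, "bytes")
```

The catch is that the demo must regenerate exactly the same base map at load time, which brings the precalc time back into play.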

If you're hand-painting, then eroding, then hand-painting more, then upscaling, etc., then it gets a lot more complex - might still be worth considering though.

Another option: build a simple paint tool that records brush strokes + brush textures, then you don't have to store the texture at all. Might be more work than you're up for though :)

added on the 2013-04-18 11:54:15 by psonice

As I said earlier, just the generation step is taking 45 seconds. You don't want to wait more than 30 seconds for a demo precalc. And that's not the only thing that would be precalc'ing ;)

As a reference, here's the same heightmap I linked earlier, rendered with L3DT. Notice how 8-bit precision is just not enough.
http://www.xTr1m.com/temp/8bit-terrain.png

added on the 2013-04-18 12:48:14 by xTr1m

bloodnok: My own implementation of fluvial erosion, like the algorithm described here: http://www.bundysoft.com/docs/doku.php?id=l3dt:algorithms:hf:erosion. I apply it at different resolutions of my terrain. I start at 8x8 with seed values, perform diamond-square upscaling N times, apply erosion M times, upscale again, erode again, and so on. In between come some plateaux.

added on the 2013-04-18 12:57:43 by xTr1m

pouët.net
