ferris information 3285 glöps
- general:
- level: user
- personal:
- first name: Jake
- last name: Taylor
- portals:
- csdb: profile
- slengpung: pictures
- demozoo: profile
- cdcs:
- cdc #1: aether by mfx [web]
- cdc #2: Tsunami by Booze Design
- cdc #3: Aesterozoa by Kewlers [web]
- cdc #4: regus ademordna by Excess [web]
- cdc #5: Waillee by Prismbeings
- cdc #6: ▶ by Ümlaüt Design [web]
- intro Commodore 64 Triggered by Triad [web]
- the music isn't really my style but I like the look and overall package
- rulez, added on the 2020-04-02 16:52:06
- demotool Amiga AGA FixGift by Scarab [web]
- nice one
- rulez, added on the 2020-03-31 12:58:16
- demotool Windows squishy by Logicoma [web]
- Quote:
I guess you are using all that SIMD for ANS with some LZ77 using suffix sorts or something.
Actually, all that SIMD is for mixing predictions. Similar to later PAQ/kkrunchy there's a two-stage mixer that mixes the 72 (data section) or 135 (code section) context model predictions with 7 different weight sets (selected by additional contexts) in the first stage, and then it mixes those mixed outputs in a second stage with a single weight set. Both stages are processed 8 elements at a time; that's where the SIMD is (and I think the newest instruction I used was phaddw, which is just SSSE3; I did experiments with both more lanes and more precision and neither gave compelling results).
As for ANS, the main benefit there is, as you (indirectly) suggest, that you can mix different bitstreams naturally and be able to (de)code them in parallel. However this relies on the model being able to keep up (typically by updating more sparsely, and predictions need to be parallelized as well), and with a model this heavy (and certainly memory bound), there's no way we're doing any of it in parallel, so then we're left with a single stream. In that sense, ANS doesn't offer anything compelling at face value (though I did compare both an ANS decoder as well as the arithmetic one I went with just in case it had better precision or something, but neither really outperformed the other, and the arithmetic decoder was one instruction smaller and didn't require encoding the data backwards like ANS does!). And the whole multiple streams thing is a speed optimization that you typically trade some compression ratio for and, to my knowledge, none of the "max ratio" compressors are doing that simply because you get better compression performance with per-symbol model updates anyways (and we can afford this because we're only working with a few hundred KB of actual data here).
So yeah, no LZ/ANS shenanigans this time around (I have a C64 4k packer that actually does use that stuff if anybody wants to give it a test run!), just straight-up big, beefy context mixing. :)
I'll have many more gory details in my talk at Revision in a couple of weeks btw - check the schedule/timetable for the time as it gets closer :)
- isok, added on the 2020-03-31 10:44:42
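For readers unfamiliar with the two-stage mixing described in the comment above, here is a minimal sketch of PAQ-style logistic mixing in Python. It is not squishy's actual code: the class names, the learning rate, the floating-point arithmetic, the initial weights and the way contexts index the weight sets are all assumptions made for illustration. The real thing mixes 72 or 135 predictions through 7 weight sets using SIMD, 8 elements at a time.

```python
import math

def clamp(p, eps=1e-6):
    # Keep probabilities away from 0 and 1 so stretch() stays finite.
    return min(1.0 - eps, max(eps, p))

def stretch(p):
    # Logit: maps a probability to the logistic domain.
    return math.log(p / (1.0 - p))

def squash(x):
    # Inverse of stretch, with the input clamped to avoid overflow.
    x = max(-30.0, min(30.0, x))
    return 1.0 / (1.0 + math.exp(-x))

class Mixer:
    """One mixer holding several weight sets; a context selects the set (assumed layout)."""
    def __init__(self, n_inputs, n_contexts, lr=0.02):
        self.weights = [[1.0 / n_inputs] * n_inputs for _ in range(n_contexts)]
        self.lr = lr  # hypothetical learning rate

    def mix(self, stretched, ctx):
        # Remember inputs and context so update() can adjust the same weight set.
        self.inputs, self.ctx = stretched, ctx
        w = self.weights[ctx]
        self.p = squash(sum(wi * xi for wi, xi in zip(w, stretched)))
        return self.p

    def update(self, bit):
        # Gradient step that reduces the coding cost of the bit just seen.
        err = bit - self.p
        w = self.weights[self.ctx]
        for i, x in enumerate(self.inputs):
            w[i] += self.lr * err * x

class TwoStageMixer:
    """Stage 1: several context-selected mixers. Stage 2: one weight set over their outputs."""
    def __init__(self, n_models, ctx_sizes, lr=0.02):
        self.stage1 = [Mixer(n_models, size, lr) for size in ctx_sizes]
        self.stage2 = Mixer(len(ctx_sizes), 1, lr)

    def predict(self, probs, ctxs):
        xs = [stretch(clamp(p)) for p in probs]
        mids = [m.mix(xs, c) for m, c in zip(self.stage1, ctxs)]
        return self.stage2.mix([stretch(clamp(p)) for p in mids], 0)

    def update(self, bit):
        for m in self.stage1:
            m.update(bit)
        self.stage2.update(bit)
```

Each coded bit calls `predict` once with the model probabilities and the selecting contexts, then `update` with the actual bit. That per-bit update is the "per-symbol model update" the comment refers to.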
- demotool Windows squishy by Logicoma [web]
- wysiwtf: cheers, thanks for clarifying. Yeah the already-compressed exe's won't work due to header packing already done by other packers, though I'll add a ticket for reporting a proper error in that case. As for the other exe's, the above comment applies :) thanks for testing!
- isok, added on the 2020-03-31 09:59:45
- demotool Windows squishy by Logicoma [web]
- Also "didnt do anything at all" is surprising - I would at least expect some kind of panic. Feel free to send me some test exe's (to yupferris/gmail) if you think I should have a look.
- isok, added on the 2020-03-30 16:54:00
- demotool Windows squishy by Logicoma [web]
- I've added a ticket to add a check/error for dynamic base address btw.
- isok, added on the 2020-03-30 16:48:16
- demotool Windows squishy by Logicoma [web]
- wysiwtf: I searched "too large" in the repo and the only error I can find is this: "Compressed size too large; can't adjust image base to make room for compressed image." Is that the error you're seeing?
You'd need quite a large compressed image and/or a low base address (it should probably be mentioned that you'll need a fixed base address, e.g. `/DYNAMICBASE:NO` for MSVC) for that error to occur. The unpacker/headers/etc. are placed beneath the original image in memory, and if the compressed size is larger than the image base, that's not possible. The maximum uncompressed file size can be quite large, but if the compressed image is much more than 64k I wouldn't make any promises. It's probably safe up to 128k judging by the adjustment code but I haven't tested it very rigorously. It's quite likely that (de)compression is going to be horribly slow at this size anyways.
- isok, added on the 2020-03-30 16:43:45
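The constraint described in the comment above can be illustrated with a small check. This is a hedged sketch, not squishy's adjustment code: the function name `lowered_image_base` and the 64 KiB alignment default are assumptions for illustration, and only the error text is taken from the comment. A real packer would have further constraints, such as never ending up with a base of zero.

```python
def lowered_image_base(image_base, compressed_size, alignment=0x10000):
    """Return a lower base that leaves room for the compressed image beneath
    the original one (hypothetical helper; alignment is an assumption)."""
    if compressed_size > image_base:
        # Not enough address space below the original image.
        raise ValueError(
            "Compressed size too large; can't adjust image base "
            "to make room for compressed image."
        )
    # Round down so at least compressed_size bytes fit below image_base.
    return (image_base - compressed_size) // alignment * alignment
```

With the common default base of `0x400000` there is plenty of room for a compressed image in the 64k to 128k range, which is consistent with the comment that the error needs a large compressed image and/or a low base address.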
- demotool Windows squishy by Logicoma [web]
- hitchhikr: cheers, thanks for testing further. There's likely still bugs tho, so if you find any lmk!
- isok, added on the 2020-03-30 00:12:44
- demotool Windows squishy by Logicoma [web]
- hitchhikr: bug reports welcome, especially with more detailed info than that :)
- isok, added on the 2020-03-29 22:08:22
- demotool Windows squishy by Logicoma [web]
- been a long time coming :) feel free to contact if you have any issues/questions (especially this close to revision!)
- isok, added on the 2020-03-29 21:32:27
account created on the 2005-11-23 21:48:43
