
Interesting paper for sizecoders

category: general [glöplog]
@introspec: thank you very much for the link!
Spectrum BASIC calculator code (RST#28) proved to be more compact than the Z80 code at least in Snake121. Was Java bytecode checked? There are CPUs for it.
Quote:
Spectrum BASIC calculator code (RST#28) proved to be more compact than the Z80 code at least in Snake121


Who would like a tiny intro that uses only the calculator? :)
added on the 2015-07-19 11:52:44 by g0blinish
Quote:
anyway, packing a tiny intro doesn't make sense.

Can't agree: on ARM each instruction is 4 bytes, with a very rigid opcode organization. Tony Haines found a way to compress code during a 1k coding competition, the idea was good enough that Pervect/Topix improved the compression tool, and everybody used it in later editions.

On 68k I used LZ77 (42-byte depacker) for a 480-byte demo, and I also think the limit is probably around 256 byters.
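For readers who have never written one, here is a minimal LZ77 depacker sketch in C. The token format (a flag byte selecting literals or offset/length pairs) is assumed purely for illustration; it is not the actual 42-byte 68k routine:

#include <stdint.h>
#include <stddef.h>

/* Minimal LZ77 depacker sketch. Assumed token format: a flag byte gives
   8 tokens, LSB first; bit set = copy one literal byte, bit clear = a
   back-reference stored as one offset byte (offset-1, so 1..256) and
   one length byte (length-3, so 3..258). Writes exactly dst_len bytes;
   src is trusted, as is usual for intro depackers. */
static void lz77_depack(const uint8_t *src, uint8_t *dst, size_t dst_len)
{
    uint8_t *end = dst + dst_len;
    while (dst < end) {
        uint8_t flags = *src++;
        for (int bit = 0; bit < 8 && dst < end; ++bit, flags >>= 1) {
            if (flags & 1) {
                *dst++ = *src++;                  /* literal byte */
            } else {
                size_t off = (size_t)*src++ + 1;  /* back-reference */
                size_t len = (size_t)*src++ + 3;
                while (len-- && dst < end) {
                    *dst = dst[-(ptrdiff_t)off];  /* copy from history */
                    ++dst;
                }
            }
        }
    }
}

The whole inner loop is a handful of instructions on most CPUs, which is why such depackers fit in a few dozen bytes of 68k code.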
added on the 2015-07-19 15:51:48 by baah
My area of interest here is 1k size and modern platforms, with visuals done in GLSL shaders. I'm sure you can imagine that shader source code compresses pretty well. ;)
added on the 2015-07-19 16:03:58 by yzi
RISC OS might be an interesting 1k platform, if you could get GLSL shaders and a General MIDI style OS-provided synth running easily. Those APIs are fairly lightweight on Windows.
added on the 2015-07-19 16:07:43 by yzi
What I encountered was like:

4k -> Lots of different code; having a MACRO for the mainloop, with inputs for all the differing values, is usually a few bytes smaller than having full code for every mainloop.
( ->
#define mainloop(A,B,C,D)        \
float4 A(float2 texCoord)        \
{                                \
    float3 lightPosition = B;    \
    float3 cameraPosition = C;   \
    float curtCoolOfDepth = D;   \
                                 \
    /* ...rest of mainloop... */ \
}
)
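Each part then collapses to a one-liner; the part names and values below are made up for illustration:

mainloop(part1, float3(0.0, 4.0, -2.0), float3(0.0, 1.0, 6.0), 1.5)
mainloop(part2, float3(2.0, 1.0, -4.0), float3(0.0, 0.5, 8.0), 0.7)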

8k -> Crinkler likes to have a lot of repeated code! Just have your mainloops one after another; don't worry, Crinkler will have lots of fun crunching it away!
The small differences will still count towards the final file size... but not as much as having the macro + definitions for all of your parts ;) (I am talking about 10+ parts here! Although it worked out this way at 7 parts already! I used it in one 4k with 6 parts as well; it gained me about 45 bytes compared to macro mainloops, a special case, though!)

So, my advice is:
Read LESS papers, COMPARE your Crinklered outcomes yourselves! ;)

Thanks for trying to help, though! ;) (now going to read the answers in this thread so far, I have only read the initial post yet! :) )
OK, now that I've realized this thread is about tiny intros, I feel a bit stupid for posting prematurely...
...but I think even my input above helps: I wouldn't use a packer for anything smaller than 1k... the bytes saved by crunching won't ever exceed what handcrafted reductions can achieve! ;)
Maybe in 512 byters, but that's still a maybe, and only worth a try if you have a version <512b already -> happy pimping ahead, if successful! ;) Or if your 512b is more like 550b... give it a try, but I predict it won't work out... as others said, and as even my post above describes: you would have needed to code cruncher-friendly in the first place! :p
I once exchanged two lines in my shader code, and it didn't matter at all where they sat in the definition code block...
...Crinkler gave me 37 bytes back! :D
I of course tried to do this again and again afterwards, almost every time I reached a point where I just couldn't optimize anything else by hand anymore (we are still talking 4k, sorry!), but in 97% of cases the results were like 0 to 25 bytes regained! (the other 3% were a loss of at best 6 bytes, though I had one more lucky hit at some point: 28 bytes!) ;)
But I gave up on getting lucky! Since I started coding 8ks I try to have a lot of repetition
in my code, even more than for 4ks; I feel good if I can have another code block identical to a previous one, with a small difference at the beginning or the end! ;) :D

All this said: SizeCoding is about keeping it small, but sometimes the cruncher is all that makes your prod small enough in the end! ;)
Here's the idea behind the ARM codepressor by Pervect/Topix and Tony Haines:
Quote:
So, how does it work? Well, the basic algorithm is, for each word, to look
back in a range of 256 words and find the word which has the most identical
nibbles. Identical to the nibbles in the current word, of course.

You can then store a byte indicating which word had the most identical
nibbles, called the distance, and another byte in which each bit indicates
whether the corresponding nibble is identical or not. After that you only
store the non-matching nibbles.

So, if the best matching word has 6 identical nibbles, then that could be
stored as 8 + 8 + (8 - 6) * 4 bits = 16 + 8 = 24 bits, instead of 32.
If not enough nibbles match, then you store one byte indicating an
uncompressed word followed by the 8 nibbles making up that word.

As you might have figured out by now the above only generates real
compression for words whose best matching word has at least 5 identical
nibbles. Still, this results in a good overall compression as a lot of "best"
matches have 4, 5, 6, 7 or even 8 identical nibbles.

However, on closer inspection of all the data that was being generated by
this algorithm I discovered a few things:

- a lot of best matching words have a distance of 16 (words) or less
- in lots of cases the identical nibbles included the 4 top nibbles

So, instead of always using 8 bits to store the distance I now encode both
the distance and nibbles-mask as follows:

%0000 0000 - %0000 1111 -> %0 0000 - %0 1111
%xxxx 0000 - %xxxx 1111 -> %1 xxxx 0000 - %1 xxxx 1111

Where xxxx <> 0. So, if only the bottom 4 bits are set then a 0 bit followed
by the lower 4 bits is stored, otherwise a 1 bit followed by all 8 bits.

This gave the compression ratio an extra boost :)

Furthermore, the algorithm really only works well on code. So, if there are
one or more data blocks in the program that you wish to compress then it's
very likely that most words in those blocks cannot be compressed by the
algorithm and thus must be stored as "uncompressed words". However, as I
already said, an uncompressed word is stored by first storing a byte
indicating an uncompressed word and then the uncompressed word. If there are
lots of uncompressible words then this marker per word would seriously
increase the resulting file.

So, I added code to detect runs, or sequences, of uncompressible words. It
then stores a marker, followed by a count. After that, all the uncompressed
words follow without any further markers.
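To make the scheme concrete, here is a small C sketch of the nibble-matching core as described above. This is a reconstruction from the quote, not the original tool; in particular, storing the distance as distance-1 is an assumption:

#include <stdint.h>

/* Count how many of the 8 nibbles of a and b are identical at the same
   positions, and build a mask with one bit per matching nibble. */
static int nibble_match(uint32_t a, uint32_t b, uint8_t *mask)
{
    int count = 0;
    *mask = 0;
    for (int i = 0; i < 8; ++i) {
        if (((a >> (4 * i)) & 0xF) == ((b >> (4 * i)) & 0xF)) {
            *mask |= (uint8_t)(1u << i);
            ++count;
        }
    }
    return count;
}

/* Find the best match for words[pos] among up to 256 preceding words.
   Returns the number of identical nibbles (-1 if pos == 0) and fills in
   the distance and nibble mask. */
static int best_match(const uint32_t *words, int pos, int *dist, uint8_t *mask)
{
    int best = -1;
    int lookback = pos < 256 ? pos : 256;
    for (int d = 1; d <= lookback; ++d) {
        uint8_t m;
        int c = nibble_match(words[pos], words[pos - d], &m);
        if (c > best) { best = c; *dist = d; *mask = m; }
    }
    return best;
}

/* Cost in bits of encoding a word as a match, with the refined distance
   coding from the quote: 5 bits if distance-1 fits in 4 bits, else 9. */
static int match_cost_bits(int dist, int matched)
{
    int dist_bits = (dist - 1) < 16 ? 5 : 9;
    return dist_bits + 8 + (8 - matched) * 4; /* distance + mask + nibbles */
}

With the original 8-bit distance this only beats the 32 bits of a raw word when at least 5 nibbles match (8 + 8 + 3*4 = 28 < 32), which is exactly the threshold the quote mentions.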
added on the 2015-07-19 19:53:03 by baah
Thanks Baah for the nice find!
Let's suppose we have one-byte instructions with a variable number of byte parameters. What byte size can give the densest code? What instruction set would it be? Right now I'm struggling with a 4-bit code; it definitely requires more instructions per "big operation" than usual. What about 5-bit codes? (I don't mean Huffman codes; they would certainly be denser, and they might reduce your memory footprint if you don't care about speed.)
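To make the 4-bit case concrete, here is a toy nibble-coded accumulator VM in C; the opcode set and operand rules are invented purely for illustration:

#include <stdint.h>

/* Toy 4-bit (nibble-coded) accumulator VM: at most 16 opcodes, fetched
   one nibble at a time, high nibble of each byte first. */
enum {
    OP_HALT = 0, /* stop, result in acc            */
    OP_LDI,      /* acc = next nibble              */
    OP_INS,      /* acc = (acc << 4) | next nibble */
    OP_ADD,      /* acc += b                       */
    OP_SUB,      /* acc -= b                       */
    OP_SWAP      /* exchange acc and b             */
};

static int fetch_nibble(const uint8_t *code, int *pc)
{
    int n = (code[*pc >> 1] >> ((*pc & 1) ? 0 : 4)) & 0xF;
    ++*pc;
    return n;
}

static int run(const uint8_t *code)
{
    int acc = 0, b = 0, pc = 0; /* pc counts nibbles, not bytes */
    for (;;) {
        switch (fetch_nibble(code, &pc)) {
        case OP_HALT: return acc;
        case OP_LDI:  acc = fetch_nibble(code, &pc);              break;
        case OP_INS:  acc = (acc << 4) | fetch_nibble(code, &pc); break;
        case OP_ADD:  acc += b;                                   break;
        case OP_SUB:  acc -= b;                                   break;
        case OP_SWAP: { int t = acc; acc = b; b = t; }            break;
        default:      return -1; /* unassigned opcodes */
        }
    }
}

/* "Load 0x2A, halt" packs into 3 bytes: { 0x12, 0x2A, 0x00 }
   = LDI 2, INS A, HALT. */

Loading an 8-bit constant already costs four nibbles here (LDI then INS), the same two bytes a one-byte-opcode machine would spend on opcode + imm8, so the density win has to come from the operand-less instructions.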
Regarding the first post: yes, of course 8-bit CPUs, and also 8080 derivatives that kept legacy instructions, first with CISC and then with RISC once they realized it was better, are high on the list. To fit simpler things in a fixed small space, accumulator-based, non-evolved CPUs do better. Who knew!? No need to read the paper or evaluate anything.
added on the 2015-07-22 00:09:23 by Photon
But it's nice to know the next time I'm choosing a platform for my Windows intro.
added on the 2015-07-22 09:04:14 by yzi
http://www.researchgate.net/profile/Fernando_Berzal/publication/220031827_Mining_transposed_motifs_in_music/links/0fcfd506997c259783000000.pdf

found this while I was searching for a musical motif database..
a list of all known motifs by creator..
anyway, I noticed some similarities with the OP's paper..
added on the 2015-07-26 16:18:33 by 1in10
