OpenGL framework for 1k intro

category: code [glöplog]
Hey folks, I decided to make a new thread for this to keep info in one location to make it easy to search and keep the other thread more on topic. Discussion about 1k openGL framework started at the end of this thread.

First up I have to say I am not an expert but want to share the things I have learned. Hopefully more experienced people can contribute on ways to improve. Some things I found out by experimenting and other things are simply a combination of all advice I have read on pouet bbs and other places on the net. So thanks must go to Mentor and Blueberry for tips on how to make code more compressible, thanks to IQ for his framework which got me started and sizecoding tips, thanks to Ferris, Las, Hitchhiker for sizecoding tips and Yzi for advice on converting C to assembly and optimizing for size.

Ok step one is to get an openGL window. There is only one way I know of to do this in C.
Code:const HDC hDC = GetDC(CreateWindow((LPCSTR)0xC018, 0, WS_POPUP | WS_VISIBLE | WS_MAXIMIZE, 0, 0, 0, 0, 0, 0, 0, 0)); SetPixelFormat(hDC, ChoosePixelFormat(hDC, &pfd), &pfd); wglMakeCurrent(hDC, wglCreateContext(hDC));

You can experiment with using C018 or C019 and see which compresses better. Then you want ShowCursor(NULL); somewhere, move it around your code until it gives the best compression result. For the pixel format descriptor I found I can get away with this:

Not sure how compatible this is but it works for me on win7.

Shortest way to make an openGL shader is like this:
Code:const int shaderID = ((PFNGLCREATESHADERPROGRAMVPROC)wglGetProcAddress("glCreateShaderProgramv"))(GL_FRAGMENT_SHADER, 1, &fragmentShaderCode);((PFNGLUSEPROGRAMPROC)wglGetProcAddress("glUseProgram"))(shaderID);

Then for your main loop you want it to be a "do/while" loop. This is the smallest way I found with your exit condition at the end. Minimum exit condition is "GetAsyncKeyState(VK_ESCAPE)" but you may also want to add time based exit.

In the loop you need this:
Code:glRects(-1,-1,1,1); SwapBuffers(hDC); PeekMessageA(0, 0, 0, 0, PM_REMOVE);

Again you can play around with moving the peekmessage anywhere in the loop to find where it compresses best.

Then of course your "ExitProcess(NULL);" is all you need to end.

That is the most basic way to get a fullscreen window with a fragment shader and hidden cursor. Play around with making data Static, sometimes it saves bytes and sometimes it gains bytes. Try each data item individually with and without static declaration.

Next thing you might want after this is time and sound. I will type up more about that soon.
added on the 2014-08-21 14:11:45 by drift drift
The shader init code failed. It should read this:

const int shaderID = ((PFNGLCREATESHADERPROGRAMVPROC)wglGetProcAddress("glCreateShaderProgramv"))(GL_FRAGMENT_SHADER, 1, &fragmentShader);
added on the 2014-08-21 14:14:02 by drift drift
Well fuck it idk what the problem is: glCreateShaderProgramv is obviously what it should say.
added on the 2014-08-21 14:15:00 by drift drift
Looks nice! A ready-made project with all the correct compiler/linker options would be nice. I've always found that to be the hardest part for me.
added on the 2014-08-21 14:19:47 by Preacher Preacher
try to add some spaces between long words; otherwise pouet does it for you and fucks up html entities :)
added on the 2014-08-21 14:20:37 by bartman bartman
github dat shit
added on the 2014-08-21 14:21:06 by Gargaj Gargaj
For time there are a few options. Smallest option is usually just grabbing a timer from "GetTickCount()". However I found that if you are already using the multimedia library (winmm.dll) for something else then using "timeGetTime()" was either the same or a tiny bit smaller.

Now you have two choices. One is to get the time before the loop and store it in a variable, then get the time again every loop and subtract the start time to get a count upwards from zero. The other is to calculate a time delta per frame. Depending on what you need the time for you can use either way, but for example to make an exit condition after a certain time you would need to add up the delta ticks to get a total time anyway.

If you are passing time to the shader you may want to divide it or multiply it by a small float to get a better timer frequency. You can either do this in C before you send it to the shader or do it in the shader code. Experiment to see which is smaller.
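A rough sketch of the two timing options above, not taken from any released intro. The helper names and the 0.001 scale factor are made up for illustration, and the tick values are assumed to come from GetTickCount() or timeGetTime():

```c
#include <stdint.h>

/* Option 1: milliseconds since a start tick captured before the loop.
   Unsigned subtraction stays correct even if the 32-bit counter wraps. */
static uint32_t elapsed_ms(uint32_t now, uint32_t start)
{
    return now - start;
}

/* Option 2: add each frame's delta to a running total. */
static uint32_t add_frame_delta(uint32_t total, uint32_t now, uint32_t previous)
{
    return total + (now - previous);
}

/* Scale milliseconds to seconds before passing the value to the shader.
   The factor is an assumption; pick whatever suits the shader. */
static float shader_time(uint32_t ms)
{
    return (float)ms * 0.001f;
}
```

Both options end up with the same total, so which one is smaller after compression is something to try out.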

Now you can pass time and any other variables to the shader with a couple of methods. Smallest way is to put it into the "glColor" variable then access that inside the shader. This works ok sometimes depending on what you need as far as precision. You may run into problems with either low precision or overflowing the variable and having it wrap around back to zero. Play around to see if it works ok for your situation. Other option of course is to pass time either as a float or an integer to a uniform:
Code:((PFNGLUNIFORM1FPROC)wglGetProcAddress("glUniform1f"))(((PFNGLGETUNIFORMLOCATIONARBPROC)wglGetProcAddress("glGetUniformLocation"))(ps, "shaderTimeVariable"), time);
added on the 2014-08-21 14:29:52 by drift drift
Ok yeah the code is being shitty. I wanted to write a general advice that people can adapt to their own situation. If I simply post my code it works for my very specific situation only. I'll post a visual studio project somewhere I guess.
added on the 2014-08-21 14:32:39 by drift drift
I guess I didn't want to do 100% of the work and only leave people to plug in their shader code. I think it is good also to read and understand something so you can change it and even improve it rather than use a framework without understanding how it works.

Anyway my last advice is about sound. You are very limited in what you can fit in 1k unless you have your own custom compressor. I only know two ways to get any decent-ish sound. One is the byte-beat technique, which there was a whole thread about on pouet.

The other way is to use MIDI, which to be honest I was surprised was an acceptable thing, but all the top 1k prods use it so I guess it's ok? You still can't do much music in 1k anyway. For midi you need to open the channel like this:
Code:static HMIDIOUT out; midiOutOpen(&out, 0, 0, 0, CALLBACK_NULL);

Then you use "midiOutShortMsg(out, DATA GOES HERE);" where you can send individual commands to change instruments and play notes and volume. You need to go look up what the codes are, there are plenty of resources on the internet. To do anything more than play 1 or 2 notes you need some sort of basic time based player with an array of data. I made a very basic player using an "if" with a time condition inside the main loop to step through an array. You won't get many notes unless you have a small shader or a custom compressor or something.

For bytebeat you want to make a data with a wave header then use the bytebeat method to fill the rest with the "music" and play it with "sndPlaySound((char*)&MusicData, SND_ASYNC | SND_MEMORY);"

Should be easy to find out how to do this but I can make an example if needed.
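A minimal sketch of that idea, assuming 8-bit mono PCM at 8000 Hz and four seconds of sound. The buffer name, the helper names and the formula are picked for illustration (the formula is just one well-known bytebeat-style expression), so treat this as a starting point and not as code from an actual 1k:

```c
#include <stdint.h>
#include <string.h>

#define SAMPLE_RATE 8000u
#define NUM_SAMPLES (SAMPLE_RATE * 4u)   /* 4 seconds, arbitrary */

/* 44-byte canonical WAV header followed by the samples. */
static uint8_t music_data[44 + NUM_SAMPLES];

static void put_le16(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
}

static void put_le32(uint8_t *p, uint32_t v)
{
    put_le16(p, v);
    put_le16(p + 2, v >> 16);
}

static void build_bytebeat_wav(void)
{
    uint8_t *h = music_data;
    uint32_t t;

    memcpy(h, "RIFF", 4);
    put_le32(h + 4, 36u + NUM_SAMPLES);   /* file size minus 8 */
    memcpy(h + 8, "WAVEfmt ", 8);
    put_le32(h + 16, 16u);                /* size of the fmt chunk */
    put_le16(h + 20, 1u);                 /* format 1 = PCM */
    put_le16(h + 22, 1u);                 /* mono */
    put_le32(h + 24, SAMPLE_RATE);
    put_le32(h + 28, SAMPLE_RATE);        /* bytes per second (1 byte per sample) */
    put_le16(h + 32, 1u);                 /* block align */
    put_le16(h + 34, 8u);                 /* bits per sample */
    memcpy(h + 36, "data", 4);
    put_le32(h + 40, NUM_SAMPLES);

    /* The bytebeat part: every sample is a function of its index t. */
    for (t = 0; t < NUM_SAMPLES; t++)
        h[44 + t] = (uint8_t)(t * ((t >> 12 | t >> 8) & 63u & t >> 4));
}
```

After build_bytebeat_wav() has run, music_data is what you would hand to sndPlaySound with SND_MEMORY. In a real 1k you would store the header as initialised data instead of building it with calls, this version is only meant to be readable.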
added on the 2014-08-21 14:43:47 by drift drift
Oh one other thing you can also have more options to pass shader variables if you have a vertex shader. But honestly if you don't need a vertex shader for any reason then you won't save any size if you have to put a vertex shader in just to get a slightly cheaper way to pass variables.
added on the 2014-08-21 14:50:05 by drift drift
I appreciate that you share your code :)

My 50 cent:
WS_MAXIMIZE is one of the most ugly ways to get a fullscreen window ;) and with glCreateShaderProgramv you often need the real resolution... So you either need calls to figure out the x/y resolution or use ChangeDisplaySettings.

You do not need glGetUniformLocation if you use layout qualifiers to set a uniform location from within the shader:
Code:layout (location = 16) uniform int samplePosition; allows you to use: glUniform1i(16, time);

To be able to use that you will need a "#version" pragma (e.g. "#version 430"). Not sure whether that's smaller but you can get rid of one more call + some strings with a little more shader code.
added on the 2014-08-21 14:56:20 by las las
Thanks for your input Las. You are one of the people I always see on here who is contributing to sharing knowledge and I already learned a lot from you. Having people more experienced than me sharing their methods is exactly one of the things I hoped for by starting this thread.

The way I posted for making the window is the smallest way I found. I agree with you it is not the best and many times you need to force a specific resolution anyway. I am not even sure if my method works on every system, I only know that it works for me.

The uniform location is excellent advice, I will have to experiment with that to see what size benefit it might have. Of course as you say you need to have a certain version of shader, I actually think it is version 4.4
added on the 2014-08-21 15:04:37 by drift drift
I was about to ask about sound, but you covered me. I was doing this so called bytebeat, no space for more, so I wonder how do they even have more interesting music? But MIDI, I never thought of that before. Now, I just need to read some resources on midi codes and try something :)
added on the 2014-08-21 15:33:42 by Optimus Optimus
I just googled and found everything about midi control codes, you only need a couple of codes really.

For example code C0 will change the instrument, so something like midiOutShortMsg(out, 0x48C0); will change to instrument or patch 0x48 (72 decimal).

Then you have 90 which plays a note, followed by the note number and the velocity, which is basically your note volume. So something like midiOutShortMsg(out, 0x007f6090) plays note 0x60 (96 decimal) at velocity 7F. Careful with hex here: note C-4 is 60 decimal, which would be 0x3C. MIDI data values only have a 0-127 decimal range.

So you can make a table of midi control data and step through it with some sort of time control. But I warn you don't expect much under 1k :D
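To make the byte order less error prone, here is a small sketch of how the 32-bit value for midiOutShortMsg is put together. The helper names are made up, only the layout (status in the lowest byte, then the two data bytes) comes from the API:

```c
#include <stdint.h>

/* Pack status, data1 and data2 into the value midiOutShortMsg expects:
   status in the lowest byte, data1 in the next byte, data2 above that. */
static uint32_t midi_msg(uint8_t status, uint8_t data1, uint8_t data2)
{
    return (uint32_t)status | ((uint32_t)data1 << 8) | ((uint32_t)data2 << 16);
}

/* 0xC0 = program change on the first channel, data1 = patch number. */
static uint32_t program_change(uint8_t patch)
{
    return midi_msg(0xC0, patch, 0);
}

/* 0x90 = note on for the first channel, data1 = note, data2 = velocity. */
static uint32_t note_on(uint8_t note, uint8_t velocity)
{
    return midi_msg(0x90, note, velocity);
}
```

program_change(0x48) and note_on(0x60, 0x7F) give the two values used in this post. In the intro itself you would of course just store the finished constants.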
added on the 2014-08-21 15:46:30 by drift drift
Hey, thanks! It's nice enough to get a start.
added on the 2014-08-21 18:02:36 by Optimus Optimus
Keep in mind that the control byte (LSB) consists of a command nibble (high nibble) and a channel nibble, i.e. in that C0 or 90 example of yours, "0" is actually the first MIDI channel.
Some random tips
- You really, really, do want to write all code in asm, because that's the only way to control the code's statistical properties, and you can for example reorder instructions and try how it affects compression efficiency.
- You can reuse the bytes of the PIXELFORMATDESCRIPTOR for other purposes, and the references can compress nicely because similar pointers/addresses are used in other places as well. Examples of such purposes are naturally any other variables you might have, like the MIDI out handle or frame counter
- If you place your music routine in the main routine directly after the ExitProcess call, then you can jump from the music routine directly into the ExitProcess call. This is nice for checking for "end of song, end of intro" condition

When it comes to music, I must say I haven't tried the "simple table of MIDI data + player" idea, I went straight to generating the MIDI data in code, because it seemed like the best thing to do - and most fun. The music routine in Stellar Driftwood's compo version takes 145 compressed bytes of code and 10 compressed bytes of data, excluding the MIDI out allocation. For the 1024-byte version that has an ending, I don't know the exact figure, because I changed the structure a little bit. I have no idea what it might be with a table, but at least I'm taking advantage of being able to run code, like for example calculating note velocity from "time modulo 3", and pitch sequence progression from "time modulo 4", which creates a nice long polyrhythmic pattern which doesn't repeat too quickly.
added on the 2014-08-21 19:25:20 by yzi yzi
Actually I wrote wrong, it's not note velocity it takes with modulo 3, it's octave shift. So I have multiply by 12 there... Anyway, I thought it sounded worth the bytes. The velocity thing I'm doing just with simple ANDing, "time modulo half-bar length", and of course all lengths are nice powers of two.
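In C the modulo idea could look roughly like this. It is only an illustration of the principle: the pitch table, the base note and the accent rule are placeholders and not the actual values from any of my intros:

```c
#include <stdint.h>

/* Placeholder four-note pitch sequence, in semitones above the base note. */
static const uint8_t pitch_sequence[4] = { 0, 3, 7, 10 };

/* Pitch comes from "step modulo 4", octave shift from "step modulo 3".
   Since 3 and 4 share no factors the result repeats only every 12 steps. */
static uint8_t note_at(uint32_t step)
{
    uint32_t octave_shift = 12u * (step % 3u);
    return (uint8_t)(48u + pitch_sequence[step % 4u] + octave_shift);
}

/* Velocity from a plain AND with a power-of-two length minus one:
   accent the first step of every group of 8 (assumed accent rule). */
static uint8_t velocity_at(uint32_t step)
{
    return (uint8_t)((step & 7u) ? 80u : 127u);
}
```

Adding a third sequence with another period makes the pattern longer again for very few bytes.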

For drum patterns, I have two basic ideas, either AND time with an appropriate number and output the drum sound conditionally if zero or not, or then encode the drum sequence in a bit pattern that you ROTate one bit at a time. For other note sequences, all my routines use eight-note sequences where notes are encoded in nibbles of a 32 bit number, which is rotated 4 bits at a time. Rotate bass, melody, chords, what have you, with different speeds/intervals, to create more combinations. In the Japan 1k intro, I had an additional table for the traditional Japanese 5-tone scale, for translating basically any random garbage into something remotely sensible.
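A sketch of those rotate tricks in C, just to show the principle. In asm each of these is basically a single ROL or AND instruction. The function names and the example patterns are invented for this post:

```c
#include <stdint.h>

/* 32-bit rotate left, n must be in the range 1..31. */
static uint32_t rotl32(uint32_t x, unsigned n)
{
    return (x << n) | (x >> (32u - n));
}

/* Drum idea 1: AND the step counter with a mask, hit when the result is zero. */
static int drum_on_beat(uint32_t step, uint32_t mask)
{
    return (step & mask) == 0;
}

/* Drum idea 2: rotate a bit pattern one bit per step, hit when the bit
   that wrapped around to the bottom is set. */
static int drum_from_pattern(uint32_t *pattern)
{
    *pattern = rotl32(*pattern, 1);
    return (int)(*pattern & 1u);
}

/* Eight notes stored as nibbles of one 32-bit number, rotated 4 bits per
   step. The top nibble is played first. */
static uint8_t next_note(uint32_t *sequence)
{
    *sequence = rotl32(*sequence, 4);
    return (uint8_t)(*sequence & 15u);
}
```

For example a sequence of 0x12345678 plays 1, 2, 3 and so on up to 8 and then starts over, and a drum pattern of 0x88888888 hits on every fourth step.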
added on the 2014-08-21 19:33:55 by yzi yzi
That's actually very interesting that your music is generated procedurally. I thought it sounded quite nice and structured so you obviously did a great job coming up with a formula that works well. I wondered how you got decent music in a small size.

I found the midi data compresses extremely well because it is generally very similar repeating patterns of numbers. But still it is quite limited to how much you can store in size, I think your idea might be better, I would be interested to try something like that myself and see how it compares. I am actually using a modulo and some other operator tricks against my timer already for playback since I don't actually store time data just note data.

Agree with assembly code, even since my post in the other thread of C vs Assembly I have saved even more bytes with optimisation. I think my code size is more than 20% smaller in assembly compared to C.

Of course then the shader size optimising is a whole discussion in itself.
added on the 2014-08-21 20:01:00 by drift drift
I'll have to try the data-heavy approach as well. Until recently I had no idea that a non-procedural approach would be possible at all. I did try some small tables of something, but it always just added dozens of bytes. Anyway, part of my problem is that I'd like it to sound like "music", with melody, chord progression etc., not just some sub-genre of electronic music. (the same applies to my music compo entries btw)
added on the 2014-08-21 21:31:55 by yzi yzi
Looks nice! A ready-made project with all the correct compiler/linker options would be nice. I've always found that to be the hardest part for me.

Same, I would love a ready made project.

I have the ideas for a simple shader based 1k (would love to code a proper 1k), the issue is just this.
added on the 2014-08-21 21:38:24 by mudlord mudlord
+1 for ready project, preferably linux/windows/osx
added on the 2014-08-22 09:42:33 by visy visy
Some things I learned from coding Dystaxia :

Crinkler + openGL is very close to 1KPack + DirectX. 1kPack compresses your program as a lossless image and uses DirectX to decompress. This means you link to DirectX for free. I eventually used 1KPack as it was smaller, but that was before I knew the glCreateShaderProgramv call.

DirectX doesn't seem to need the PeekMessage call to avoid the "your program is not responding" cursor. I guess some DirectX call checks the messagequeue internally?

Working in assembler is a must, yes. You can f.e. push arguments for many calls on the stack first, then do several calls. This compresses better. With 1KPack, I also used some known register values from after the decompression to avoid storing my own values. That's why I think a ready-made project is a starting point at most, you have to hack it to fit your code optimally.

We used a MIDI track from a musician, not generated. Some hints:
- make sure you and your musician agree on the terms you use. Is a "pattern" the note data for 1 instrument, or for all? (The music was a good 100 bytes too big after compression, leading to more ugly hacks than I'm willing to share)
- Midi note off can be done with 0x80, or with 0x90 (note on) with velocity set to 0. The second compressed better in our case.
- storing tracks/patterns/whatever and a table with which pattern should play when is a waste of space. Just unroll all your notes in one big stream and let 1KPack/Crinkler handle it. With the simplified player code, it's smaller this way. We used 4 instruments with 2048 notes in total, most 0 of course.
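A sketch of what such an unrolled stream and the simplified player logic could look like in C. The interleaved layout (one byte per instrument per step, 0 meaning nothing to play) and all names are assumptions for illustration, not the actual Dystaxia data format:

```c
#include <stdint.h>

#define NUM_INSTRUMENTS 4u

/* Hypothetical stream: 2 steps x 4 instruments, 0 = nothing to play. */
static const uint8_t demo_stream[2 * NUM_INSTRUMENTS] = {
    60, 0,  0, 0,
     0, 0, 64, 0
};

/* Note on (0x90) at full velocity, instrument number used as MIDI channel
   (channel must be below 16). */
static uint32_t note_on_msg(uint8_t note, uint8_t channel)
{
    return 0x007F0090u | channel | ((uint32_t)note << 8);
}

/* Note off written as note on with velocity 0, so only 0x90 is ever used. */
static uint32_t note_off_msg(uint8_t note, uint8_t channel)
{
    return 0x00000090u | channel | ((uint32_t)note << 8);
}

/* Value to pass to midiOutShortMsg for one instrument at one step,
   or 0 when that slot of the stream is empty. */
static uint32_t event_for(const uint8_t *stream, uint32_t step, uint8_t instrument)
{
    uint8_t note = stream[step * NUM_INSTRUMENTS + instrument];
    return note ? note_on_msg(note, instrument) : 0u;
}
```

The player then only has to walk the stream once per step and send whatever is non-zero. The long runs of zeros cost very little after compression.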

I hope this helps
added on the 2014-08-22 09:59:15 by Seven Seven
Here is my 5 cents to the discussion fwiw. Some stuff learned by doing 1k's

Indeed working with asm is practically a must, since the code has to be crafted to be as repetitive as possible to make it compress well. Also, we noticed that x86 code is still the enemy of compression, so it is better to avoid writing x86 code as much as possible.

We noticed that doing MIDI completely by playing a MIDI file was the best approach. If you have code which does midi-events from some table, you have stuff both in code and data segments which the compressor needs to "learn". (I believe yzi's all-in-code approach is really good as well. At least it sounds awesome)

In 1k it does not really matter if you have fancy compressor like crinkler or simple LZW/gzip, difference is not that much. Real enemy is the static overhead which comes from executable headers, decompressor, lib import, hashes, gl setup and such. In theory here linux/freebsd could shine since they have lzma installed by default plus they have SDL.

Some obvious (maybe not to everyone) stuff we have noticed:

- Shell dropping is not dead, at least not on *nix platforms. Our overhead for compression is 42 bytes. You just can't write a decent decompressor in that space. Also, with a shell dropper you can compress the executable headers

- It is better to unroll loops where count is fixed and small. jumps, counters do not compress as well as two (almost) identical blocks.

- Same thing actually applies for the shader as well. Shader minifier is not very useful in 1k shaders. (our shader actually expands if we run it through shader minifier). It is all about repetitive structures, shadowing variable names when not needed, moving variable declarations out of the main and funcs so that they can be done in a single place. Also, thinking what glsl builtins to use. (I was so happy when I got away with last call to normalize()) And basically all sorts of stuff you can expect to find from obfuscated code...

- Fudging around the code is the most important bit: changing code here and there to see whether it compresses better or not. Some combinations are really counterintuitive. We have enabled compression statistics in our tools that count bits instead of bytes, so we can better optimize gains of less than a byte.

+1 for ready project, preferably linux/windows/osx

I've been thinking about sharing our osx framework. On the one hand that would allow more people to participate in 1k compos, but on the other hand I'm afraid that sharing a "drop shader here, put midifile there" framework would halt all the development seen in 1k's now: all the crazy ideas would come to a halt and future 1k's would be copies of each other. So I'm not yet convinced which way to go...
added on the 2014-08-22 13:53:47 by ts ts
I am also hesitant about people publishing "demomakers", because I think not nearly all good ways of structuring a 1k have been discovered yet. I mean, if I had known that "everybody" is using table data based MIDI music, I don't think I would have spent so much time trying out different ways to make it procedurally.
added on the 2014-08-22 18:15:33 by yzi yzi