pouët.net

Getting into 4k demo making

category: code [glöplog]
I want to get into making 4k demos with d3d. I wanted to use d3d12 but I heard people prefer d3d9. I wonder what's the reason for that and which one is better to use as a beginner to d3d and any kind of gpu programming.
added on the 2021-08-30 16:01:09 by 66Gramms 66Gramms
I use OpenGL :) I think it's less trouble in the long run
added on the 2021-08-30 16:03:35 by NR4 NR4
D3D9 sounds way too old at this point.
If you're reluctant to use DX12, at least use DX11.
added on the 2021-08-30 16:16:05 by Zavie Zavie
Is the basic idea of 1/4/8/64k intro _for Linux/UNIX_ explained somewhere? Like is there some kind of API or what and how the code interacts with it?
There is a lot of stuff here: https://in4k.github.io/wiki/linux

- I used libx11 for window + context, libasound for sound output and the usual opengl / glsl / minify for the content. Packing with upx is viable.
added on the 2021-08-30 19:30:49 by NR4 NR4
Doing things in x86 machine code is a waste of bytes, because it compresses so badly. Do everything you can in higher-level languages such as GLSL, which compresses really well and has more expressiveness per byte. You want to have as few API calls as possible. Compofiller Studio's 4k template should be a good starting point for OpenGL, if that's OK.
added on the 2021-08-30 19:38:36 by yzi yzi
Quote:
There is a lot of stuff here: https://in4k.github.io/wiki/linux

- I used libx11 for window + context, libasound for sound output and the usual opengl / glsl / minify for the content. Packing with upx is viable.


upx sucks, it's outperformed by a lot by plain lzma from xz-utils. use vondehi (+ the python script for bruteforcing params) and you're good to go, it's been used in a number of prods. for 1k/4k stuff, it's probably better to use oneKpaq, it's too slow for larger size limits though

libx11/xlib and libxcb are horrible to use and also cause a large code size (as you also have to manually initialize glx or egl), use GTK instead, or SDL when allowed in the compo rules. if you code these up manually in assembly (by having it in a shape that will compress well), and then use LTO etc. a lot for the CPU-side stuff of the demo, you can get these sizes down quite a bit more than in that repo

https://linux.weeaboo.software/Home at least used to be a bit more up-to-date than the in4k article (which ps asked me to update because at that point it was >10y out of date). or join #lsc on IRCnet, or ask around in #demomaking on the discord

as for audio, you'd probably either have to code your own synth, or do some ugly hacks to get the classical intro synths to work:
* the classical 4k intro synths are written in 32-bit x86 assembly
* Mesa or nv (the OpenGL drivers) are only available as dynamic libraries, it's not really feasible to link them in statically
* by now, dynamic libraries are only available as 64-bit versions by default on distros (and compomachines)
therefore, you need some way to run 32-bit code in a 64-bit intro. what I did was embedding another 32-bit ELF in the main one, purely for audio. it's horrible but it worked

I also have a linux wavesabre fork, though it is falling behind the main repo a bit by now

to get around the dynamic linking overhead, use smol (prods using it), it's getting pretty good lately imo

many recent intros (eg. blackle's, my one intro at evoke 2019, and some others) come with source code, so you can have a look at that, too

anyway, have fun!
added on the 2021-08-30 21:06:53 by porocyon porocyon
Compilers and linkers are overrated. Hand-crafted elf binaries with overlapping headers is where all the fun at (I did a few of Linux intros {128b..4k} 10 years {gasp!} ago).
added on the 2021-08-31 05:26:03 by provod provod
Sointu (my fork of 4klang) supports 64-bit Linux out of the box if that's the issue. Not sure how 64-bit code compares to 32-bit code when compressed. Also the tooling runs on linux, but don't expect a... user experience.
added on the 2021-08-31 06:25:18 by pestis pestis
Quote:
Compilers and linkers are overrated. Hand-crafted elf binaries with overlapping headers is where all the fun at (I did a few of Linux intros {128b..4k} 10 years {gasp!} ago).


smol and vondehi do that part already for you (see eg. this), you can always optimize these if you want though (haven't found the time to do this yet for oneKpaq, hint hint)

Quote:
Sointu (my fork of 4klang) supports 64-bit Linux out of the box if that's the issue. Not sure how 64-bit code compares to 32-bit code when compressed. Also the tooling runs on linux, but don't expect a... user experience.


oh yeah, right, sorry I forgot
added on the 2021-08-31 14:11:10 by porocyon porocyon
very helpful discussion so far... initial question: dx9 or dx12... 6 posts on linux... good going pouet! :D
added on the 2021-08-31 15:16:49 by maali maali
(the first and second replies already had useful answers)
added on the 2021-08-31 17:17:58 by porocyon porocyon
My relatively limited experience:

DX11:
- Well documented
- Learning curve is a bit daunting if you're not familiar with the render pipeline
- Easy to debug, tools are mature
- Suited for larger projects, OOP system gives better sense of ownership etc
- Rarely crashes; if it does it's usually obvious that it was your fault.
- Shader compiler relatively slow, needs to be bundled as a separate DLL, but not driver-dependent
- Not easy to size-optimize; there is only one import, but the OOP overhead takes that advantage away

OpenGL:
- Not well documented
- Relatively easy to get stuff on screen
- No great development environment; the best debugging tool is RenderDoc which is unofficial
- Relationship / ownership between states / objects not obvious, everything is opaque handles, no enums, have to use some sort of extension loader to get access to modern stuff. The state machine nature makes it hard to know what's wrong.
- Can get it to crash by just seemingly doing things in the wrong order; since you don't see any pointers on the surface, hard to understand what you did wrong.
- Shader compiler pretty much instantaneous, but can depend on driver whether it eats your shader or not.
- Functional interface makes it easy to size optimize since they're just calls, no objects or anything.
added on the 2021-08-31 17:34:46 by Gargaj Gargaj
The smallest way to run a fullscreen shader is with a D3D11 compute shader.

You can get some inspiration from TinyDX11, though it will need a few changes to be completely applicable. Primarily, D3DX is typically not allowed in 4k compos nowadays, so at least two changes are needed:

- Use D3DCompile (from d3dcompiler_47.dll) instead of D3DX11CompileFromMemory
- Remove the "superdirty" hack to get __uuidof( ID3D11Texture2D )

And of course never, ever use GetTickCount (or timeGetTime) for the effect time in a demo. Use the sample timer from the music playback.
added on the 2021-08-31 20:29:07 by Blueberry Blueberry
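A minimal sketch of the "sample timer" advice above, assuming the synth or audio callback exposes a running count of the samples it has played. The names `demo_time_seconds` and `samples_played` are hypothetical, and 44100 Hz is only an example rate:

```c
#include <assert.h>
#include <stdint.h>

/* Convert the number of audio samples already played into demo time
   in seconds.
   samples_played: hypothetical counter maintained by the music player.
   sample_rate:    playback rate in Hz (for example 44100). */
double demo_time_seconds(uint64_t samples_played, uint32_t sample_rate)
{
    assert(sample_rate > 0);
    return (double)samples_played / (double)sample_rate;
}
```

Each frame would read the current sample counter, convert it like this, and pass the result to the shader as its time value, so the visuals follow the music clock rather than a separate system timer.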
I just updated TinyDX11 to better reflect the current state of windows and compo rules (making it grow some bytes..)
added on the 2021-09-04 15:51:44 by Psycho Psycho
Quote:
And of course never, ever use GetTickCount (or timeGetTime) for the effect time in a demo. Use the sample timer from the music playback.

ah, those times of while(exit) { doDemo(time); time++; waitVblank(); }
:D
actually you can use timeGetTime() when you're not syncing to music yet, but after you got the sound, please no.
added on the 2021-09-04 17:53:04 by wbcbz7 wbcbz7
It gets worse: If you're using DirectSound, do not use GetCurrentPosition() for timing. DirectSound has been just a wrapper API for some time now, and in recent Win10 versions it's not sample accurate anymore (no idea when that started but I get a 100% repro rate here over all machines I tried - old FR intros/demos decidedly stutter a bit now).

So actually, as long as you don't use one of the current audio APIs, I'd advocate for timeSetInterval/timeGetTime or better QueryPerformanceCounter/-Frequency for timing the demo now. Over a couple minutes this should not diverge from sample time too much.
added on the 2021-09-04 21:26:47 by kb_ kb_
s/timeSetInterval/timeBeginPeriod. Sorry. :)
added on the 2021-09-04 21:29:35 by kb_ kb_
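For the QueryPerformanceCounter/QueryPerformanceFrequency suggestion, the arithmetic is roughly as follows. This is only a sketch: the helper name `ticks_to_seconds` is made up, and the Windows calls are mentioned only in comments so that the conversion itself stays portable.

```c
#include <assert.h>
#include <stdint.h>

/* Elapsed seconds between two performance-counter readings.
   On Windows, `start` and `now` would come from QueryPerformanceCounter()
   and `freq` (ticks per second) from QueryPerformanceFrequency(). */
double ticks_to_seconds(int64_t start, int64_t now, int64_t freq)
{
    assert(freq > 0);
    /* Subtract first, in integers, so precision is not lost when the
       raw counter values are large. */
    return (double)(now - start) / (double)freq;
}
```

Reading the counter once when the music starts and once per frame gives the demo time. The frequency is fixed at boot, so it only needs to be queried once.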
I think the last time DirectSound wasn't emulated was on Windows XP, so it's probably about time to move on anyway?
added on the 2021-09-04 21:34:22 by absence absence
Good to know. What's the lowest level or most native sound API these days then? (and is it a good alternative to PlaySound/sndPlaySound and DSound for size coding).
added on the 2021-09-05 22:24:15 by iq iq
The native API since Vista is WASAPI, but I'm not sure how it compares for size coding.
added on the 2021-09-05 23:32:18 by absence absence
I see. I haven't tried it myself so maybe there are shortcuts, but the official sequence of calls seems to be the following, which is long and heavy (average of 4 parameters each):

CoCreateInstance
GetDefaultAudioEndpoint
Activate
Initialize
GetService
GetBuffer
Start
------
Stop
ReleaseBuffer
Release
Release
Release

Clearly obsolete APIs are convenient for 4k intros. Bless Microsoft.
added on the 2021-09-06 05:57:50 by iq iq
Quote:
It gets worse: If you're using DirectSound, do not use GetCurrentPosition() for timing.
oops. We really should move to another API. I think the only reason we haven't is that the one time I looked into it, I discovered that WASAPI (in shared mode at least) requires you to use the device's sample rate, which was 48 kHz rather than the 44.1 kHz we currently expect. I feared that changing the sample rate would affect the quality of some of the instruments that alias (e.g. falcon), but I think this is mostly a non-issue, and even if it were, writing a proper resampler for two fixed rates isn't that bad either.

Quote:
So actually, as long as you don't use one of the current audio APIs, I'd advocate for timeSetInterval/timeGetTime
I disagree with recommending `timeGetTime` over `GetCurrentPosition` (see this PR which was motivated by timing issues in this intro which were apparently really bad on yx's machine at the time; ofc this is somewhat anecdotal evidence and a single case, but it's real). But otherwise,
Quote:
..or better QueryPerformanceCounter/-Frequency for timing the demo now. Over a couple minutes this should not diverge from sample time too much.
+1
added on the 2021-09-06 08:48:56 by ferris ferris
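To illustrate why two fixed rates make resampling manageable: 44100/48000 reduces to 147/160, so the read position can be tracked exactly in integers. The sketch below uses plain linear interpolation, which is an assumption made for brevity and not the "proper" resampler mentioned above (that would use a band-limited filter to control aliasing). The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* 44100 / 48000 reduces to 147 / 160: each output sample advances
   147/160 of an input sample. */
enum { STEP_NUM = 147, STEP_DEN = 160 };

/* Resample mono 44.1 kHz input to 48 kHz by linear interpolation.
   Returns the number of output samples written (at most out_cap). */
size_t resample_44k1_to_48k(const float *in, size_t in_len,
                            float *out, size_t out_cap)
{
    assert(in != NULL && out != NULL);
    size_t n = 0;
    for (; n < out_cap; ++n) {
        size_t pos = n * STEP_NUM;    /* position in 1/160ths of an input sample */
        size_t idx = pos / STEP_DEN;  /* input sample to the left */
        if (idx + 1 >= in_len)        /* need a right-hand neighbour */
            break;
        float frac = (float)(pos % STEP_DEN) / (float)STEP_DEN;
        out[n] = in[idx] + (in[idx + 1] - in[idx]) * frac;
    }
    return n;
}
```

Because the step is an exact integer ratio, the fractional position repeats every 160 output samples. A filter-based version could exploit that by precomputing 160 sets of coefficients.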
Quote:
I haven't tried it myself so maybe there are shortcuts, but the official sequence of calls seems to be the following
also haven't tried myself, but this looks reasonable, and likely insignificant for 64k at least. For 4k, yeah it's certainly unfortunate.
added on the 2021-09-06 08:52:37 by ferris ferris
