pouët.net

Fast Plasma Effect

category: general [glöplog]
 
Hi !

I have written some bad C# code using SDL.NET. I want to make a fast plasma effect. It's meant for an animated background, so it must be quick :).

Can someone suggest faster ways to edit a surface in SDL? See the code for yourself.
Code:
public virtual void UpdateEffect()
{
    int w = surface.Width;
    int h = surface.Height;

    if (cls == null)
    {
        cls = new int[w, h];
        for (int x = 0; x < w; x++)
        {
            for (int y = 0; y < h; y++)
            {
                cls[x, y] = (int)(
                    (127.5 + 127.5 * Math.Sin(x / 32.0))
                    + (127.5 + 127.5 * Math.Sin(y / 32.0))
                    + (127.5 + 127.5 * Math.Sin(Math.Sqrt(x * x + y * y) / 32.0))
                ) / 3;
            }
        }
    }

    Color[,] colors = new Color[w, h];
    paletteShift = Convert.ToInt32(Environment.TickCount / 1000);
    for (int x = 0; x < w; x++)
    {
        for (int y = 0; y < h; y++)
        {
            colors[x, y] = palette[(cls[x, y] + paletteShift) % 255];
        }
    }
    surface.SetPixels(new Point(0, 0), colors);
}
added on the 2007-10-10 00:40:58 by Marijn
SDL is slow by design - if you're using C# and .NET, why don't you just use DirectDraw? It would save you the hassle - at least most of it... (sure, it's deprecated, but only because everything is hardware-accelerated now)
added on the 2007-10-10 02:01:41 by Gargaj
Exploring sw rendering can be a nice thing.

Some quick random hints looking at the code:

1) Always try to use look-up tables for computationally heavy functions like sin, cos, tan, sqrt and friends (this should give you a good speed-up...)

2) Using integer / fixed-point math can still improve speed a lot, especially when precision isn't a must

3)
- multiply by (1/x) instead of dividing by x (if x is constant)
- when doing modulus operations where the divisor is a power of two (x % 2^n) (in your code it should be "... % 256") you can replace it with a faster bitwise AND, "x & ((2^n) - 1)", i.e. x % 256 <-equals-> x & (256 - 1) for non-negative x.
- when multiplying/dividing integers by powers of two you can use bit shifting (i.e. x * 32 <--> x << 5). Compilers probably already do these last two kinds of optimizations (bitwise operations).
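
The power-of-two identities in hint 3 can be checked with a small sketch. The class and method names are made up for illustration, and whether any of this is actually faster than what the JIT already emits is an assumption you would have to measure.

```csharp
using System;

public static class PowerOfTwoTricks
{
    // Wraps an index into [0, 256) with a bit mask.
    // Gives the same result as index % 256 whenever index >= 0.
    public static int WrapMask(int index) => index & 255;

    // The modulo version, kept for comparison.
    public static int WrapModulo(int index) => index % 256;

    // Multiplying an integer by 32 is the same as shifting left by 5 bits.
    public static int TimesThirtyTwo(int x) => x << 5;

    // Dividing by a constant can be written as multiplying by its reciprocal.
    // 1/32 is exactly representable, so this matches x / 32.0 bit for bit.
    public static double DivideBy32(double x) => x * (1.0 / 32.0);
}
```

Note that the two wrap functions only agree for non-negative input: in C#, -1 % 256 is -1, while -1 & 255 is 255, so the mask is also the safer choice for a palette index.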

I also found this site, maybe you'll find it useful: http://student.kuleuven.be/~m0216922/CG/index.html.
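
A minimal sketch of hints 1 and 2 together: a 256-entry sine table built once, then an integer-only plasma value per pixel. The class name, the table size and the way the three waves are combined are assumptions for illustration. The table has a period of 256 pixels, not the roughly 201 pixels (2π · 32) of the original code, so the pattern will look slightly different.

```csharp
using System;

public static class SineTable
{
    // Power-of-two size, so the index can wrap with a bit mask.
    public const int Size = 256;

    private static readonly byte[] Table = Build();

    private static byte[] Build()
    {
        var table = new byte[Size];
        for (int i = 0; i < Size; i++)
        {
            // One full sine period over the table, scaled from [-1, 1] to [0, 255].
            double s = Math.Sin(i * 2.0 * Math.PI / Size);
            table[i] = (byte)(127.5 + 127.5 * s);
        }
        return table;
    }

    // Table lookup; the mask also wraps negative phases into range.
    public static int Lookup(int phase) => Table[phase & (Size - 1)];

    // Integer-only plasma value in [0, 255] for pixel (x, y) at time t.
    public static int Plasma(int x, int y, int t)
    {
        return (Lookup(x + t) + Lookup(y - t) + Lookup(x + y + t)) / 3;
    }
}
```

After the table is built, the per-pixel work is three array reads, a few additions and one integer division, with no call to Math.Sin or Math.Sqrt.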
added on the 2007-10-10 02:46:56 by bdk
actually, cls doesn't seem to be time-dependent at all, so just compute it once (you don't even need sine tables for that, woohoo!).

and using an actual paletted image format and actual palette rotation instead of doing a paletted->truecolor conversion by hand in managed code each frame (it should be either %256 or &255, by the way) should make it run blazingly fast without any real work :)
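
A rough sketch of the palette rotation idea, assuming the palette is a plain array of 256 packed colour values: the pixel indices stay untouched and only the palette entries move each frame. The class and method names are hypothetical, and how the rotated palette is handed to an 8-bit SDL.NET surface is deliberately left out because it depends on the library's API.

```csharp
using System;

public static class PaletteRotation
{
    // Fills destination with the palette rotated by shift entries:
    // destination[i] == palette[(i + shift) mod palette.Length].
    // The destination array is passed in so nothing is allocated per frame.
    public static void Rotate(int[] palette, int shift, int[] destination)
    {
        int n = palette.Length;
        // Normalise the shift into [0, n), also for negative values.
        int s = ((shift % n) + n) % n;
        for (int i = 0; i < n; i++)
        {
            destination[i] = palette[(i + s) % n];
        }
    }
}
```

Per frame this touches 256 entries instead of width × height pixels, which is where the speed-up would come from.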
added on the 2007-10-10 03:03:48 by ryg
ryg: smashing your lcd gives an even cooler plasma-effect in no-time!
added on the 2007-10-10 03:09:36 by kusma

IF the surface width is a power of 2 and the total pixel count is a multiple of 64
IF your buffers are in user memory
then try this

Change your loops from this

    buf = new int[w, h];

    for (int x = 0; x < w; x++)
    {
        for (int y = 0; y < h; y++)
        {
            buf[x, y] = code here
            ...

to something like this

    int Log2(int val)
    {
        ASSERT(val > 0);
        int res = -1;
        while ((1 << ++res) <= val);
        return (res - 1);
    }

    mask = w - 1;          // w must be a power of 2, remember
    bitshift = Log2(w);
    pixelcount = w * h;

    // initialize buffer like this
    buf = new int[pixelcount];

    // each outer iteration covers 64 pixels: the inner loop advances j by 16,
    // then "j += 48" skips the pixels already written at j+16, j+32 and j+48
    for (int j = 0; j < pixelcount; j += 48)
    {
        for (int i = 0; i < 16; ++i, ++j)
        {
            // i is never used

            // if you need x and y, here is how to compute them
            x = j & mask;
            y = j >> bitshift;
            buf[j] = palette[(cls[x, y] + paletteShift) % 255];

            x = (j + 16) & mask;
            y = (j + 16) >> bitshift;
            buf[j + 16] = palette[(cls[x, y] + paletteShift) % 255];

            x = (j + 32) & mask;
            y = (j + 32) >> bitshift;
            buf[j + 32] = palette[(cls[x, y] + paletteShift) % 255];

            x = (j + 48) & mask;
            y = (j + 48) >> bitshift;
            buf[j + 48] = palette[(cls[x, y] + paletteShift) % 255];
        }
    }

This memory access method gave me a better speed increase than anything else (in C).

The moral here is: don't access big buffers in user memory sequentially.
added on the 2007-10-10 06:56:55 by duffman
Damn, I can't post code correctly
added on the 2007-10-10 06:58:16 by duffman
I use the magic number 16 because
16 ints = 16 x 4 bytes = 64 bytes = the cache line size of my CPU

But I'm not sure this is the reason for the speed increase
added on the 2007-10-10 07:08:54 by duffman
For the first program, in SDL you can use the function from the documentation to edit the surface.
But it's very slow.
I doubt that using sin tables would speed up anything, only the opposite. At least, lookup tables slow down shaders.
added on the 2007-10-10 09:07:32 by imbusy
Allocating memory in a time-critical function called every frame isn't a good idea...
And can't SDL automagically convert to truecolor if you specify an 8-bit surface with a palette? I vaguely remember seeing some functions that did that. They might be optimized already, so at least that should work a bit quicker.
added on the 2007-10-10 10:13:15 by raer
isn't cls allocated once already? and aren't the sin tables calculated only once anyway? and couldn't you avoid these shifts and ands by using proper loops, which you can predict by using multiples of 64 as the texture size? sorry, kidding, but it seems that this one can be improved a lot.. think about 8-bit mode and palette cycling, or making the palette double-sized (and duplicated) so you can avoid the and (%255) at each pixel.. there are more possibilities, i suppose.. ;) maybe i'm wrong..
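
the doubled palette could look roughly like this. it's only a sketch: the names are made up, and it assumes both the pattern value and the shift stay in [0, palette length), so their sum never runs past the end of the doubled array.

```csharp
using System;

public static class DoubledPalette
{
    // Returns the palette followed by a second copy of itself.
    public static int[] Build(int[] palette)
    {
        var doubled = new int[palette.Length * 2];
        Array.Copy(palette, 0, doubled, 0, palette.Length);
        Array.Copy(palette, 0, doubled, palette.Length, palette.Length);
        return doubled;
    }

    // value and shift must each be in [0, doubled.Length / 2).
    // No mask or modulo is needed: the second copy absorbs the overflow.
    public static int Lookup(int[] doubled, int value, int shift)
    {
        return doubled[value + shift];
    }
}
```

the shift itself still has to be wrapped once per frame, but that is one operation instead of one per pixel.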
added on the 2007-10-10 10:29:05 by mad
Quote:
I doubt that using sin tables would speed up anything, only the opposite. At least, lookup tables slow down shaders.


Of course they would. I once tried using math.sin per pixel instead of lookup tables for my regular plasmas, and it was 3 times slower. Well, that could be different for shaders, I guess.

p.s. Hmm, a plasma with sqrt? I never tried this equation; I wonder what kind of shape it shows!
added on the 2007-10-10 10:33:26 by Optimus
but these "new"s hurt, are they really required here? that's something which is always (or at least mostly) a bad idea.
added on the 2007-10-10 10:42:46 by mad
sqrt... well, i guess it's fast on pc, lol ;)

the sin(sqrt(x*x+y*y)) shows circular ripples..

it's amazing that a plasma with sdl (or directx) is programmable in 30 seconds, flat. when i did my first plasma in '96 (or was it 95?) it cost me days doing all those luts and palette calcs in assembler. </oldfart>
added on the 2007-10-10 11:31:18 by earx
1. don't re-allocate colors[][] in every frame, just make it a class member variable like cls[][].
2. replace %255 by &255, because %255 is wrong and &255 is faster -- or eliminate it altogether, as mad suggested
3. don't use managed code :)
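
Points 1 and 2 could be sketched like this. The class name is made up, the colours are assumed to be packed ints in a 256-entry palette, and copying the finished frame to the SDL.NET surface is left out:

```csharp
using System;

public sealed class PlasmaFrame
{
    private readonly byte[] cls;    // plasma pattern, computed once in the constructor
    private readonly int[] frame;   // output buffer, allocated once and reused every frame

    public PlasmaFrame(int width, int height)
    {
        cls = new byte[width * height];
        frame = new int[width * height];
        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                // Same three waves as the original code, each in [0, 255].
                double sum = (127.5 + 127.5 * Math.Sin(x / 32.0))
                           + (127.5 + 127.5 * Math.Sin(y / 32.0))
                           + (127.5 + 127.5 * Math.Sin(Math.Sqrt(x * x + y * y) / 32.0));
                cls[y * width + x] = (byte)((int)sum / 3);
            }
        }
    }

    // Fills the reused buffer; palette must have 256 entries.
    public int[] Render(int[] palette, int paletteShift)
    {
        for (int i = 0; i < frame.Length; i++)
        {
            // & 255 keeps the index in [0, 255], including entry 255.
            frame[i] = palette[(cls[i] + paletteShift) & 255];
        }
        return frame;
    }
}
```

Render returns the same array on every call, so the per-frame path does no allocation at all.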
added on the 2007-10-10 11:34:25 by KeyJ
people talking about bit manipulation to speed up a plasma effect on pc in 2007. who's trolling now?
thnx for the reactions
added on the 2007-10-10 17:08:28 by Marijn
it's not a big thing (perhaps even optimized away by the compiler) but you also don't have to do new Point(0, 0) every frame, just store it as a class member... and you could do the "if (cls == null)" thing in some function which is executed before the first draw (one check less).
added on the 2007-10-10 17:36:52 by src
