pouët.net

guess what this function does

category: general [glöplog]
static inline uint32_t YeahImNotGonnaGiveYouTheNameOfCourse(uint32_t v)
{
uint32_t tmp = v - ((v >> 1) & 0x55555555);

tmp = (tmp & 0x33333333) + ((tmp >> 2) & 0x33333333);

uint32_t r = ((tmp + (tmp >> 4) & 0xF0F0F0F) * 0x1010101) >> 24;

return r;
}

BTW I didn't come up with this
added on the 2008-12-23 22:44:55 by duffman duffman
i'll let ryg do this, he's into this kinky stuff.
added on the 2008-12-23 22:48:18 by Gargaj Gargaj
That function counts the number of set bits in a value.
added on the 2008-12-23 22:49:23 by xernobyl xernobyl
wild stab in the dark, its one of those interview questions about counting the number of 1 bits ? There are millions of variations but this looks like one of them.
added on the 2008-12-23 22:49:43 by auld auld
http://gurmeetsingh.wordpress.com/2008/08/05/fast-bit-counting-routines/

I must say that __builtin_popcount in GNU was new to me.

Anyway, yeah, one of the most pointless things to know.
added on the 2008-12-23 23:00:59 by snoutmate snoutmate
SSE4.2 has POPCNT
added on the 2008-12-23 23:33:19 by misioslaw misioslaw
....aaaand the function is taken from this site.
added on the 2008-12-24 00:26:48 by kusma kusma
How long till Adok finds this thread and tries to impress everyone?
added on the 2008-12-24 00:31:01 by Shockwave Shockwave
Counting bits isn't useless.
added on the 2008-12-24 00:38:14 by doomdoom doomdoom
yes, i'm into this kind of stuff, and yes, it's population count. oooold!

i'll shoot a few back that should be a bit more fun:
Code:static inline uint32 mystery1(uint32 a,uint32 b) { return (a & b) + (((a ^ b) >> 1) & 0x7f7f7f7f); } static inline uint32 mystery2(uint32 a,uint32 b) { return ((a & ~b) & 0x80808080u) ^ ((a & 0x7f7f7f7f) + (b & 0x7f7f7f7f)); }

(both pixel manipulation stuff)

the advantage of this kind of tricks is (or well, was) that they are easy to generalize to "non-homogenous" pixel formats (e.g. rgb565), where you can't simply use mmx instructions :)

though my favorite one by far is this (i've seen it first described by blinn):
Code:int mul255(int a,int b) // =round(a*b/255) for a,b small enough (-255<=a,b<=255 always works) { int t = a*b + 128; return (t + (t>>8)) >> 8; }


as for bit tricks that are actually useful, the "chunky to planar-esque" bitwise-transpose tricks often come in handy in SIMD code. even though you really only need the key insight (decomposing a matrix transpose into several block matrix transposes) and then can do the rest with pack/unpack/shift instructions (without the xor trickery).

time for link galore!
http://aggregate.org/MAGIC/
http://graphics.stanford.edu/~seander/bithacks.html
http://home.pipeline.com/~hbaker1/hakmem/hakmem.html
http://realtimecollisiondetection.net/blog/?p=90
http://realtimecollisiondetection.net/blog/?p=78
and knuth has a pretty good (and systematic) collection too.

and with that, i shall leave you for the evening. merry christmas.

(this post was brought to you by the number 2 and the letter pi).
added on the 2008-12-24 00:46:52 by ryg ryg
The first one looks like it is an average of two 4 component colours (0..255) packed into a and b. If true, thats a super cool trick.

I cant get the second one. At a guess it does the same for signed values?
added on the 2008-12-24 02:25:11 by auld auld
poops on a baby
Adok need not impress anyone. :)

ryg: Write something for Hugi!
added on the 2008-12-24 10:43:34 by Adok Adok
merry christmas to you too.
added on the 2008-12-24 11:10:49 by rudi rudi
duffman: yeah, nice bit counter. it proceeds from fine grain to coarse, that was easy. but the first step.. i wouldn't have thought of that.
added on the 2008-12-24 11:28:04 by earx earx
ryg:
..and since they will release everything without further ado you really can send him anything... GUTEN RRRRRUTSCH, adok!
added on the 2008-12-24 11:30:49 by v3nom v3nom
ryg: what i always did was:

return (a>>1)&0x7f7f7f7f + (b>>1)&0x7f7f7f7f;

it wasn't 4-component pixel manip. but for water/fire effects. it's nice that your version would probably be faster on some implementations (a shift traded against an ALU operation).
added on the 2008-12-24 11:41:19 by earx earx
correction: with implementations i mean cpu architectures.
added on the 2008-12-24 11:42:16 by earx earx
earxmas: The difference being that yours has some undesirable rounding behaviour. E.g. the average of 0xff and 0xff becomes 0xfe. Oh and you need some more brackets. :)
added on the 2008-12-24 13:27:16 by doomdoom doomdoom
xmas device: yeah, that's true.

.. come to think of it, i probably did shift and mask afterwards, which made it alot faster.
added on the 2008-12-24 14:21:03 by earx earx
Guess what this is..

*hint* It has something to do with 565 packed pixels...

Code: unsigned int inl_WtfPixelOp(unsigned int a, unsigned int b) { unsigned int sum = a+b; unsigned int mask = ((sum & 0x08010020)>>5)*63; return (sum | mask) & 0x07E0F81F; }
added on the 2008-12-24 14:23:34 by torus torus
my second function was simply (a+b-128) without saturation.

torus: saturating add. green is in the top 16bits, rb in the low 16bits.
added on the 2008-12-24 14:46:25 by ryg ryg
v3nomsoup: "Guten Rutsch" - same to you! But ryg's contributions could even be published uncontrolled. He writes only politically correct articles. :)
added on the 2008-12-24 14:49:14 by Adok Adok
ryg, that's it.. saturating add.

btw - on the C64x+ DSP I'm using the following one:

Code: unsigned int inl_SatAdd565 (ui16 a, ui16 b) { // for C64x+ DSP unsigned int mask = ((15<<11)|(31<<5)|15)<<16; unsigned int temp = _deal (_shfl (a) + _shfl (b | mask)); unsigned int color = temp; unsigned int carry = temp; if (carry & 0x80000000) color |= 0xf800; if (carry & 0x04000000) color |= 0x07e0; if (carry & 0x00100000) color |= 0x001f; return color; }


_shfl interleaves an integer bit-wise with 0-bits (e.g. Pow (x,2) over the Galious Field 2). _deal does more or less the opposite (sorts even bits into MSB16 and odd bits into LSB16).

It took me quite a while to come up with this one. It runs roughly twice as fast as the version I've posted above and does not need interleaving of the color-channels.
added on the 2008-12-24 15:01:59 by torus torus
Quote:
ryg's contributions could even be published uncontrolled. He writes only politically correct articles. :)

ryg, you know what to do.
added on the 2008-12-24 15:22:12 by Gargaj Gargaj

login