chunky to planar

category: code [glöplog]

Well, I fail to see what is new - apart from using fewer bitplanes.

added on the 2006-09-28 19:48:01 by Stelthzje Stelthzje
Oswald: OooO_ooh.. The hate is swelling in you now.

Really, C64 demos are lame. It's like they all use the same palette, and only like 16 cols or summin. Like a little bit of originality would kill you. I also think if the C64 coders knew more about cache optimization, that'd help a lot. And I hate how AHX is so popular on the C64. I mean, it's a nice sound and all, but really, it's been done too much already.
added on the 2006-09-28 20:06:03 by doomdoom doomdoom
Stelthz: The speed! The color adjust prediction trick for the second pixel is also new. In the Ludde c2p a sprite screen was used to mask an ugly black pixel mask; I use the 5th bitplane instead.
The CPU pass actually just ORs the nibbles together and is not a real merge. Ludde's 1995 c2p had around 30 instructions per longword write.
(old c2p merge technology and "real merges").

This c2p loop can be unrolled for faster results.

.swap4c2p:
movem.l (a0)+,d0-d1
lsl.l #4,d0
lsl.l #4,d1
or.l (a0)+,d0
or.l (a0)+,d1
movem.l d0-d1,(a1)+

cmpa.l a2,a1
bne.b .swap4c2p

To make an even faster c2p, one blitter pass can be removed by rearranging the byte order of the chunky buffer (scrambling).
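For anyone who reads C more easily than 68k, here is a rough model of what the loop above computes. It is only a sketch: the function name `swap4_or` is made up, and it assumes each chunky byte carries a 4-bit pixel in its low nibble with the high nibble clear, which is what makes a plain OR sufficient.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C model of the loop: per iteration, read four longwords,
   shift the first two left by one nibble and OR in the next two.
   Assumes every source byte has its high nibble clear. */
void swap4_or(const uint32_t *src, uint32_t *dst, size_t groups)
{
    for (size_t i = 0; i < groups; i++) {
        uint32_t d0 = src[0] << 4;   /* lsl.l #4,d0 */
        uint32_t d1 = src[1] << 4;   /* lsl.l #4,d1 */
        d0 |= src[2];                /* or.l (a0)+,d0 */
        d1 |= src[3];                /* or.l (a0)+,d1 */
        dst[0] = d0;
        dst[1] = d1;
        src += 4;
        dst += 2;
    }
}
```

Feeding it 0x01020304, 0x05060708, 0x0A0B0C0D, 0x0E0F0102 gives 0x1A2B3C4D and 0x5E6F7182, i.e. two pixels packed per byte.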
added on the 2006-09-28 21:06:31 by sp^ctz sp^ctz
I had to make some modifications. This might be done faster on the MC68000 by moving words from the chunky buffer. 14 instructions for 16 pixels is still pretty good.

CPU pass code, not optimized (and not pipelined for 020+):


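.swap4c2p:
movem.l (a0)+,d0-d3

move.l d1,d4 ;swap16 (1X2) (3X4)
move.l d3,d5 ;copy the second register of the pair, same as d1 above
move.w d0,d4
move.w d2,d5 ;low word of the first register of the pair
swap d4
swap d5
move.w d4,d0 ;1
move.w d5,d2 ;3
move.w d1,d4 ;2
move.w d3,d5 ;4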

lsl.l #4,d0 ;Or nibbles (swap4)
lsl.l #4,d2
or.l d0,d4
or.l d2,d5

movem.l d4-d5,(a1)+

dbf d7,.swap4c2p
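A C model of one iteration, in case it helps to check the data flow. This is a sketch of the swap as the `swap16 (1X2) (3X4)` comment describes it, followed by the nibble OR. The function names are invented for illustration, and it again assumes the high nibble of every chunky byte is clear.

```c
#include <stdint.h>

/* 16-bit swap between two longwords: afterwards *a holds both high
   words and *b holds both low words. */
void swap16(uint32_t *a, uint32_t *b)
{
    uint32_t hi = (*a & 0xFFFF0000u) | (*b >> 16);
    uint32_t lo = (*a << 16) | (*b & 0x0000FFFFu);
    *a = hi;
    *b = lo;
}

/* One pass over four input longwords, producing two output longwords. */
void c2p_cpu_pass(const uint32_t in[4], uint32_t out[2])
{
    uint32_t d0 = in[0], d1 = in[1], d2 = in[2], d3 = in[3];

    swap16(&d0, &d1);            /* (1X2) */
    swap16(&d2, &d3);            /* (3X4) */

    out[0] = d1 | (d0 << 4);     /* lsl.l #4,d0 / or.l d0,d4 */
    out[1] = d3 | (d2 << 4);     /* lsl.l #4,d2 / or.l d2,d5 */
}
```

With the input 0x01020304, 0x05060708, 0x090A0B0C, 0x0D0E0F00 this yields 0x13245768 and 0x9BACDFE0.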
added on the 2006-09-28 23:32:48 by sp^ctz sp^ctz
movem.l d4-d5,(a1)+

I'd like to see you doing that. :D
added on the 2006-09-28 23:36:08 by StingRay StingRay
Re StingRay: movem.l d4/d5,(a1)+? Or two moves.
This might be a faster MC68000 loop that does the same thing as the code above.


.loop:
move.w (a0)+,d0
lsl.w #4,d0
or.w (a0)+,d0
move.w d0,(a1)+


dbf d7,.loop
added on the 2006-09-28 23:45:39 by sp^ctz sp^ctz
Hehe, I checked now: only -(a1) is legal, so two moves it is.
added on the 2006-09-28 23:51:46 by sp^ctz sp^ctz
Exactly. :D
added on the 2006-09-28 23:54:52 by StingRay StingRay
Fullscreen 160*128 c2p timings (WinUAE configured to match A500 speed).
Timed with a CIA timer. I still need to run the test on a real A500. Anyone?

Scrambled loop: 215 rasterlines (first suggestion)
Longword loop: 282 rasterlines
Word loop: 354 rasterlines
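To put the rasterline counts in perspective, here is a small helper that converts them to frame time. It assumes a PAL machine with about 312 rasterlines per 50 Hz frame; the post does not say PAL or NTSC, so treat that constant and the function name as assumptions.

```c
/* Assumption: PAL, roughly 312 rasterlines per 50 Hz frame. */
#define PAL_LINES_PER_FRAME 312

/* Fraction of one frame that a routine taking `rasterlines` uses. */
double frames_used(int rasterlines)
{
    return (double)rasterlines / PAL_LINES_PER_FRAME;
}
```

Under that assumption the scrambled loop takes about 0.69 of a frame, the longword loop about 0.90, and the word loop about 1.13, so only the first two would fit the c2p alone inside a single frame.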
added on the 2006-09-29 00:45:12 by sp^ctz sp^ctz
I can test it on one of my A500's here if you like?
added on the 2006-09-29 01:16:41 by StingRay StingRay
sp, are these measurements for the c2p only, or for the combined routine + c2p? I should dig out my sources from the trashcan 3 intro to "rediscover" how fast my c2p'ing was back then...

I recall the 160x100 size 2x2 res 2 bpl rotozoomer ran at 25fps...

doom: I reaffirm the opinion that we are entitled to * H * A * T * E * you for the next 5 years for reinserting the c2p drama into our scene :P
added on the 2006-09-29 11:41:39 by winden winden
sp+stingray, something else to test... i recall a real a500 ran a bit slower when upping from 4bpl to 5bpl, and then really a lot slower when upping from 5bpl to 6bpl... so if UAE is not doing cycle-exact chipmem-bus emulation, it will definitely change the timing on the real machine...
added on the 2006-09-29 11:51:11 by winden winden
Winden: exactly! :)
added on the 2006-09-29 12:34:13 by StingRay StingRay
Doom, you are an ugly troll.
added on the 2006-09-29 14:31:30 by Oswald Oswald
Really interesting thread. Now I understand how a simple A500 could run such fast flat polygon fillers with few colors...
added on the 2006-10-02 12:06:59 by texel texel

Good reason to code another A500 killer demo. It's too bad so few people do. The A500 deserves to live on as a demo platform. It's much better defined than some crazy PPC Amiga with a two-digit user count.

added on the 2006-10-02 17:52:10 by Stelthzje Stelthzje
texel: hooray for xor fillers.

(and no, that's not an amiga invention, this technique is pretty old :)
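For readers who have not met the idea, a minimal sketch of an XOR fill on one row, under the usual reading of the term: only the outline pixels are plotted, then a running XOR across the row toggles "inside" at each one. The function name and the byte-per-pixel layout are illustrative assumptions, not anything from a particular demo.

```c
#include <stddef.h>
#include <stdint.h>

/* edge[x] is 1 where the polygon outline crosses pixel x of this row.
   The running XOR flips between outside (0) and inside (1) at each
   edge pixel, so the span between two edges comes out filled. */
void xor_fill_row(const uint8_t *edge, uint8_t *out, size_t width)
{
    uint8_t inside = 0;
    for (size_t x = 0; x < width; x++) {
        inside ^= edge[x];
        out[x] = inside;
    }
}
```

With edge pixels at x=2 and x=6 in an 8-pixel row, pixels 2 to 5 come out set and the rest stay clear.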
added on the 2006-10-02 18:07:03 by ryg ryg
Well, it's been another 5 years, why not bump it?
So, since then, what's the state of the art of c2p? I just discovered the Amiga CD32 had it in hardware, but everyone says it's useless. Then @lx bumped me to this thread, sorry.
added on the 2011-01-31 13:13:29 by MsK` MsK`
...I'm still doing my stuff on bitplanes and pestering the custom chips ... and I'm happy with it
added on the 2011-01-31 13:32:11 by d0DgE d0DgE
Thread revival ftw! :)

My approach is "Another intro, another C2P". The interesting part is not as much the C2P itself as the things that are mixed into it to fill the cycles otherwise wasted waiting for chip writes to complete.

To give some examples, here's what I have merged into the C2P in some of my intros:

Ikanim: Motion blur
Noxie: 4-point radial or axial blur, dithering
Planet Loonies: Antialiased downscaling
Rapo Diablo 5000: Color interpolation and environment mapping
Luminagia: Dithering (interleaved with rendering per scanline; final pass and expansion to double width done by the blitter)
Ikadalawampu: Clamping, dithering

Finding something suitable to interleave with the C2P is the heart of "modern" Amiga coding IMO.
added on the 2011-01-31 14:46:09 by Blueberry Blueberry
msk, read my answer on demoscene.fr, and the links to ada's forum :)
added on the 2011-01-31 16:18:49 by krabob krabob
By the way, Blueberry, I can't find the thread on ADA's forum where winden gave that clamping trick with no tests on integers... where is it?
added on the 2011-01-31 16:22:14 by krabob krabob
"modern" Amiga coding *chuckles*
added on the 2011-01-31 17:13:51 by kusma kusma
@krabob: IIRC it's hidden somewhere here:
added on the 2011-01-31 17:23:33 by baah baah
krabob: Are you thinking of this thread? This was indeed the thread that gave me the idea for the clamping in Ikadalawampu, which in turn inspired the whole rendering engine. :)
added on the 2011-01-31 22:50:01 by Blueberry Blueberry