Problem with RayMarching using DistanceFunctions when applying Twist-Distortion.

category: general [glöplog]
vertex, geometry, pixel shaders. what else?
added on the 2009-09-26 14:18:17 by xTr1m
so in a year they will do less than today!?
Is someone being plain stupid or is it just my imagination?
A compute shader is a shader that allows you to compute anything, ANYTHING. Instead of being limited to processing vertices OR geometry OR fragments...
added on the 2009-09-26 14:54:11 by xTr1m
i guess 4k would be possible, since it doesn't use anything fancy. i mean, no acceleration structure and such. but i never tried 4k, so it's gonna be a bit tricky, and my asm is a little rusty. maybe i'll go for 64k first ;)
added on the 2009-09-27 21:53:40 by xortdsc
iq released a set of 4k/1k frameworks. Quite excellent!

added on the 2009-09-27 21:57:08 by Rob
ah cool. will check it out...
added on the 2009-09-27 22:06:53 by xortdsc
Speaking of shaders and whatnot: http://www.nvidia.com/object/gpu_technology_conference.html
anyone going? I'm packing my bag right now, the plane leaves in 8 hours ;)
added on the 2009-09-27 22:09:15 by kusma
CUDA in 4k? iq you've tried that, is it small enough?

yes, I ported Slisesix to CUDA, it was still 4k. But I got no speedup compared to GLSL or HLSL. I remember I had to do some tricks to get CUDA working in 4k because of some static data initialization. If I remember well, I had to hex-edit the exe to replace some address to have it working in Release. It was working well tho. I also remember having the Mandelbrot set in 1k with CUDA.
added on the 2009-09-27 22:38:08 by iq
there was a 4k intro in nvScene08, btw
added on the 2009-09-27 22:38:34 by iq
(with CUDA I mean)
added on the 2009-09-27 22:38:47 by iq
I found this on my disk, although the code is >1 year old, so I have no clue what CUDA version I used at the time...:

Code:
#include <windows.h>
#include <cuda_runtime_api.h>
#include <crt/host_runtime.h>
#include <__cudaFatFormat.h>
#include <vector_types.h>

#define XRES 1024
#define YRES 768

static DEVMODE screenSettings = {{0},0,0,156,0,0x001c0000,{0},0,0,0,0,0,{0},0,32,XRES,YRES,{0},0, 0,0,0,0,0,0,0,0};
static const BITMAPINFO bmi = { {sizeof(BITMAPINFOHEADER),XRES,YRES,1,32,BI_RGB,0,0,0,0,0},{0,0,0,0} };
static unsigned char buffer[XRES*YRES*4];

extern "C" void cudaMain(uchar4 *dst, const float time);

void entrypoint( void )
{
    HDC hDC = GetDC( CreateWindowEx(0,"static",0,WS_VISIBLE|WS_POPUP|WS_MAXIMIZE,0,0,0,0,0,0,0,0) );

    // frame buffer on the GPU
    uchar4 *devBuffer;
    cudaMalloc( (void**)&devBuffer, XRES*YRES*4 );

    long to = timeGetTime();
    do
    {
        // elapsed time in seconds (now minus start, so it counts upwards)
        float t = 0.001f*(float)(timeGetTime()-to);

        // render on the GPU, then copy the image back and blit it
        cudaMain( devBuffer, t );
        cudaThreadSynchronize();
        cudaMemcpy(buffer, devBuffer, XRES * YRES * sizeof(uchar4), cudaMemcpyDeviceToHost);
        StretchDIBits(hDC,0,0,XRES,YRES,0,0,XRES,YRES,buffer,&bmi,DIB_RGB_COLORS,SRCCOPY);
    }while( !GetAsyncKeyState(VK_ESCAPE));

    cudaFree( devBuffer );
}

Code:
// some functions go here...

__global__ void Mandelbrot0_sm10(uchar4 *dst, const float time)
{
    const int ix = blockDim.x * blockIdx.x + threadIdx.x;
    const int iy = blockDim.y * blockIdx.y + threadIdx.y;

    float x = -1.75f + 3.5f*(float)ix/(float)XRES;
    float y = -1.00f + 2.0f*(float)iy/(float)YRES;

    float an=time*0.15f;
    float si, co;
    sincosf(an,&si,&co);

    // lens distortion
    float r2 = x*x*0.32f + y*y;
    float tt = (7.0f-sqrt(37.5f-11.5f*r2))/(r2+1.0f);
    x*=tt;
    y*=tt;

    float3 rd = normalize(make_float3(x*co-si,y,co+x*si));
    float3 ro = make_float3(0.5f+1.4f*si,0.5f,1.5f-1.4f*co);

    float t;
    int matID, sumMatID;
    float3 xyz, rgb;
    cast_ray( ro, rd, 0.5f, 30.0f, t, xyz, matID, sumMatID );
    shade( rgb, xyz, rd, matID, sumMatID, t, make_float2(x,y) );
    pass2d( rgb, x, y );

    uchar4 color;
    color.x = __float2int_rn(255.0f*rgb.z);
    color.y = __float2int_rn(255.0f*rgb.y);
    color.z = __float2int_rn(255.0f*rgb.x);
    color.w = 255;
    dst[XRES*iy+ix] = color;
}

#define BLOCKDIM_X 8
#define BLOCKDIM_Y 8

extern "C" __host__ void cudaMain(uchar4 *dst, const float time)
{
    dim3 threads(BLOCKDIM_X, BLOCKDIM_Y);
    dim3 grid(XRES/BLOCKDIM_X, YRES/BLOCKDIM_Y);
    Mandelbrot0_sm10<<<grid, threads>>>(dst, time);
}
added on the 2009-09-27 22:50:18 by iq
xTr1m: A compute shader is still tied to (invoked per) cells of a 3D grid, but you get many extras like synchronization, shared memory (a table accessible by every thread), atomic operations, scattered writes (to render targets too, IIRC) and output streams, most of which will also work in pixel shaders. It still probably won't be well suited to doing just anything, but it should get really close to it.
added on the 2009-09-27 23:51:30 by KK
KK: I know, sounds like CUDA. Still, it feels more powerful, although it can be proven that pixel shaders and compute shaders are equally capable of doing the same tasks. Let's just say compute shaders are a different approach to solving problems, with a broader selection of candidate solutions. With a pixel shader, you know for sure you have to work with 2D fragment coordinates, uniforms and textures as inputs, writing to 2D output render targets. Talk about flexibility.
added on the 2009-09-28 00:43:49 by xTr1m
Like CUDA, but unlike CUDA, DX11 will be available out-of-the-box on compomachines. ;) As for "proving equally capable", I quite doubt it. The lack of scattered writes and variable output makes pixel shaders quite limited (e.g. consider rendering 3D IFS fractals using pixel shaders only vs. compute shaders).
added on the 2009-09-28 00:58:13 by KK

sorry to resurrect this thread, but there is something i don't understand while playing around with raymarching and 3d. maybe someone who is experienced in this domain can help me.

basically it works like this :
for each point of the screen you throw a ray
for each step you check if the ray hit your object
if you hit something you go back and use smaller steps until it is acceptable
when you detect the ray hit something how do you get the normal of the point that was hit ?

the best would be a formula that can give this for any given 3d point on your object's surface. for a sphere it is easy, but for a more complex object it is nearly impossible. i was thinking about just shooting another ray near the first point, then another one. then with the three points you have a triangle and can finally compute your normal... am i right?
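
The stepping scheme described in this post (fixed steps, then smaller steps around the first hit) could be sketched roughly like this in C++. It is only an illustration: `insideObject`, `marchFixedStep`, the unit sphere used as test object and the use of bisection for the "smaller steps" are all assumptions made for the example, not code from anyone in this thread.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

// Point at distance t along the ray ro + t*rd.
static Vec3 along(Vec3 ro, Vec3 rd, double t) {
    return { ro.x + t * rd.x, ro.y + t * rd.y, ro.z + t * rd.z };
}

// Hypothetical test object: a unit sphere centred at the origin.
static bool insideObject(Vec3 p) {
    return p.x * p.x + p.y * p.y + p.z * p.z < 1.0;
}

// Walk along the ray with a fixed step. Once a sample lands inside the
// object, bisect between the last outside sample and the first inside one
// until the interval is smaller than `tolerance`.
// Assumes the ray origin is outside the object.
// Returns the hit distance, or -1.0 if nothing was hit before tMax.
static double marchFixedStep(Vec3 ro, Vec3 rd, double step, double tMax,
                             double tolerance) {
    assert(step > 0.0 && tolerance > 0.0);
    double tOutside = 0.0;
    for (double t = step; t <= tMax; t += step) {
        if (insideObject(along(ro, rd, t))) {
            double tInside = t;
            while (tInside - tOutside > tolerance) {
                double mid = 0.5 * (tOutside + tInside);
                if (insideObject(along(ro, rd, mid))) tInside = mid;
                else tOutside = mid;
            }
            return 0.5 * (tOutside + tInside);
        }
        tOutside = t;
    }
    return -1.0;
}
```

Note that a fixed step can jump straight over objects thinner than the step size, which is one reason to look at distance functions instead.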
added on the 2010-04-07 20:58:58 by Tigrou
Usually with Distance Fields you don't need step 3 in your algorithm. You set a distance threshold and the intersections will automatically be as accurate as that threshold.

For the normals, you probably want to use a simple 6 point gradient approximation (a.k.a. central differences). more info here, but not a lot more.

In its simplest form, the rendering happens like this.

Code:
for( every pixel )
{
    // construct ray
    ro = cameraPosition;
    rd = raydirection( pixel );

    // cast ray
    for( t=0; ; )
    {
        pos = ro + t*rd;
        h = distField( pos );
        if( h<eps ) break;   // eps = the distance threshold mentioned above
        t += h;
    }

    // compute normal (central differences, e = small offset)
    nor = normalize( distField( pos+(e,0,0) ) - distField( pos-(e,0,0) ),
                     distField( pos+(0,e,0) ) - distField( pos-(0,e,0) ),
                     distField( pos+(0,0,e) ) - distField( pos-(0,0,e) ) );

    // colorize the pixel
    color = shade( pos, nor );
}
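
For anyone who wants to try the loop above outside a shader, here is a minimal self-contained C++ sketch of the same idea for a single ray. Everything specific in it is an assumption for the sake of the example: the scene is just a unit sphere, and `castRay`, `calcNormal`, the step limit `maxSteps` and the far distance `tMax` are names and safeguards added here, not part of the pseudocode.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

static Vec3 add(Vec3 a, Vec3 b) { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
static Vec3 scale(Vec3 a, double s) { return { a.x * s, a.y * s, a.z * s }; }
static double length(Vec3 a) { return std::sqrt(a.x * a.x + a.y * a.y + a.z * a.z); }

// Hypothetical scene: signed distance to a unit sphere at the origin.
static double distField(Vec3 p) { return length(p) - 1.0; }

// Step along the ray by the distance the field reports, and stop as soon
// as that distance drops below `threshold`.
// Returns the hit distance, or -1.0 if nothing was hit.
static double castRay(Vec3 ro, Vec3 rd, double threshold, double tMax,
                      int maxSteps) {
    assert(threshold > 0.0);
    double t = 0.0;
    for (int i = 0; i < maxSteps && t < tMax; ++i) {
        double h = distField(add(ro, scale(rd, t)));
        if (h < threshold) return t;
        t += h;
    }
    return -1.0;
}

// 6-point gradient approximation (central differences) of the field,
// normalized to give the surface normal at p.
static Vec3 calcNormal(Vec3 p, double e) {
    Vec3 g = {
        distField({ p.x + e, p.y, p.z }) - distField({ p.x - e, p.y, p.z }),
        distField({ p.x, p.y + e, p.z }) - distField({ p.x, p.y - e, p.z }),
        distField({ p.x, p.y, p.z + e }) - distField({ p.x, p.y, p.z - e })
    };
    return scale(g, 1.0 / length(g));
}
```

A ray starting at (0,0,-3) and looking down +z should hit this sphere at t close to 2, and the normal there should point back towards the camera. The offset `e` is a trade-off: too large blurs small details, too small runs into floating point noise.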

added on the 2010-04-07 21:15:52 by iq
thx for that post iq
added on the 2010-04-07 21:33:29 by Tigrou