
1 Brook for GPUs

2 May 6, 2004 Status
Brook
–Current efforts toward supporting Reservoir RStream compiler
–Brook version 0.2 spec: http://merrimac.stanford.edu
Brook GPU
–Parallel effort to RStream compiler
–Developed at Stanford
–Brook for GPUs: http://brook.sourceforge.net

3 Brook language
stream-kernel program model unchanged
some gpu-specific additions…
    kernel void foo (float a<>, float b<>, out float result<>) {
        result = a + b;
    }
    float a<100>;
    float b<100>;
    float c<100>;
    foo(a,b,c);
equivalent to:
    for (i=0; i<100; i++)
        c[i] = a[i]+b[i];
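The kernel call on this slide behaves like an element-wise map over the streams. A minimal CPU sketch of that idea in plain C++ follows; `foo_element` and `run_foo` are hypothetical names chosen for illustration and are not part of Brook or its runtime.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the kernel body: one output element is computed
// from one element of each input stream.
inline float foo_element(float a, float b) { return a + b; }

// Apply the kernel body at every stream position. No iteration reads the
// result of another, which is what lets a GPU evaluate them in parallel.
std::vector<float> run_foo(const std::vector<float>& a,
                           const std::vector<float>& b) {
    assert(a.size() == b.size());
    std::vector<float> result(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        result[i] = foo_element(a[i], b[i]);
    }
    return result;
}
```

Calling `run_foo` on two 100-element vectors mirrors the `for` loop shown on the slide.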

4 Brook language
gather streams
gpu bonus
    kernel void foo (float a<>, float b<>, float t,
                     float array[], out float result<>) {
        result = array[a] + t*b;
    }
    float a;
    float b;
    float c;
    float array;
    foo(a,b,3.2f,array,c);
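A gather argument lets each kernel instance read an arbitrary element of an array, indexed here by the value of stream `a`. The following CPU sketch in C++ illustrates that access pattern; `run_gather` is a hypothetical name, and truncating the float index to an integer is an assumption, since the slide does not say how non-integer indices are handled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU sketch of: result = array[a] + t*b
// a and b are streams (one element per kernel instance); array is a gather
// argument that any instance may index freely; t is a constant.
std::vector<float> run_gather(const std::vector<float>& a,
                              const std::vector<float>& b,
                              float t,
                              const std::vector<float>& array) {
    assert(a.size() == b.size());
    std::vector<float> result(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        // Assumption: the float index is truncated toward zero.
        const std::size_t idx = static_cast<std::size_t>(a[i]);
        assert(idx < array.size());
        result[i] = array[idx] + t * b[i];
    }
    return result;
}
```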

5 Brook language
iterator streams
gpu bonus
    kernel void foo (float a<>, float b<>, float t,
                     float array[], iter float n<>,
                     out float result<>) {
        result = array[a] + t*b + n;
    }
    float a;
    float b;
    float c;
    float array;
    iter float n = iter(0, 10);
    foo(a,b,3.2f,array,n,c);
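An iterator stream supplies generated values instead of stored data. The sketch below assumes that `iter(start, end)` over a stream of a given length yields evenly spaced values beginning at `start` and stopping short of `end`; the slide does not state this rule, and `make_iter` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Materialize an iterator stream on the CPU.
// Assumption: values are start, start+step, ..., with
// step = (end - start) / length, so `end` itself is never produced.
std::vector<float> make_iter(float start, float end, std::size_t length) {
    assert(length > 0);
    std::vector<float> out(length);
    const float step = (end - start) / static_cast<float>(length);
    for (std::size_t i = 0; i < length; ++i) {
        out[i] = start + step * static_cast<float>(i);
    }
    return out;
}
```

On a GPU the values would typically come from interpolated texture coordinates rather than a stored array, so the iterator costs no memory bandwidth.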

6 Brook language
stream operators
–can be implemented with iterator and gather streams
–streamRepeat & streamStride gpu-optimized
kernel arguments of different shape
    kernel void foo (float a<>, float b<>, out float result<>);
    float a;
    float b;
    float c;
    foo(a,b,c);
equivalent to:
    foo(a[0], b[0], c[0])
    foo(a[2], b[0], c[1])
    foo(a[4], b[1], c[2])
    foo(a[6], b[1], c[3])
    foo(a[8], b[2], c[4])
    foo(a[10], b[2], c[5])
    foo(a[12], b[3], c[6])
    foo(a[14], b[3], c[7])
    foo(a[16], b[4], c[8])
    foo(a[18], b[4], c[9])
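The index pattern in the call list suggests that a longer input is strided and a shorter input is repeated so that each lines up with the output stream. The C++ sketch below reproduces that pattern under the assumption that `a`, `b`, and `c` have lengths 20, 5, and 10; those lengths are inferred from the indices shown and are not stated in the transcript. `source_index` is a hypothetical helper, and it only handles lengths that divide evenly.

```cpp
#include <cassert>
#include <cstddef>

// Map output position i to the input element a kernel instance reads.
// Longer inputs are strided; shorter inputs are repeated.
// Assumption: one length is always an exact multiple of the other.
std::size_t source_index(std::size_t i, std::size_t in_len,
                         std::size_t out_len) {
    assert(in_len > 0 && out_len > 0 && i < out_len);
    if (in_len >= out_len) {
        assert(in_len % out_len == 0);
        return i * (in_len / out_len);   // stride: skip elements
    }
    assert(out_len % in_len == 0);
    return i / (out_len / in_len);       // repeat: reuse elements
}
```

With the assumed lengths, output position 3 reads `a[6]` and `b[1]`, matching the fourth call in the list.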

7 system outline
.br   Brook source files
brcc  source to source compiler
brt   Brook run-time library

8 brcc compiler
infrastructure based on ctool
–http://ctool.sourceforge.net
leverage vendor shader compilers
–Microsoft: fxc
–NVIDIA: cgc
converting kernels into shaders
–stream fetch and store
–gather operations
–register mapping
–stub function

9 brcc compiler
kernel compilation
    kernel void updatepos (float2 pos<>, float2 vel[100][100],
                           float timestep, out float2 newpos<>) {
        newpos = pos + vel[pos]*timestep;
    }

    float4 main (uniform float4 _workspace : register (c0),
                 uniform sampler _tex_pos : register (s0),
                 float2 _tex_pos_pos : TEXCOORD0,
                 uniform sampler vel : register (s1),
                 uniform float4 vel_scalebias : register (c1),
                 uniform float timestep : register (c2)) : COLOR0 {
        float4 _OUT;
        float2 pos;
        float2 newpos;
        pos = tex2D(_tex_pos, _tex_pos_pos).xy;
        newpos = pos + tex2D(vel,(pos).xy*vel_scalebias.xy+vel_scalebias.zw).xy * timestep;
        _OUT.x = newpos.x;
        _OUT.y = newpos.y;
        _OUT.z = newpos.y;
        _OUT.w = newpos.y;
        return _OUT;
    }
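In the generated shader, the gather `vel[pos]` becomes a texture fetch whose coordinates are `pos * vel_scalebias.xy + vel_scalebias.zw`. The C++ sketch below shows one plausible choice of scale and bias that maps an integer element index to the center of its texel in normalized coordinates. The specific values (`1/size` and `0.5/size`) are an assumption; the slides do not say what the runtime actually loads into `vel_scalebias`. All type and function names here are hypothetical.

```cpp
#include <cassert>

struct Float2 { float x, y; };

// Mirrors the four components of vel_scalebias: xy = scale, zw = bias.
struct ScaleBias { float scale_x, scale_y, bias_x, bias_y; };

// Assumed convention: index i in a dimension of size n maps to (i + 0.5) / n,
// the center of texel i in normalized [0, 1] texture coordinates.
ScaleBias make_scale_bias(int width, int height) {
    assert(width > 0 && height > 0);
    const float w = static_cast<float>(width);
    const float h = static_cast<float>(height);
    return ScaleBias{1.0f / w, 1.0f / h, 0.5f / w, 0.5f / h};
}

// Same arithmetic as the shader: index * scale + bias.
Float2 to_texcoord(Float2 index, const ScaleBias& sb) {
    return Float2{index.x * sb.scale_x + sb.bias_x,
                  index.y * sb.scale_y + sb.bias_y};
}
```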

10 brcc compiler
kernel compilation
    static const char __updatepos_ps20[] = "ps_2_0.....
    static const char __updatepos_fp30[] = "!!fp30.....

    void updatepos (const __BRTStream& pos,
                    const __BRTStream& vel,
                    const float timestep,
                    const __BRTStream& newpos) {
        static const void *__updatepos_fp[] = {
            "fp30", __updatepos_fp30,
            "ps20", __updatepos_ps20,
            "cpu", (void *) __updatepos_cpu,
            "combine", 0,
            NULL, NULL };
        static __BRTKernel k(__updatepos_fp);
        k->PushStream(pos);
        k->PushGatherStream(vel);
        k->PushConstant(timestep);
        k->PushOutput(newpos);
        k->Map();
    }

11 brt runtime
kernels
kernel execution
–issue screen size quad
–texture coords provide stream positions
reductions
–multi-pass method
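A multi-pass reduction shrinks the stream on every pass until a single value remains. The C++ sketch below shows the general idea for a sum, halving the stream each pass. The factor of two and the handling of odd lengths are assumptions made for illustration; the slides do not describe how the Brook runtime structures its passes. `reduce_pass` and `reduce_sum` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One pass: each output element sums up to two adjacent input elements.
// On a GPU, each output element would be one fragment of a smaller quad.
std::vector<float> reduce_pass(const std::vector<float>& in) {
    std::vector<float> out((in.size() + 1) / 2);
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] = in[2 * i];
        if (2 * i + 1 < in.size()) {
            out[i] += in[2 * i + 1];   // odd tail element is passed through
        }
    }
    return out;
}

// Repeat passes until one element is left; optionally report the pass count.
float reduce_sum(std::vector<float> data, int* passes = nullptr) {
    assert(!data.empty());
    int count = 0;
    while (data.size() > 1) {
        data = reduce_pass(data);
        ++count;
    }
    if (passes != nullptr) {
        *passes = count;
    }
    return data[0];
}
```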

12 applications
ray-tracer
fft
segmentation
linear algebra:
–BLAS: SAXPY & SGEMV
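Of the applications listed, SAXPY maps most directly onto the stream model: it computes y ← αx + y, with every element independent of the others. The C++ sketch below is a plain CPU reference for that operation, not the Brook implementation used for the benchmarks.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Single-precision a*x plus y, updating y in place.
// Each element is independent, so the loop body could serve as a kernel
// with x and y as input streams and alpha as a constant argument.
void saxpy(float alpha, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = alpha * x[i] + y[i];
    }
}
```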

13 Brook performance
First generation GPUs
–ATI Radeon 9800 XT
–NVIDIA GeForceFX
compared against:
–Intel Math Library
–Atlas Math Library
–Cached-block segmentation
–FFTW
–SSE-opt Ray Triangle code

14 Brook performance
Second generation GPUs
–ATI Radeon X800 XT
–NVIDIA GeForce 6800
compared against:
–Intel Math Library
–Atlas Math Library
–Cached-block segmentation
–FFTW
–SSE-opt Ray Triangle code

15 streaming ray tracing

16 streaming ray tracing
Ray-Triangle rate
CPU: 100M RT/sec
Brook: 186M RT/sec

17 streaming ray tracing
Ray Triangle Intersection
    kernel void krnIntersectTriangle(Ray ray<>, Triangle tris[],
                                     RayState oldraystate<>,
                                     GridTrilist trilist[],
                                     out Hit candidatehit<>) {
        float idx, det, inv_det;
        float3 edge1, edge2, pvec, tvec, qvec;
        if(oldraystate.state.y > 0) {
            idx = trilist[oldraystate.state.w].trinum;
            edge1 = tris[idx].v1 - tris[idx].v0;
            edge2 = tris[idx].v2 - tris[idx].v0;
            pvec = cross(ray.d, edge2);
            det = dot(edge1, pvec);
            inv_det = 1.0f/det;
            tvec = ray.o - tris[idx].v0;
            candidatehit.data.y = dot( tvec, pvec ) * inv_det;
            qvec = cross( tvec, edge1 );
            candidatehit.data.z = dot( ray.d, qvec ) * inv_det;
            candidatehit.data.x = dot( edge2, qvec ) * inv_det;
            candidatehit.data.w = idx;
        } else {
            candidatehit.data = float4(0,0,0,-1);
        }
    }
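The arithmetic in this kernel can be checked on the CPU with a direct C++ port. In the sketch below, `t` corresponds to `candidatehit.data.x`, `u` to `data.y`, and `v` to `data.z`. The `Vec3` and `HitTUV` types and the `inside_triangle` test are hypothetical additions for illustration; the slide's kernel only records candidate values and does not show where they are validated.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

inline Vec3 sub(Vec3 a, Vec3 b) { return Vec3{a.x - b.x, a.y - b.y, a.z - b.z}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 cross(Vec3 a, Vec3 b) {
    return Vec3{a.y * b.z - a.z * b.y,
                a.z * b.x - a.x * b.z,
                a.x * b.y - a.y * b.x};
}

// t: distance along the ray; u, v: barycentric coordinates of the hit point.
struct HitTUV { float t, u, v; };

// Same sequence of operations as the kernel's if-branch.
// Precondition: the ray is not parallel to the triangle plane (det != 0).
HitTUV intersect(Vec3 origin, Vec3 dir, Vec3 v0, Vec3 v1, Vec3 v2) {
    const Vec3 edge1 = sub(v1, v0);
    const Vec3 edge2 = sub(v2, v0);
    const Vec3 pvec = cross(dir, edge2);
    const float det = dot(edge1, pvec);
    assert(det != 0.0f);
    const float inv_det = 1.0f / det;
    const Vec3 tvec = sub(origin, v0);
    const float u = dot(tvec, pvec) * inv_det;
    const Vec3 qvec = cross(tvec, edge1);
    const float v = dot(dir, qvec) * inv_det;
    const float t = dot(edge2, qvec) * inv_det;
    return HitTUV{t, u, v};
}

// Hypothetical validation step: the hit lies inside the triangle and in
// front of the ray origin.
inline bool inside_triangle(const HitTUV& h) {
    return h.t > 0.0f && h.u >= 0.0f && h.v >= 0.0f && (h.u + h.v) <= 1.0f;
}
```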

18 in the news…
open source
–http://brook.sourceforge.net
–http://sourceforge.net/projects/brook
over 5,600 downloads in 5 months
163K page hits
GPGPU SIGGRAPH Course
in the news
–IEEE Computer
–http://slashdot.org
–http://opengl.org
–http://gpgpu.org

19 current development
Optimizations for second-gen gpus
–efficient predication
   gpu feature: early z kill
   needed for fast ray walking
–scatterOp: a[i] += p
   gpu feature: vertex textures
   build and manipulate data structures
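The scatterOp notation `a[i] += p` describes a write to a computed address, combined with whatever value is already there. The C++ sketch below shows the intended semantics on the CPU as a sequential loop, so repeated indices simply accumulate. It says nothing about how the vertex-texture implementation mentioned on the slide works, and `scatter_add` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// For every input element k, add p[k] into a[index[k]].
// Several inputs may target the same slot; their contributions add up.
void scatter_add(std::vector<float>& a,
                 const std::vector<std::size_t>& index,
                 const std::vector<float>& p) {
    assert(index.size() == p.size());
    for (std::size_t k = 0; k < index.size(); ++k) {
        assert(index[k] < a.size());
        a[index[k]] += p[k];
    }
}
```

This is the inverse of a gather: the kernel chooses where to write rather than where to read, which is why it is useful for building data structures.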

20 graphics bindings
multi-tasking the gpu
–brook and graphics rendering
stream data
–API for binding stream textures
    // Create streams
    fluidStream0 = stream::create ( kFluidSize, kFluidSize );
    normalStream = stream::create ( kFluidSize, kFluidSize );
    // Get stream texture
    normalTexture = (IDirect3DTexture9*) normalStream->getIndexedFieldRenderData(0);
    // call kernel
    clearKernel( fluidStream0 );

