Many-Core Programming with GRAMPS Jeremy Sugerman Kayvon Fatahalian Solomon Boulos Kurt Akeley Pat Hanrahan.


1 Many-Core Programming with GRAMPS
Jeremy Sugerman, Kayvon Fatahalian, Solomon Boulos, Kurt Akeley, Pat Hanrahan

2 Problem Statement
- Facilitate efficient development and execution in many-/multi-core commodity systems.
- Homogeneous or heterogeneous cores.
Status Quo:
- GPUs: Easy to write GL/D3D and run it fast, hard to express anything else
- CPUs: Possible (not easy) to write anything, possible (hard) to run it fast

3 GRAMPS Background
- Resembles a GPU with a software-constructed pipeline.
- Not (too) radical even in a pure graphics context
- A similar story played out when shading went from fixed-function to programmable
- Now the pipeline topology is under analogous pressures: proliferation of stages and options
- And graphics is more than a GL/D3D pipeline…
- And throughput / many-core is more than graphics…

4 GRAMPS Programming Model
- Software constructs the pipeline (actually a graph)
- Exposes threads, shaders, fixed-function stages
  - Coprocessors exposed via ISA
- Exposes FIFOs / Queues connecting stages
- Also enables software push / re-sorting
- Exposes Buffers for memory access
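To make the "software constructs the graph" idea concrete, here is a minimal sketch of stages connected by queues. Every name in it (Queue, Stage, Graph, add_stage, connect) is hypothetical and chosen for illustration; it is not the GRAMPS API. The stage and queue names are borrowed from the rasterization example on slide 7, and the stage kinds assigned to them are assumptions.

```python
from collections import deque
from dataclasses import dataclass, field

# Hypothetical types for illustration only; not the GRAMPS API.

@dataclass
class Queue:
    name: str
    items: deque = field(default_factory=deque)

@dataclass
class Stage:
    name: str
    kind: str  # "thread", "shader", or "fixed-function"
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)

class Graph:
    """A pipeline described as data: stages plus the queues that connect them."""

    def __init__(self):
        self.stages = {}
        self.queues = {}

    def add_stage(self, name, kind):
        self.stages[name] = Stage(name, kind)
        return self.stages[name]

    def connect(self, producer, consumer, queue_name):
        # One queue object is shared by the producer's outputs
        # and the consumer's inputs.
        q = Queue(queue_name)
        self.queues[queue_name] = q
        self.stages[producer].outputs.append(q)
        self.stages[consumer].inputs.append(q)
        return q

def build_raster_graph():
    g = Graph()
    # Stage kinds below are assumed, not taken from the slides.
    g.add_stage("Rast", "fixed-function")
    g.add_stage("Shade", "shader")
    g.add_stage("FB Blend", "fixed-function")
    g.connect("Rast", "Shade", "Input Fragment Queue")
    g.connect("Shade", "FB Blend", "Output Fragment Queue")
    return g
```

Because the topology is ordinary data built at setup time, adding a stage or rerouting a queue is a change to the application rather than to the underlying system.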

5 GRAMPS’ Place
- Compared to GPU Pipeline: More things possible (and moderately easy), still (mostly) runs fast, less hardware independent
- Compared to CPU: Easier to write things, easier to run them well, some loss of expressivity and flexibility
- Still a role for a ‘graphics pipeline’. It’s an app!
- GRAMPS is a layer, model for state machines.

6 GRAMPS and Streaming
- From some angles, GRAMPS sounds a lot like Stream Processing / Computing
- Distinctions are most visible in the target traits.
- Streaming expects predictable data creation, flow, and consumption. Intensive offline / compile-time optimization and pre-scheduling.
- GRAMPS expects dynamic data-dependent execution, (and thus) run-time scheduling
- Also, GRAMPS assumes commodity and heterogeneity.
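The contrast with pre-scheduled streaming can be sketched with a toy run-time scheduler. This is an illustration under stated assumptions, not GRAMPS' scheduler: the function run_dynamic, the tuple layout for stages, and the "prefer the most downstream ready stage" policy are all hypothetical. The point is that one stage emits a data-dependent number of outputs, so which stage runs next can only be decided by inspecting the queues at run time.

```python
from collections import deque

def run_dynamic(stages):
    """Run until every queue is empty.

    stages: list of (in_queue, fn, out_queue) in graph order, where
    fn(item) returns a list of zero or more outputs and out_queue is
    None for the final stage. Returns the final stage's outputs.
    """
    results = []
    while True:
        # Run-time decision: pick the most downstream stage that has input,
        # which tends to drain queues before producing more work.
        ready = [s for s in reversed(stages) if s[0]]
        if not ready:
            break
        in_q, fn, out_q = ready[0]
        produced = fn(in_q.popleft())
        (results if out_q is None else out_q).extend(produced)
    return results

def demo():
    q0, q1 = deque([3, 0, 2]), deque()
    expand = lambda n: [n] * n   # data-dependent: emits n outputs, possibly none
    square = lambda x: [x * x]
    return run_dynamic([(q0, expand, q1), (q1, square, None)])
```

Calling demo() returns [9, 9, 9, 4, 4]: the input 3 expands to three items, 0 produces nothing, and 2 expands to two. A static schedule would have to know those counts in advance.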

7 GRAMPS Examples
[Figure: two example graphs.
Rasterization Pipeline: stages Rast, Shade, FB Blend; queues Input Fragment Queue, Output Fragment Queue; Frame Buffer.
Ray Tracing Pipeline: stages Camera, Intersect, Shade, FB Blend; queues Ray Queue, Sample Queue, Pixel Queue; Frame Buffer.]

8 GRAMPS Overview
- Concepts:
  - Graphs
  - Stages: thread, shader, fixed-function
  - Queues: ordered, unordered, sets (exclusion)
  - Buffers
- Components:
  - APIs: setup/driver, thread, shader
  - Scheduler: fat core, shader core, top-level

9 What We’ve Built
- Three rendering pipelines: Direct3D, Packet Tracer, D3D + Push (Hybrid)
- Simulator and Runtime for two machines:
  - GPU-like: Many threads per core, hw sched
  - CPU-like: Few threads per core, sw sched

10 Rendering Pipelines
[Figure: two GRAMPS graphs.
Direct3D Pipeline (with Ray-tracing Extension): stages IA 1 … IA N, VS 1 … VS N, RO, Rast, PS, OM, plus Trace and PS2 in the Ray-tracing Extension; queues Input Vertex Queue 1 … N, Primitive Queue 1 … N, Primitive Queue, Sample Queue Set, Fragment Queue, Ray Queue, Ray Hit Queue; Vertex Buffers; Frame Buffer.
Ray-tracing Pipeline: stages Tiler, Sampler, Camera, Intersect, Shade, FB Blend; queues Tile Queue, Sample Queue, Ray Queue, Ray Hit Queue, Fragment Queue; Frame Buffer.
Legend: Thread Stage, Shader Stage, Fixed-func Stage, Queue, Stage Output, Output via Push.]

11 Initial Results
- Measured thread occupancy, worst-case total queue memory.
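The slide does not say how the two metrics were computed. As a sketch under assumptions, worst-case total queue memory can be read as a high-water mark over all queues, and thread occupancy as the fraction of available thread-slot cycles that were busy. Both definitions, and the names QueueMemoryTracker and thread_occupancy, are hypothetical and not taken from the GRAMPS simulator.

```python
class QueueMemoryTracker:
    """Tracks bytes currently held across all queues and the peak seen so far."""

    def __init__(self):
        self.current = 0
        self.peak = 0

    def push(self, nbytes):
        # Called whenever any stage enqueues nbytes of data.
        self.current += nbytes
        self.peak = max(self.peak, self.current)

    def pop(self, nbytes):
        # Called whenever any stage dequeues nbytes of data.
        self.current -= nbytes

def thread_occupancy(busy_slot_cycles, total_slots, total_cycles):
    # Assumed definition: busy thread-slot cycles over all available ones.
    return busy_slot_cycles / (total_slots * total_cycles)
```

A simulator would call push and pop from its queue operations and report peak at the end of the run; the peak matters because it bounds how much storage the queues need in the worst case.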

12 GRAMPS Vis

13 High-level Challenges
- Is GRAMPS a suitable GPU evolution?
  - Enable pipeline competitive with bare metal?
  - Enable innovation: advanced / alternative methods?
  - Is there a ‘best’ graphics pipeline on top?
- Is GRAMPS a good parallel compute model?
  - Map well to hardware, hardware trends?
  - Support important apps?
  - Concepts influence developers?

14 What’s Next?
- Low level implementation: scheduling, more accurate simulation.
- More apps: REYES, physics, likely more.
- Audit and refine model: graph modification / state change, fork-join / blocking calls, locks / barriers / synchronization primitives intra- or inter-stage
- Prototype, explore next generation graphics pipelines.

