ASH: A Substrate for Scalable Architectures. Mihai Budiu, Seth Copen Goldstein. CALCM Seminar, March 19, 2002.

2 ASH: A Substrate for Scalable Architectures. Mihai Budiu, Seth Copen Goldstein. http://www.cs.cmu.edu/~phoenix CALCM Seminar, March 19, 2002

3 Resources

4 CPU Problems
  Complexity
  Power
  Global signals
  Limited issue window => limited ILP
  We propose an architecture with none of these limits.

5 Outline
  Scalability
  Reconfigurable hardware advantages
  A hybrid RH + CPU architecture
  CPU and RH as peers
  Application Specific Hardware

6 Computational Bandwidth
  CPU: FU * clock freq
  RH: unbounded
  [Figure labels: operators *, +, /; statements a=a+b, b=b+c]

7 Registers
  CPU: fixed (eax, ebx, ecx, edx); variables i, j, k, l, m; spill to sp[0]
  RH: unbounded

8 Register Bandwidth
  CPU: fixed (R1, R2, R3, W1, W2)
  RH: unbounded

9 Out-of-Order Execution
  CPU: Fetch, Decode, Dispatch, Execute, Commit; in-order; limited by window
  RH: compiler’s window is unbounded

10 Outline
  Scalability
  Reconfigurable hardware advantages
  A hybrid RH + CPU architecture
  CPU and RH as peers
  Application Specific Hardware

11 Hybrid System: CPU + RH
  CPU: low ILP + OS + VM; generic
  RH: high ILP; application-specific
  Memory
  Tight coupling

12 Problem
  [Figure labels: HLL Program, Compiler, CPU, RH, Memory]

13 Our Solution
  General: applicable to today’s software
  Automatic: compiler-driven [RISC approach]
  Scalable: with clock, hardware and program size
  Parallelism: exploit application parallelism
    – bit-level
    – ILP
    – pipeline
    – loop-level

14 Outline
  Scalability
  Reconfigurable hardware advantages
  A hybrid RH + CPU architecture
  CPU and RH as peers
  Application Specific Hardware

15 Peering
  Program:
    a( ) { b( ); }
    b( ) { c( ); }
    c( ) { d( ); }
    d( ) { }
  [Figure labels: CPU, RH; procedures a, b, c, d]

16 marshalling, control transfer; software procedure call; hardware dependent; RH “RPC”
  [Figure labels: CPU; procedures a, b, c, d; stubs b’, c’, d’]
  Stubs built automatically.

17 Stub Synthesis
  [Figure labels: Procedures for RH, RH Compiler, Procedures for CPU, Program, Partitioning, Stubs, Configuration, Linker, Executable]

18 Outline
  Scalability
  Reconfigurable hardware advantages
  A hybrid RH + CPU architecture
  CPU and RH as peers
  Application Specific Hardware

19 Application-Specific Hardware
  Reconfigurable hardware
  [Figure labels: HLL program, Compiler, Circuit; HLL Program, CPU, RH, Memory, Compiler]

20 CASH: Compiling for ASH
  [Figure labels: C Program, RH, Circuits, Memory partitioning, Interconnection net]

21 Asynchronous Computation
  [Figure: a + node with data, ready and ack signals]
  Can extend to locally synchronous, globally asynchronous
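The data/ready/ack signalling on the adder can be mimicked in software. The following is only a minimal sketch under stated assumptions: the `Wire` struct and the functions `wire_put` and `adder_fire` are hypothetical names invented for illustration, and the model collapses the ack into clearing the producer's ready flag. It is not the circuit the authors built.

```c
#include <assert.h>
#include <stdbool.h>

/* A wire carrying one value plus a "data ready" flag (hypothetical model). */
typedef struct {
    int  value;
    bool ready;
} Wire;

/* Producer side: place a value on the wire and raise "data ready".
   The previous value must already have been acknowledged. */
void wire_put(Wire *w, int v) {
    assert(!w->ready);
    w->value = v;
    w->ready = true;
}

/* Adder node: fires only when both inputs are ready and the output is free.
   Firing raises ready on the output and clears ready on both inputs,
   which plays the role of the ack. Returns true if the node fired. */
bool adder_fire(Wire *a, Wire *b, Wire *out) {
    if (!a->ready || !b->ready || out->ready)
        return false;
    out->value = a->value + b->value;
    out->ready = true;
    a->ready = false;  /* ack to the producer of a */
    b->ready = false;  /* ack to the producer of b */
    return true;
}
```

Calling `adder_fire` with only one input ready does nothing; once both inputs have been put, a single call produces the sum and frees the inputs for the next values. No clock appears anywhere in the model, which is the point of the slide.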

22 Dataflow Graphs
    int plus(int x, int y) { return x + y; }

23 From Control Flow to Data Flow

24 From Control Flow to Data Flow

25 From Control Flow to Data Flow

26 Conditionals = Speculation
    int cond(int p, int x, int y) {
        int z;
        if (p) z = x;
        else z = y;
        return z;
    }
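One way to read "Conditionals = Speculation" is that both sides of the branch are evaluated unconditionally and a multiplexer picks the result. The sketch below rewrites `cond` in that shape; `mux` and `cond_speculative` are illustrative names, not taken from the talk, and the code only models the idea in software.

```c
#include <assert.h>

/* 2-to-1 multiplexer: the hardware analogue of the join after an if/else.
   sel must already be normalized to 0 or 1. */
int mux(int sel, int if_true, int if_false) {
    assert(sel == 0 || sel == 1);
    return sel ? if_true : if_false;
}

/* Same result as cond(p, x, y), but with no branch around the two sides:
   both candidate values exist before the predicate is consulted. */
int cond_speculative(int p, int x, int y) {
    int z_then = x;   /* "then" side, evaluated unconditionally */
    int z_else = y;   /* "else" side, evaluated unconditionally */
    return mux(p != 0, z_then, z_else);
}
```

This transformation is only safe when both sides are free of side effects, as they are here.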

27 Critical Paths
    if (x > 0) y = -x;
    else y = b*x;
  [Figure: dataflow graph with labels *, x, b, 0, y, !, -, >]

28 Executing Lenient Operators
    if (x > 0) y = -x;
    else y = b*x;
  [Figure: dataflow graph with labels *, x, b, 0, y, !, -, >]
  Up to 40% performance improvement.
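The effect of a lenient operator on the critical path can be illustrated with a small timing model. This sketch assumes that "lenient" means the selecting operator may emit its output as soon as the predicate and the selected input have arrived, without waiting for the other input. The function names and the arrival times used in the usage note are hypothetical, chosen only to make the difference visible.

```c
#include <assert.h>

int max2(int a, int b) { return a > b ? a : b; }

/* Strict mux: output time is bounded by the predicate and BOTH data inputs. */
int strict_mux_ready(int t_pred, int t_then, int t_else) {
    return max2(t_pred, max2(t_then, t_else));
}

/* Lenient mux: output time is bounded by the predicate and only the
   SELECTED data input. pred is the predicate's value (0 or 1). */
int lenient_mux_ready(int pred, int t_pred, int t_then, int t_else) {
    assert(pred == 0 || pred == 1);
    return max2(t_pred, pred ? t_then : t_else);
}
```

With made-up arrival times of 1 for the comparison `x > 0`, 1 for the negation `-x`, and 3 for the multiplication `b*x`, the strict version always waits until time 3. The lenient version finishes at time 1 whenever `x > 0` holds, because the slow multiply is off the path actually taken.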

29 Pipelining
  Pipelined  Cycles
  N          903
  Y          653

30 Loop Pipelining
  Pipe  FIFO  Cycles
  N     0     903
  N     1
  Y     0     653
  Y     1     474
  Y     2     408
  Y     3

31 Loop Pipelining
  Pipe  FIFO  Cycles
  N     0     903
  N     1
  Y     0     653
  Y     1     474
  Y     2     408
  Y     3
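The cycle counts on these slides can be turned into speedups relative to the non-pipelined run. The helper below is a small sketch; it assumes the flattened table reads as 903 cycles without pipelining and 653, 474 and 408 cycles with pipelining at FIFO depths 0, 1 and 2. The function name `speedup` is an illustrative choice.

```c
#include <assert.h>

/* Speedup of a configuration relative to a baseline, both given in cycles. */
double speedup(int baseline_cycles, int cycles) {
    assert(baseline_cycles > 0 && cycles > 0);
    return (double)baseline_cycles / (double)cycles;
}
```

Under that reading, `speedup(903, 653)` is about 1.38, `speedup(903, 474)` about 1.91, and `speedup(903, 408)` about 2.21.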

32 ASH Features
  What you code is what you get
    – no hidden control logic
    – really lean hardware (no CAM, decoders, multiported files, etc.)
  Compiler has complete control
  Dynamic scheduling => latency tolerant
  Naturally exploits ILP, even across loop iterations

33 Conclusions
  ASH = compiler-synthesized hardware
  ASH matches program parallelism
  Dynamically scheduled RH
  ASH scales with
    – clock frequency
    – transistors
    – program size

34 Backup Slides

35 Reconfigurable Hardware
  Universal gates and/or storage elements
  Interconnection network
  Programmable switches

36 Main RH Ingredient: RAM Cell
  Universal gate = RAM
  Switch controlled by a 1-bit RAM cell
  [Figure labels: 00010001; a0, a1 (twice); data; a1 & a2; 0; data in; control]
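"Universal gate = RAM" can be made concrete with a lookup table: the inputs form an address, and the stored bit at that address is the gate's output. The sketch below models a 2-input gate as four 1-bit cells packed into an integer. The name `lut2`, the bit ordering, and the `LUT_*` constants are assumptions made for illustration, not details from the slide.

```c
#include <assert.h>
#include <stdint.h>

/* 2-input lookup table: four 1-bit RAM cells packed into the low bits of
   `cells`. Bit i holds the output for address i = (a1 << 1) | a0. */
int lut2(uint8_t cells, int a0, int a1) {
    assert((a0 == 0 || a0 == 1) && (a1 == 0 || a1 == 1));
    int addr = (a1 << 1) | a0;
    return (cells >> addr) & 1;
}

/* Cell contents that configure the same hardware as different gates. */
enum {
    LUT_AND = 0x8,  /* 1 only at address 3 (a0 = 1, a1 = 1) */
    LUT_OR  = 0xE,  /* 1 at addresses 1, 2, 3 */
    LUT_XOR = 0x6   /* 1 at addresses 1, 2 */
};
```

Changing the gate means rewriting the cell contents, not the circuit: `lut2(LUT_AND, 1, 1)` and `lut2(LUT_XOR, 1, 1)` run the same code path and differ only in the stored bits.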

37 Stubs
    a( ) { r = b(b_args); }
    b(b_args) { }

    a( ) { r = b’(b_args); }
    b’(b_args) {
        send_rh(b_args);
        invoke_rh(b);
        r = receive_rh( );
        return r;
    }
  [Figure labels: RH, Program]

38 Dispatcher Stubs
  Program:
    a( ) { r = b(b_args); }
    b(b_args) { if (x) c( ); return r; }
    c( ) { }
  Stub:
    b’(b_args) {
        send_rh(b_args);
        invoke_rh(b);
        while (1) {
            com = get_rh_command( );
            if (! com) break;
            (*com)( );
        }
        r = receive_rh( );
        return r;
    }
  [Annotations: “Independent of b”; “c’s stub”]
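The dispatcher loop can be exercised in plain C by replacing the reconfigurable hardware with a mock. Everything on the "RH side" below is a stand-in invented for this sketch: the `pending` queue, `invoke_rh_b`, the single `int` argument, and the rule that the result is the argument plus one. Only the shape of `b_stub` follows the slide.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*Command)(void);

/* ---- Mock RH side (hypothetical; stands in for real hardware) ---- */
Command pending[4];      /* callbacks the RH requests, NULL-terminated */
int next_cmd;
int rh_arg, rh_result;
int c_calls;             /* how often c's stub ran on the CPU */

void c_stub(void) { c_calls++; }

void send_rh(int arg) { rh_arg = arg; }

/* Simulated b on the RH: if (x) c(); return r;  Here x is the argument. */
void invoke_rh_b(void) {
    next_cmd = 0;
    pending[0] = rh_arg ? c_stub : NULL;
    pending[1] = NULL;
    rh_result = rh_arg + 1;
}

Command get_rh_command(void) {
    assert(next_cmd < 4);
    return pending[next_cmd++];
}

int receive_rh(void) { return rh_result; }

/* ---- Dispatcher stub b', following the structure on the slide ---- */
int b_stub(int b_arg) {
    send_rh(b_arg);
    invoke_rh_b();
    for (;;) {
        Command com = get_rh_command();
        if (!com) break;   /* RH has no more callbacks: b is done */
        (*com)();          /* run the requested CPU-side stub */
    }
    return receive_rh();
}
```

The loop never names `c`: it runs whatever function pointer the RH hands back, so the same dispatcher body would serve any procedure placed on the RH.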

39 C’s Stub
  Program:
    a( ) { r = b(b_args); }
    b(b_args) { if (x) c( ); return r; }
    c( ) { }
  Stub:
    c’( ) {
        receive_rh(c_args);
        r = c(c_args);
        send_rh(r);
        invoke_rh(return_to_rh);
    }

40 Input to Output
    int io(int x) { return x; }

41 Loops
    int loop() {
        int w = 10;
        while (w > 0) w--;
        return w;
    }

42 Pointers and Arrays
    int a[10];
    void pointer(int *p) { a[2] += a[4] + *p; }

43 Pointers and Loops
    int sum() {
        int s = 0;
        int i;
        for (i=0; i < 10; i++) s += a[i];
        return s;
    }

