Peer-to-peer Hardware-Software Interfaces for Reconfigurable Fabrics. Mihai Budiu, Mahim Mishra, Ashwin Bharambe, Seth Copen Goldstein. Carnegie Mellon University.


1 Peer-to-peer Hardware-Software Interfaces for Reconfigurable Fabrics. Mihai Budiu, Mahim Mishra, Ashwin Bharambe, Seth Copen Goldstein. Carnegie Mellon University

2 Peer-to-peer hw/sw interfaces. Reconfigurable Hardware: Resources Galore. [Figure labels: Cache, Logic; 2002, 2007]

3 Why RH: Computational Bandwidth. [Figure labels: CPU, fixed; RH, "unbounded"]

4 Using RH Today. [Diagram labels: Application, Partition, C Program, Compiler, HDL, CAD, OS support, communication]

5 Computer System Tomorrow. Tight coupling. [Diagram labels: CPU, RH, Memory; high-ILP computation; low-ILP computation + OS + VM]

6 This Work. We suggest a high-level mechanism (not a policy). [Diagram labels: HLL Program, Partitioning, cc, CAD, CPU, RH, Memory]

7 Outline: Motivation; Interfacing RH & CPU; Opportunities; Conclusions

8 Premises. RH is large: it can implement large program fragments. RH can access memory: it does not require CPU support to access data, and it has a coherent memory view with the CPU. RH is seen through a clean abstraction: interface portability.

9 Unit of Partitioning: Procedure. [Figure: program call graph; labels: library, leaves, recursive, hot spot, high ILP]

10 Production-Quality Software. int foo(….) { highly parallel computation; …. if (!r) { fprintf(stderr, "Unexpected input"); return E_BADIN; } …. }

11 Peering. Program: a( ) { b( ); } b( ) { c( ); } c( ) { d( ); } d( ) { } [Diagram labels: CPU, RH; procedures a, b, c, d]

12 Stubs: marshalling, control transfer. [Diagram labels: CPU, RH; procedures a, b, c, d; stubs b', c', d'; software procedure call; hardware dependent; "RPC"]

13 Stubs. Program: a( ) { r = b(b_args); } b(b_args) { } With b on RH, the CPU runs: a( ) { r = b'(b_args); } b'(b_args) { send_rh(b_args); invoke_rh(b); r = receive_rh( ); return r; }
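
The stub on this slide can be sketched in C as follows. This is a minimal software model, not the authors' implementation: the one-word mailbox, the `rh_proc` type, the body of `b_on_rh`, and the name `b_stub` (standing in for b') are all assumptions made for illustration. Only the call sequence `send_rh`, `invoke_rh`, `receive_rh` comes from the slide.

```c
#include <assert.h>

/* Hypothetical model of the CPU<->RH channel: a one-word mailbox. */
typedef int (*rh_proc)(int);

static int rh_mailbox;        /* holds the arguments, then the result */
static int rh_has_data = 0;   /* 1 while the mailbox holds valid data */

/* Stand-in for procedure b as it would be synthesized onto the RH.
   The body is arbitrary; it only needs to be observable. */
static int b_on_rh(int x) { return 2 * x + 1; }

/* Marshal the arguments toward the RH. */
static void send_rh(int args) { rh_mailbox = args; rh_has_data = 1; }

/* Transfer control to the RH procedure; in this model it runs to completion. */
static void invoke_rh(rh_proc p) {
    assert(rh_has_data);
    rh_mailbox = p(rh_mailbox);
}

/* Unmarshal the result coming back from the RH. */
static int receive_rh(void) {
    assert(rh_has_data);
    rh_has_data = 0;
    return rh_mailbox;
}

/* Stub b': same signature as b, so the caller does not change. */
int b_stub(int b_args) {
    send_rh(b_args);
    invoke_rh(b_on_rh);
    return receive_rh();
}

/* Caller a: an ordinary software procedure call into the stub. */
int a(int x) {
    int r = b_stub(x);
    return r;
}
```

Because the stub keeps b's signature, `a` is compiled as ordinary software and never sees the hardware-dependent part of the call.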

14 Required Stubs: 1 stub to call each RH procedure; 1 stub for each procedure called by RH. [Diagram labels: CPU, RH]
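
The two counting rules above can be sketched as a small C function. The `partition` struct, the adjacency-matrix encoding of the call graph, and the example placement (a and c on the CPU, b on the RH) are assumptions for illustration; the second rule is read here as "each CPU procedure that some RH procedure calls".

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PROCS 8

/* Hypothetical description of a partitioned program. */
typedef struct {
    int  n;                             /* number of procedures */
    bool on_rh[MAX_PROCS];              /* on_rh[i]: procedure i is placed on RH */
    bool calls[MAX_PROCS][MAX_PROCS];   /* calls[i][j]: procedure i calls j */
} partition;

/* One stub per RH procedure, plus one stub per CPU procedure
   that is called by at least one RH procedure. */
int count_stubs(const partition *p) {
    assert(p && p->n >= 0 && p->n <= MAX_PROCS);
    int stubs = 0;
    for (int j = 0; j < p->n; j++) {
        if (p->on_rh[j]) {              /* rule 1: stub to call an RH procedure */
            stubs++;
            continue;
        }
        for (int i = 0; i < p->n; i++) {
            if (p->on_rh[i] && p->calls[i][j]) {   /* rule 2: called by RH */
                stubs++;
                break;
            }
        }
    }
    return stubs;
}

/* Example: a -> b -> c with only b on RH needs b' and c'. */
int example_stub_count(void) {
    partition p = { .n = 3, .on_rh = { false, true, false } };
    p.calls[0][1] = true;   /* a calls b */
    p.calls[1][2] = true;   /* b calls c */
    return count_stubs(&p);
}
```

In the example, a needs no stub: it runs on the CPU and is never called from the RH.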

15 Compiling. [Flow diagram labels: Program, Partitioning, policy, Procedures for CPU, Procedures for RH, Stubs, automatic, HLL to HDL, Synthesis, Configuration, Linker, Executable]

16 Outline: Motivation; Interfacing RH & CPU; Opportunities; Conclusions

17 Evaluation. How much can be mapped to RH? SpecInt95 & Mediabench. Partition strictly on procedure boundaries. Limit RH to 10^6 bit-operations.

18 Coverage. Program: a( ) { b( ); } b( ) { c( ); } c( ) {} Running time: 40%, 35%, 25% (total 100%). On RH (Method 1, Method 2): N N Y Y Y N. Totals: 40%, 75%.

19 Coverage. Program: a( ) { b( ); } b( ) { c( ); } c( ) {} Running time: 40%, 35%, 25% (total 100%). On RH (Method 1, Method 2): N N Y Y N Y. Totals: 25%, 65%.
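
One way to reproduce the 25% and 65% totals on this slide is sketched below. It rests on assumptions the transcript does not confirm: that a, b, and c take 40%, 35%, and 25% of the running time in that order, that b is the one procedure that cannot be implemented on RH, that Method 1 forbids RH from calling back into the CPU, and that Method 2 is peering. The policy names and the `program` struct are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define NPROCS 3

/* Hypothetical names for the three partitioning policies. */
typedef enum { POLICY_LEAVES, POLICY_CPU_TO_RH, POLICY_PEER } policy;

typedef struct {
    int  time_pct[NPROCS];        /* share of running time, in percent */
    bool fits_rh[NPROCS];         /* assumed: procedure can be implemented on RH */
    bool calls[NPROCS][NPROCS];   /* calls[i][j]: procedure i calls j */
} program;

static bool is_leaf(const program *p, int i) {
    for (int j = 0; j < NPROCS; j++)
        if (p->calls[i][j]) return false;
    return true;
}

/* Percentage of running time spent in procedures placed on RH. */
int coverage_pct(const program *p, policy pol) {
    assert(p);
    bool on_rh[NPROCS];
    for (int i = 0; i < NPROCS; i++)
        on_rh[i] = p->fits_rh[i] && (pol != POLICY_LEAVES || is_leaf(p, i));

    if (pol == POLICY_CPU_TO_RH) {
        /* RH may not call the CPU: evict any RH procedure with a CPU callee,
           and repeat until the placement stops changing. */
        bool changed = true;
        while (changed) {
            changed = false;
            for (int i = 0; i < NPROCS; i++)
                for (int j = 0; j < NPROCS; j++)
                    if (on_rh[i] && p->calls[i][j] && !on_rh[j]) {
                        on_rh[i] = false;
                        changed = true;
                    }
        }
    }

    int total = 0;
    for (int i = 0; i < NPROCS; i++)
        if (on_rh[i]) total += p->time_pct[i];
    return total;
}

/* a -> b -> c, with b assumed unable to run on RH. */
int example_coverage(policy pol) {
    program p = { .time_pct = { 40, 35, 25 }, .fits_rh = { true, false, true } };
    p.calls[0][1] = true;   /* a calls b */
    p.calls[1][2] = true;   /* b calls c */
    return coverage_pct(&p, pol);
}
```

Under these assumptions, a can only move to RH when RH is allowed to call the CPU, which accounts for the 40-point gap between the two totals.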

20 Policies: leaves on RH; RH X CPU; arbitrary

21 RH Stack Models. Locals in registers: f(x) { return x+1; } Locals statically allocated: f() { int local; g(&local); } Dynamic stack: f(x) { f(x+1); }

22 Potential RH Coverage: SpecINT95. [Chart: % running time; policies: leaves, CPU->RH, CPU->RH->CPU; stack models: no stack, static stack frames, dynamic stack]

23 Potential RH Coverage: Mediabench. [Chart: policies: leaves, CPU->RH, CPU->RH->CPU; stack models: no stack, static stack frames, dynamic stack]

24 Conclusions. Stubs make the RH/CPU interface transparent. Stubs are automatically generated. RH and CPU act as peers. RH/CPU interface: (remote) procedure call. RPC is used for control transfer (not data). Peering gives the partitioner freedom.

25 The End

27 Dispatcher Stubs. Program: a( ) { r = b(b_args); } b(b_args) { if (x) c( ); return r; } c( ) { } Stub: b'(b_args) { send_rh(b_args); invoke_rh(b); while (1) { com = get_rh_command( ); if (! com) break; (*com)( ); } r = receive_rh( ); return r; } [Annotations: "Independent of b"; "c's stub"]

28 C's Stub. Program: a( ) { r = b(b_args); } b(b_args) { if (x) c( ); return r; } c( ) { } Stub: c'( ) { receive_rh(c_args); r = c(c_args); send_rh(r); invoke_rh(return_to_rh); }
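
The dispatcher stub and c's stub can be exercised together in a small C model. This is a sketch under stated assumptions, not the authors' code: the RH is simulated by a two-entry state machine, the one-word `channel` stands in for whatever data path the real system uses, and the bodies of b and c are invented so the round trip is observable. The names `b_stub`, `c_stub`, `rh_entry`, and `cpu_command` are hypothetical; the stub structure follows the slides.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*cpu_command)(void);

/* Hypothetical entry points into the simulated RH. */
enum rh_entry { RH_ENTER_B, RH_RETURN_TO_RH };

static int channel;                  /* one-word data channel, CPU <-> RH */
static cpu_command pending = NULL;   /* stub the RH wants the CPU to run */
static int waiting_for_c = 0;        /* 1 while b on RH awaits c's result */

static void send_rh(int v)   { channel = v; }
static int  receive_rh(void) { return channel; }

/* Software procedure c, which stays on the CPU (body is illustrative). */
static int c(int v) { return v * 10; }

static void c_stub(void);

/* Simulated RH running b(x): if (x) r = c(x) + 1; else r = -1; return r; */
static void invoke_rh(enum rh_entry e) {
    if (e == RH_ENTER_B) {
        if (channel != 0) {          /* b needs c: ask the CPU to run c's stub */
            pending = c_stub;        /* channel still holds x, used as c's argument */
            waiting_for_c = 1;
        } else {
            channel = -1;            /* b finishes without calling c */
        }
    } else {                         /* RH_RETURN_TO_RH: c's result is in channel */
        assert(waiting_for_c);
        waiting_for_c = 0;
        channel = channel + 1;       /* b finishes; its result goes in channel */
    }
}

/* Returns the stub the RH requested, or NULL once the RH procedure is done. */
static cpu_command get_rh_command(void) {
    cpu_command cmd = pending;
    pending = NULL;
    return cmd;
}

/* c's stub: fetch arguments from RH, run c on the CPU, send the result back. */
static void c_stub(void) {
    int c_args = receive_rh();
    int r = c(c_args);
    send_rh(r);
    invoke_rh(RH_RETURN_TO_RH);
}

/* b's stub: the dispatch loop does not depend on which procedures b calls. */
int b_stub(int b_args) {
    send_rh(b_args);
    invoke_rh(RH_ENTER_B);
    for (;;) {
        cpu_command com = get_rh_command();
        if (!com) break;
        com();
    }
    return receive_rh();
}
```

The loop in `b_stub` only forwards whatever command the RH posts, so placing further CPU procedures behind b would add stubs like `c_stub` without changing `b_stub`.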

29 Attempt 1: Manual partitioning. Interface: ad hoc. Ex: OneChip, NAPA, PAM. Advantage: huge speed-ups. Problem: very hard work. [Diagram labels: Program, RH]

30 Attempt 2: Select small computations. Interface: RH = functional unit. Ex: PRISC, Chimaera. Advantage: easy to automate. Problem: low speed-up. [Diagram labels: Program; operators +, >>, *]

31 Attempt 3: Select loop body. while (b) { b[ j+5]; } Deeply pipelined implementation. No memory access. Interface: I/O or functional unit or coprocessor. Ex: PipeRench. Advantage: very high speed-up. Problems: cannot be automated; loop-carried dependences; few opportunities.

32 Attempt 4: Select whole loop. while (b) { if (error) printf("err"); a[x] = y; } Pipelined implementation. Autonomous memory access. Interface: coprocessor. Ex: GARP. Advantage: many opportunities. Problems: complicated algorithm; requires exceptional loop exits.

