Presentation is loading. Please wait.

Presentation is loading. Please wait.

Approaching Ideal NoC Latency with Pre-Configured Routes George Michelogiannakis, Dionisios Pnevmatikatos and Manolis Katevenis Institute of Computer Science.

Similar presentations


Presentation on theme: "Approaching Ideal NoC Latency with Pre-Configured Routes George Michelogiannakis, Dionisios Pnevmatikatos and Manolis Katevenis Institute of Computer Science."— Presentation transcript:

1 Approaching Ideal NoC Latency with Pre-Configured Routes George Michelogiannakis, Dionisios Pnevmatikatos and Manolis Katevenis Institute of Computer Science (ICS) Foundation for Research & Technology - Hellas (FORTH) P.O.Box 1385, Heraklion, Crete, GR-711-10 GREECE Email: {mihelog,pnevmati,kateveni}@ics.forth.gr

2 May 2007ICS-FORTH, Greece2 Introduction Problem: Latency NoCs impose. Motivation: Latency introduced to every communication pair. Past work: Achieves 1 cycle/hop at 500 MHz. We extend speculation to routing decisions. Goal: Approach buffered wire latency. Fraction of cycle/hop.

3 May 2007ICS-FORTH, Greece3 Our Approach 400 ps good scenario; 1 cycle otherwise. 130 nm library

4 May 2007ICS-FORTH, Greece4 Preferred Paths Each output has one preferred input. This pref. I/O pair is connected by a single pre-enabled tri-state driver. Pre-enabling is crucial: 200 ps pre-enabled mux; 500 ps otherwise. Later check if flits correctly forwarded. Thus, preferred paths are formed. Reconfigurable at run-time. Custom routes (shapes) allowed.

5 May 2007ICS-FORTH, Greece5 Switch Architecture - Output 400 ps 1 cycle Input FIFOs. Selectable when non-empty, or flit to be enqueued. Pref. path pre- enabled tri-states. Routing logic tri-state. Config. & arbitration logic. Stores pref. path config. & arbitrates.

6 May 2007ICS-FORTH, Greece6 Switch Architecture - Input Dead flits: Incorrectly eagerly forwarded. Terminated at end of preferred path. Switch resembles a buffered crossbar. Decides if flit needs to be enqueued.

7 May 2007ICS-FORTH, Greece7 Routing Algorithm Deterministic routing employed. Non-preferred paths follow XY routing. We slightly modify XY routing to handle preferred paths: Flit correctly eagerly forwarded if it approaches the destination in any axis. Flit considered dead otherwise.

8 May 2007ICS-FORTH, Greece8 Routing Characteristics Flits in preferred paths may not follow XY routing. Duplicate copies of a flit may be delivered. XY routing. Pref. paths. D S

9 May 2007ICS-FORTH, Greece9 NoC Topology – Bar Floorplan Application: Tiled CPU and RAM blocks. Each switch is 6x6 and serves 4 PEs.

10 May 2007ICS-FORTH, Greece10 Bar Floorplan Would be 8x12: Vertical links drive address inputs. 2 PE data ports served by 1 switch port.

11 May 2007ICS-FORTH, Greece11 Cross Floorplan

12 May 2007ICS-FORTH, Greece12 Layout Results 130 nm implementation library. Typical case. Pref. path latency: 300-420 ps. 450-500 ps (incl. 1mm). 1 cycle/node otherwise. Past work: 1 cycle/node at 500 MHz. Clock frequency667 MHz Flit width39 FIFO lines2 Number of FIFOs30 Bar area overhead13% Cross area overhead18% Number of cells15 K Number of gates45 K Total dynamic power80 mW

13 May 2007ICS-FORTH, Greece13 Advanced Issues Deadlock & livelock freedom. Constraints to prevent circle. Keep NoC functional in any case. Out-of-order delivery of flits in the same packet. Apply reconfiguration at a “safe” time. Adaptive routing.

14 May 2007ICS-FORTH, Greece14 Future Work Synchronization issues – A flit may arrive at any time. Impose preferred path constraints. Implement switch asynchronously. Evaluation in complete system. Implement fault-tolerance.

15 May 2007ICS-FORTH, Greece15 Conclusion We approach ideal latency. By pre-enabled tri-state paths. Our NoC is a generalized “mad- postman” [C. R. Jesshope et al, 1989]. Our NoC is easily generalized – topology may need to be changed. Past NoC research can be applied for further optimizations.

16 May 2007ICS-FORTH, Greece16 Related Work Most assumed 2D mesh-like topologies. Reconfigurable topologies studied. Various performance enhancement techniques studied. They achieve 1 clock cycle/node at approx. 500 Mhz. Various routing algorithms studied. Recent field: fault-tolerant techniques.

17 May 2007ICS-FORTH, Greece17 Backpressure FIFO almost full Previous hop’s feeding output alerted. If fed by a preferred path in the previous hop Flits are also enqueued. Preferred path may or may not be broken.

18 May 2007ICS-FORTH, Greece18 Mad Postman XY routing. Eagerly forward incoming flits to the same axis. Later examine if correctly forwarded. Terminate “dead” flits in later hops. MSG Penalty Source Destination

19 May 2007ICS-FORTH, Greece19 NoC Topology Application: Tiled CPU and RAM blocks. Each switch is 6x6 and serves 4 PEs. Would be 8x12 without compromises: Vertical links also drive RAM address in. Two PE data ports are served by a single switch port: Data from one PE:


Download ppt "Approaching Ideal NoC Latency with Pre-Configured Routes George Michelogiannakis, Dionisios Pnevmatikatos and Manolis Katevenis Institute of Computer Science."

Similar presentations


Ads by Google