The Alpha 21364 Network Architecture Mukherjee, Bannon, Lang, Spink, and Webb Summary Slides by Fred Bower ECE 259, Spring 2004.



It’s A Small Paper… …Packed With Detail
- Overview At High Level
  - Chip Features and Built-In MP Constructs
  - Network, Routing, and Router Basics
- More Depth
  - Routing Policies
  - Deadlock Avoidance Via Routing Policies
  - What’s In A Router?
- Discussion

21364 Overview
- Core With MP Additions
- MC = Memory Controller
- Router
- Directory-Based CC
- Runs at Core Clock
- Buffering Capability
- 1.75 MB L2 Cache
Figure 1: The Alpha 21364 Floorplan

The Network
- Topology: 2-d Torus
  - Limited Support for Imperfect Tori
  - Allows Fault Remapping
- Virtual Cut-Through
- 316* Packet Router Buffer
- Simple, Adaptive Routing
  - Constrained Within Minimum Rectangle
Figure 2: A 12-Processor Network Configuration
*316 Total Packets of Buffer Capacity Divided Unevenly Amongst Classes and Ports

Packet Classes
- Seven Packet Classes
  - Request (3 Flits)
  - Forward (3 Flits)
  - Block Response (18 or 19 Flits)
  - Non-Block Response (2 or 3 Flits)
  - Write I/O (19 Flits)
  - Read I/O (3 Flits)
  - Special (1 or 3 Flits)
- Flits Are 32 Bits Data Plus 7 Bits ECC
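The flit counts on this slide translate directly into on-wire packet sizes. The sketch below is illustrative arithmetic only; the names `PACKET_CLASSES`, `packet_bits`, and `non_ecc_bits` are made up for this example and do not come from the paper.

```python
# Flit format from the slide: 32 data bits plus 7 ECC bits per flit.
FLIT_DATA_BITS = 32
FLIT_ECC_BITS = 7
FLIT_BITS = FLIT_DATA_BITS + FLIT_ECC_BITS  # 39 bits per flit

# Allowed packet lengths (in flits) for each of the seven classes.
PACKET_CLASSES = {
    "Request": (3,),
    "Forward": (3,),
    "Block Response": (18, 19),
    "Non-Block Response": (2, 3),
    "Write I/O": (19,),
    "Read I/O": (3,),
    "Special": (1, 3),
}


def packet_bits(flits):
    """Total bits on the wire for a packet of the given flit count."""
    return flits * FLIT_BITS


def non_ecc_bits(flits):
    """Bits left over once the per-flit ECC is excluded."""
    return flits * FLIT_DATA_BITS


if __name__ == "__main__":
    for name, lengths in PACKET_CLASSES.items():
        sizes = ", ".join(f"{n} flits = {packet_bits(n)} bits" for n in lengths)
        print(f"{name}: {sizes}")
```

For instance, a 19-flit Write I/O packet occupies 19 × 39 = 741 bits on the wire, of which 608 are non-ECC bits.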

Routing Policies: Minimum Rectangle
- Four Rectangles With Current and Destination At Diagonals
- Recall 2-d Torus – All Edges Wrap
- Constrain Adaptive Routing To Minimum
- Center of Figure 3
Figure 3: Routing Rectangles
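One way to picture the minimum-rectangle constraint is to compute, per dimension, the shortest signed offset on the wrapped axis; the moves that shrink those offsets are the ones that stay inside the minimum rectangle. This is a minimal sketch, not the 21364's actual routing logic. The function names, the tie-break toward the positive direction when both ways around are equally short, and the 4 x 3 torus shape used in the example are all assumptions.

```python
def min_offset(src, dst, size):
    """Shortest signed hop count from src to dst on a ring of `size` nodes.

    Ties (exactly half way around) are resolved toward the positive
    direction; that tie-break is an assumption of this sketch.
    """
    delta = (dst - src) % size
    return delta if 2 * delta <= size else delta - size


def productive_moves(cur, dst, dims):
    """Return (axis, step) moves that stay inside the minimum rectangle.

    cur, dst: (x, y) coordinates; dims: (width, height) of the torus.
    An adaptive router could pick any of the returned moves.
    """
    moves = []
    for axis, (c, d, n) in enumerate(zip(cur, dst, dims)):
        off = min_offset(c, d, n)
        if off != 0:
            moves.append((axis, 1 if off > 0 else -1))
    return moves


def min_hops(cur, dst, dims):
    """Length of any path that never leaves the minimum rectangle."""
    return sum(abs(min_offset(c, d, n)) for c, d, n in zip(cur, dst, dims))


if __name__ == "__main__":
    dims = (4, 3)  # hypothetical 12-node torus
    print(productive_moves((0, 0), (3, 1), dims))
    print(min_hops((0, 0), (3, 1), dims))
```

In the hypothetical 4 x 3 torus, going from (0, 0) to (3, 1) uses the wraparound link in x, so the productive moves are one step in the negative x direction and one step in the positive y direction.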

Routing Basics
- Decode Of Packet Determines Routing
- Use Of Lookup Tables For Destination Resolution, Virtual Channel Assignments, and Broadcast Invalidation Clusters
- First Flit Has Routing And Packet Information
- ECC Checked/Corrected At Each Router
- Routers May Rewrite ECC
- Routers Send Feedback About Buffer Availability

Avoiding Coherence Deadlocks
- Virtual Channels Break Cyclic Dependence
  - Separate Channel For Each Packet Class
  - Guarantees Independence of Class Traffic
  - Additional Ordering Constraint Amongst Classes of Packets
- Additional Measures To Preserve I/O Consistency
  - Force Same-Class Requests To Arrive In-Order Using Deadlock-Free Virtual Channels
  - Allow I/O Writes To Pass I/O Reads Using Separate Virtual Channels For Reads and Writes
  - Prevent I/O Reads From Passing I/O Writes To Preserve Ordering Rules

Avoiding Routing Deadlocks
- 19 Virtual Channels
  - 3 Networks For Each of 6 Packet Classes Plus 1 Special
  - Adaptive, VC0, and VC1
- Adaptive Is First Choice
- VC0 and VC1 Provide Guaranteed Drain If Adaptive Blocked
- Careful Selection of Rules To Break Deadlocks Within Dimensions and Across Dimensions
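The count of 19 follows from the slide: three channels (Adaptive, VC0, VC1) for each of the six non-special classes, plus one for Special. The sketch below simply enumerates them and records the stated preference order. The names are illustrative, and the rule that picks between VC0 and VC1 is not given on the slide, so it is left out.

```python
# The six non-special packet classes, as listed on the Packet Classes slide.
NON_SPECIAL_CLASSES = [
    "Request",
    "Forward",
    "Block Response",
    "Non-Block Response",
    "Write I/O",
    "Read I/O",
]

# Per-class channels, in preference order: adaptive first, then the
# two deadlock-free channels that guarantee the packet can drain.
SUBNETWORKS = ["Adaptive", "VC0", "VC1"]


def virtual_channels():
    """Enumerate every (packet class, subnetwork) virtual channel."""
    channels = [(cls, net) for cls in NON_SPECIAL_CLASSES for net in SUBNETWORKS]
    channels.append(("Special", None))  # Special gets a single channel
    return channels


def candidate_channels(packet_class):
    """Channels a packet of this class may use, most preferred first."""
    if packet_class == "Special":
        return [("Special", None)]
    return [(packet_class, net) for net in SUBNETWORKS]


if __name__ == "__main__":
    print(len(virtual_channels()))        # 6 * 3 + 1
    print(candidate_channels("Request"))
```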

Internals Of The Router
- Pipelined Design
- 9 Pipeline Types Based Upon Input X Output Mapping
  - Input/Output Either Local, Interprocessor, or I/O
- 13 Cycle In To Out Latency
  - Key To Performance (Smaller Better)
- Recall Chip-Side At 1.2 GHz
- Network-Side Speed At 800 MHz
- Clock Sent With Outgoing Packets
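To put the 13-cycle figure in wall-clock terms, the arithmetic below converts it to nanoseconds and compares the two clock rates. It assumes the 13 cycles are counted at the 1.2 GHz chip-side clock, which the slide implies but does not state outright; the constant and function names are made up for this example.

```python
CORE_CLOCK_HZ = 1.2e9       # chip-side clock from the slide
LINK_CLOCK_HZ = 800e6       # network-side clock from the slide
ROUTER_LATENCY_CYCLES = 13  # in-to-out latency, assumed to be core cycles


def cycles_to_ns(cycles, clock_hz):
    """Convert a cycle count at the given clock rate to nanoseconds."""
    return cycles / clock_hz * 1e9


if __name__ == "__main__":
    latency_ns = cycles_to_ns(ROUTER_LATENCY_CYCLES, CORE_CLOCK_HZ)
    print(f"router latency: {latency_ns:.1f} ns")
    print(f"core/link clock ratio: {CORE_CLOCK_HZ / LINK_CLOCK_HZ:.2f}")
```

Under that assumption the router adds roughly 10.8 ns per hop, and the core clock runs 1.5 times faster than the links.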

Brief Conclusions
- Even With Moderate Constraints, Jelly-Bean MP Is Challenging
- Correctness, Deadlock-Avoidance, Buffering, Arbitration, and Performance Require Careful Consideration In Design
- This Paper Illustrates Where Network Latency Comes From
- Even A Fast Network Seems Slow Compared To Local Access

Discussion
- Was 2-d Torus the Right Shape For This Design?
- What Are the Limitations Imposed?
- How Is the 1.2 GHz Internal/800 MHz External Clock Discrepancy OK?
- Is MP Capability Better Than More Aggressive Core Optimizations For the Transistor Cost?
- What About SMT, CMP?