D2.2 Report on Movement in Switch State of the Art and Commercial Motivators (RECN) Annex A (June 7th 2006) Ian Johnson Xyratex.


A New Scalable and Cost-Effective Congestion Management Strategy for Lossless Multistage Interconnection Networks
J. Duato (1), I. Johnson (2), J. Flich (1), F. Naven (2), P. J. García (3), T. Nachiondo (1)
(1) Technical University of Valencia, Valencia, Spain
(2) Xyratex, Havant, UK
(3) University of Castilla-La Mancha, Albacete, Spain
The Eleventh International Symposium on High-Performance Computer Architecture, San Francisco, 2005

Title: A New Scalable and Cost-Effective Congestion Management Strategy for Lossless Multistage Interconnection Networks
Conference: The 11th International Symposium on High-Performance Computer Architecture

Outline
Introduction
Congestion and HOL blocking
Why now?
Why previous proposals are inadequate
Proposal: RECN
Performance evaluation
Conclusions

Interconnection Networks
MPPs
– Earth Simulator (640 vector CPUs)
– ASCI Q (12,288 EV68 CPUs, Quadrics network)
– BlueGene/L (... nodes, 2 processors each, 360 TFlops)
PC clusters
Storage Area Networks (SANs)
– Google (6,000 CPUs and disks)
– Thunder (1,024 nodes, each with 4 Itaniums and 8 GB)
Many data centers all around the world
[Photos: ASCI Q, Earth Simulator, Thunder]

Network Throughput beyond Saturation

Congestion and HOL Blocking
Network contention
– Several packets request the same output port
– One makes progress, the others wait
Network congestion
– Persistent network contention
– It is quickly propagated by flow control (lossless networks)
– Network performance degrades dramatically
Head-of-line (HOL) blocking
– When the first packet in a queue is blocked, every other packet in the same queue is also blocked, even if it requests available resources
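
The HOL blocking definition above can be illustrated with a minimal sketch. It is not taken from the presentation: the function name `drain` and the idea of modelling packets as bare output-port numbers are assumptions made for illustration, and arbitration between queues is ignored.

```python
from collections import deque

def drain(queues, blocked_ports):
    """Forward every packet that can leave and return what stays behind.

    Each queue is a FIFO of requested output-port numbers. Only the head
    of a queue may be forwarded, so a blocked head stalls everything
    queued behind it, even packets whose output port is free.
    """
    stuck = []
    for packets in queues:
        q = deque(packets)
        while q and q[0] not in blocked_ports:
            q.popleft()              # the head packet makes progress
        stuck.append(list(q))        # the blocked head plus its victims
    return stuck

# One shared queue: the packet for free port 2 waits behind the packet
# for blocked port 1 (HOL blocking).
single = drain([[1, 2]], blocked_ports={1})

# One queue per output port: the packet for port 2 leaves immediately.
separate = drain([[1], [2]], blocked_ports={1})
```

With the shared queue both packets remain stuck, whereas with separate queues only the packet addressed to the blocked port remains.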

Congestion and HOL Blocking
Network contention

Congestion and HOL Blocking
Persistent network contention

Congestion and HOL Blocking
Persistent network contention
Flow control

Congestion and HOL Blocking
Persistent network contention
Congestion propagates

Congestion and HOL Blocking
Congestion introduces HOL blocking, and this may degrade network performance dramatically
[Figure labels: 33%, HOL, 33%, 33%, 100%, 33%, 100%]

Traditional Solution
Overdimensioning the network
– Network bandwidth is much higher than the bandwidth requested by end nodes
– Low link utilization
[Figure: latency versus injected traffic, with a working zone and a congestion zone]

Why Congestion Management Now?
New problems arising:
– System cost: recent interconnects (Myrinet, InfiniBand, ASI) are expensive compared to processors
– Power consumption: as network size increases, so do power consumption and heat dissipation
– Frequency/voltage scaling techniques: not very efficient, and they do not solve the system cost problem
Possible solutions:
– Reducing the number of network components: possible by using a suitable topology, but link utilization increases
Systems will work closer to the network saturation zone; thus, a congestion management technique will be mandatory

Why Are Current Techniques Not Suitable?
Proactive congestion management (congestion prevention)
– Path setup before data transmission
– Used in ATM, computer networks (QoS)
– High overhead, high latencies (not suitable for HPC)
The real problem is not the congestion, but its negative effects (HOL blocking)
Reactive congestion management (congestion recovery)
– Injection limitation techniques using closed-loop feedback
– Do not scale with network size and link bandwidth
  – Notification delay (proportional to distance)
  – Link capacity (proportional to clock frequency)
  – May produce network instabilities

Why Are Current Techniques Not Suitable?
HOL blocking elimination/reduction
DAMQs and virtual channels
– Not efficient for multihop networks
VOQ (Virtual Output Queueing)
– VOQ at switch level scales but does not eliminate HOL blocking
– VOQ at network level: a separate queue at every input port for every destination
– The number of required resources scales at least quadratically with network size
Credit flow controlled ATM
– References congestion to network output only
– Consumes a large number of buffers: a separate queue at every output port for every destination
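
The quadratic growth of network-level VOQ can be made concrete with a short calculation. This is an illustrative sketch: the function names are hypothetical, the bound counts only the host-facing input ports (inner switch ports would add more queues), and the RECN figure assumes one cold queue plus the 8 SAQs per port used in the simulated configuration.

```python
def voqnet_queues_lower_bound(num_hosts: int) -> int:
    """Lower bound on network-level VOQ queues.

    One queue per destination at each of the num_hosts host-facing
    input ports; ports inside the network are not counted.
    """
    return num_hosts * num_hosts

def recn_queues_per_port(max_saqs: int = 8) -> int:
    """Queues at a RECN port: one cold queue plus the set-aside queues."""
    return 1 + max_saqs

# Network sizes taken from the evaluated configurations.
for hosts in (64, 256, 512):
    print(hosts, voqnet_queues_lower_bound(hosts), recn_queues_per_port())
```

For 64, 256 and 512 hosts the lower bound is 4,096, 65,536 and 262,144 queues, while the per-port RECN count stays at 9 regardless of network size.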

Proposal
Initial idea: exploit spatial and temporal locality in packet destinations
– Manage the set of queues as a cache
– No equivalent to main memory (where to replace?)
– Not enough locality (reduction in queue silicon area by a factor of 4)
Observation: non-congested flows do not introduce significant HOL blocking
RECN: Regional Explicit Congestion Notification
– Non-congested flows are mapped to the same queue
– Effective reduction in the number of queues, and no replacement needed
– Congested flows are detected and mapped to set-aside queues (SAQs)
RECN is a scalable congestion management technique because:
– It reacts locally (and thus is not affected by propagation delays)
– It needs a very small number of queues (SAQs) for a wide range of network sizes
RECN enables:
– Effective reduction of network cost by working closer to the saturation point
– More efficient use of voltage/frequency scaling techniques

RECN
Based on the PCI Express Advanced Switching Interconnect (ASI) specification
– Routing (turnpools)
– Relevant switch architectural features
– Congestion detection
– Congestion notification and queue allocation
– Queue deallocation
– Packet processing
– Flow control

Turnpools
[Figure: AS packet header with turn pool, turn pointer and direction bit (D); turn example between ports A and B]
31 bits = 2^31 destinations
The turn pool makes it possible to know whether a packet will pass through a given port in the network
– Mask bits required
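
The role of the mask bits can be sketched as a masked comparison. This is an assumption-laden illustration, not the ASI encoding: the helper names, and the choice to place the relevant turns in the low-order bits, are hypothetical; only the 31-bit width comes from the slide.

```python
TURNPOOL_BITS = 31  # turn pool width quoted on the slide

def make_mask(num_bits: int, shift: int = 0) -> int:
    """Mask selecting num_bits turn bits starting at bit `shift`.

    The bit layout is a hypothetical one chosen for this sketch.
    """
    if num_bits + shift > TURNPOOL_BITS:
        raise ValueError("mask exceeds the turn pool width")
    return ((1 << num_bits) - 1) << shift

def passes_through(packet_turnpool: int, path_turnpool: int, mask: int) -> bool:
    """True if the packet agrees with the stored path on every masked bit."""
    return (packet_turnpool & mask) == (path_turnpool & mask)

# A stored path described by three turn bits (0b101).
mask = make_mask(3)
hit = passes_through(0b1101101, 0b101, mask)    # low bits are 101
miss = passes_through(0b1101001, 0b101, mask)   # low bits are 001
```

Only the bits selected by the mask take part in the comparison, so packets bound for different final destinations can still match the same path segment.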

Switch Model
[Figure: input ports with RAM in, crossbar (XBAR, S=1.5) with arbiter, output ports with RAM out, link controllers (LC); dynamic queue management (VCs) at both RAMs]

RAM in and RAM out
[Figure: each RAM holds a cold queue and SAQ 0 to SAQ 3; a CAM holds one line per SAQ]
CAM line labels: valid bit (v), turn pool, mask bits, blocked (b), nextSAQ, Xon/Xoff flow control (Xoff), leave bit (lv, only at ingress)
Other labels: congested point; root tokens (one per input port), only at egress, to avoid successive internal notifications

How it Works
A congestion point forms

How it Works
Cold queue fills over a threshold

How it Works
Congestion detection:
– The cold queue at the output port side fills over the detection threshold
– Congested point: the output port
– SAQs are not allocated at the output port
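
The detection rule above can be sketched as a threshold check on the cold queue. This is a minimal sketch under stated assumptions: the class name `ColdQueue`, the byte-based accounting and the threshold value used in the example are hypothetical, since the slides do not give them.

```python
from dataclasses import dataclass

@dataclass
class ColdQueue:
    capacity_bytes: int
    detection_threshold: int   # hypothetical value, not given on the slides
    occupancy: int = 0
    congested: bool = False

    def enqueue(self, packet_bytes: int) -> bool:
        """Store a packet; return True only when this packet triggers detection."""
        if self.occupancy + packet_bytes > self.capacity_bytes:
            raise OverflowError("flow control should have stopped the sender")
        self.occupancy += packet_bytes
        if not self.congested and self.occupancy > self.detection_threshold:
            self.congested = True      # the output port becomes a congested point
            return True
        return False

# Six 64-byte packets into a queue whose threshold is 256 bytes.
queue = ColdQueue(capacity_bytes=1024, detection_threshold=256)
events = [queue.enqueue(64) for _ in range(6)]
```

Detection fires once, on the packet that pushes the occupancy past the threshold, which is the point at which the internal notifications described next would be issued.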

How it Works

How it Works
Internal notification to each input port sending packets to the output port

How it Works
Congestion information notification:
– Congestion is notified to the input ports sending packets to congested ports
– The notification includes turnpool information and mask bits
– The root token is set for the input port

How it Works

How it Works
Input ports allocate a new SAQ for packets addressed to the congested output port

How it Works
Actions after receiving notification:
– A new SAQ is allocated
– The notified turnpool and mask bits are used to map the new SAQ

How it Works
Reception of packets after mapping SAQs (Example 1):
[Figure: cold queue and SAQ 0 to SAQ 3 at both RAMs; incoming packet (*); result label: SAQ 0]

How it Works
Reception of packets after mapping SAQs (Example 2):
[Figure: cold queue and SAQ 0 to SAQ 3 at both RAMs; incoming packet (*); result label: COLD]
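
SAQ allocation and the two reception examples can be sketched as a small CAM lookup. This is a sketch, not the switch design: the class and method names are hypothetical, four SAQs are used only because the figures show SAQ 0 to SAQ 3, and preferring the entry with the most mask bits when several match is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class CamLine:
    turnpool: int   # stored already masked
    mask: int

class PortQueues:
    def __init__(self, max_saqs: int = 4):
        self.cam: List[Optional[CamLine]] = [None] * max_saqs

    def allocate_saq(self, turnpool: int, mask: int) -> Optional[int]:
        """Map a free SAQ to the notified turnpool and mask bits.

        Returns the SAQ id, or None when every SAQ is already in use.
        """
        for saq_id, line in enumerate(self.cam):
            if line is None:
                self.cam[saq_id] = CamLine(turnpool & mask, mask)
                return saq_id
        return None

    def classify(self, packet_turnpool: int) -> str:
        """Queue for an arriving packet: a matching SAQ, else the cold queue."""
        best_id, best_bits = None, -1
        for saq_id, line in enumerate(self.cam):
            if line is None:
                continue
            if (packet_turnpool & line.mask) == line.turnpool:
                bits = bin(line.mask).count("1")
                if bits > best_bits:      # assumed tie-break: most specific entry
                    best_id, best_bits = saq_id, bits
        return "COLD" if best_id is None else f"SAQ {best_id}"

port = PortQueues()
first = port.allocate_saq(turnpool=0b101, mask=0b111)
example_1 = port.classify(0b1101)   # low bits 101 match the mapped SAQ
example_2 = port.classify(0b1001)   # low bits 001 match nothing
```

Packets heading for the congested point are set aside, while all remaining traffic keeps sharing the cold queue.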

How it Works

How it Works
Notification sent when the SAQ fills over a threshold

How it Works
Congestion propagation: a RECN packet including turn pool, mask bits, and SAQ id is sent
[Figure: RECN packet; cold queue and SAQ 0 to SAQ 3 at both RAMs; labels: S, no leaf]
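
The propagation step can be sketched as a SAQ that emits one notification when it crosses its threshold. The field list (turn pool, mask bits, SAQ id) follows the slide; the class names, the byte-based threshold and the once-only behaviour are assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class RecnNotification:
    turnpool: int
    mask: int
    saq_id: int

@dataclass
class Saq:
    saq_id: int
    turnpool: int
    mask: int
    threshold: int             # hypothetical propagation threshold in bytes
    occupancy: int = 0
    notified: bool = False

    def enqueue(self, packet_bytes: int) -> Optional[RecnNotification]:
        """Store a packet; return a notification the first time the threshold is passed."""
        self.occupancy += packet_bytes
        if not self.notified and self.occupancy > self.threshold:
            self.notified = True
            return RecnNotification(self.turnpool, self.mask, self.saq_id)
        return None

saq = Saq(saq_id=2, turnpool=0b101, mask=0b111, threshold=100)
results = [saq.enqueue(64) for _ in range(3)]
```

Here the second packet raises the occupancy to 128 bytes and produces the only notification; the receiver would use its contents to map a SAQ of its own.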

How it Works

How it Works
A new SAQ is allocated for the congested port at each output port

How it Works
Internal notification when the SAQ fills over a threshold

How it Works
The input port allocates a new SAQ

How it Works
At the end, the congestion tree builds and is mapped entirely onto SAQs

Performance Evaluation
Evaluation based on simulation results
Two evaluation studies:
Network performance when using:
– RECN
– VOQ at network level (VOQnet)
– VOQ at switch level (VOQsw)
– 4 queues at ingress and egress ports (4Q)
– 1 queue at ingress and egress ports (1Q)
RECN scalability

Simulation Model
Network configurations evaluated:
– 64 hosts connected by a 64x64 BMIN
– 256 hosts connected by a 256x256 BMIN
– 512 hosts connected by a 512x512 BMIN
Simulation assumptions:
– BMINs based on the shuffle-exchange connection scheme
– Deterministic routing
– 128 KB memories at ingress/egress ports
– Multiplexed crossbar (BW = 12 Gbps)
– Serial full-duplex pipelined links (BW = 8 Gbps)
– 64-byte and 512-byte packets
– Credit-based and Xon/Xoff (for SAQs) flow control
– Maximum of 8 SAQs at ingress/egress ports (RECN)
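
A few quantities follow directly from the simulation parameters above. The sketch below derives them; it assumes 1 KB = 1024 bytes, 1 Gbps = 10^9 bits per second, and no encoding or header overhead on the links, none of which is stated on the slide.

```python
CROSSBAR_GBPS = 12
LINK_GBPS = 8
PORT_MEMORY_BYTES = 128 * 1024      # assumes 1 KB = 1024 bytes
PACKET_SIZES = (64, 512)            # bytes

# Internal speedup of the crossbar relative to a link.
speedup = CROSSBAR_GBPS / LINK_GBPS

def packets_per_memory(packet_bytes: int) -> int:
    """How many packets of this size fit in one port memory."""
    return PORT_MEMORY_BYTES // packet_bytes

def link_time_ns(packet_bytes: int) -> float:
    """Serialization time on a link: bits divided by Gbit/s gives nanoseconds."""
    return packet_bytes * 8 / LINK_GBPS

for size in PACKET_SIZES:
    print(size, packets_per_memory(size), link_time_ns(size))
```

The ratio 12 / 8 = 1.5 agrees with the S=1.5 label on the switch model figure. Under these assumptions a port memory holds 2,048 short packets or 256 long ones, and a packet occupies a link for 64 ns or 512 ns respectively.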

Traffic Load
Synthetic traffic:

Case           # Srcs  Dst.      Injection rate (%)  Traffic start time  Traffic end time
Corner Case 1  75%     Random    50%                 0                   Sim. end
               25%     Hot-spot  100%                800 μs              970 μs
Corner Case 2  75%     Random    100%                0                   Sim. end
               25%     Hot-spot  100%                800 μs              970 μs

Traces:
– From I/O activity at the cello system disk interface
– Different compression factors applied

Performance Comparison
[Figure: network throughput, Corner Case 1, 64x64 BMIN]

Performance Comparison
[Figure: network throughput, Corner Case 2, 64x64 BMIN]

Performance Comparison
[Figures: network throughput, traces, 64x64 BMIN; compression factor set to 20 and compression factor set to 40]

Scalability Analysis
SAQ utilization: Corner Case 1, 64x64 BMIN
[Figure panels: maximum # SAQs used (ingress); maximum # SAQs used (egress); total # of active SAQs]

Scalability Analysis
SAQ utilization: Corner Case 2, 64x64 BMIN
[Figure panels: maximum # SAQs used (ingress); maximum # SAQs used (egress); total # of active SAQs]

Scalability Analysis
SAQ utilization: traces, compression factor 20, 64x64 BMIN
[Figure panels: maximum # SAQs used (ingress); maximum # SAQs used (egress); total # of active SAQs]

Scalability Analysis
SAQ utilization: traces, compression factor 40, 64x64 BMIN
[Figure panels: maximum # SAQs used (ingress); maximum # SAQs used (egress); total # of active SAQs]

Scalability Analysis
Network throughput: Corner Case 2, 256x256 BMIN
[Figure panels: maximum # SAQs used (egress); maximum # SAQs used (ingress)]

Scalability Analysis
Network throughput: Corner Case 2, 512x512 BMIN
[Figure panels: maximum # SAQs used (ingress); maximum # SAQs used (egress)]

Final Remarks
We also designed a protocol to deallocate SAQs when they are no longer needed
Many optimizations
– CAM IDs to reduce control message size
– CAM search done in parallel with packet reception
– Merging of congestion trees
Silicon area reduced with respect to switch-level VOQs

Conclusions
We have proposed a scalable congestion management strategy for lossless networks
We have shown that it requires only a small number of buffers for a wide range of network sizes
We have modeled an existing ASI switch design, verifying that:
– It maintains network performance close to the ideal (but non-scalable) solution
– Silicon area requirements are now smaller than for the original design