HP Labs 1 IEEE Infocom 2003 End-to-End Congestion Control for InfiniBand Jose Renato Santos, Yoshio Turner, John Janakiraman HP Labs.

Slides:



Advertisements
Similar presentations
Congestion Control and Fairness Models Nick Feamster CS 4251 Computer Networking II Spring 2008.
Advertisements

Congestion Control and Fairness Models Nick Feamster CS 4251 Computer Networking II Spring 2008.
Michele Pagano – A Survey on TCP Performance Evaluation and Modeling 1 Department of Information Engineering University of Pisa Network Telecomunication.
A Novel 3D Layer-Multiplexed On-Chip Network
Packet Video TCP Video Streaming to Bandwidth-Limited Access Links Puneet Mehra and Avideh Zakhor Video and Image Processing Lab University of California,
24-1 Chapter 24. Congestion Control and Quality of Service (part 1) 23.1 Data Traffic 23.2 Congestion 23.3 Congestion Control 23.4 Two Examples.
Congestion Control Created by M Bateman, A Ruddle & C Allison As part of the TCP View project.
CS640: Introduction to Computer Networks Mozafar Bag-Mohammadi Lecture 3 TCP Congestion Control.
Congestion Control: TCP & DC-TCP Swarun Kumar With Slides From: Prof. Katabi, Alizadeh et al.
Mathematical models of the Internet Frank Kelly Hood Fellowship Public Lecture University of Auckland 3 April 2012.
Router-assisted congestion control Lecture 8 CS 653, Fall 2010.
“ Analysis of the Increase and Decrease Algorithms for Congestion Avoidance in Computer Networks ”
Advanced Computer Networking Congestion Control for High Bandwidth-Delay Product Environments (XCP Algorithm) 1.
Congestion Control An Overview -Jyothi Guntaka. Congestion  What is congestion ?  The aggregate demand for network resources exceeds the available capacity.
XCP: Congestion Control for High Bandwidth-Delay Product Network Dina Katabi, Mark Handley and Charlie Rohrs Presented by Ao-Jan Su.
1 Asynchronous Bit-stream Compression (ABC) IEEE 2006 ABC Asynchronous Bit-stream Compression Arkadiy Morgenshtein, Avinoam Kolodny, Ran Ginosar Technion.
Texas A&M University Improving TCP Performance in High Bandwidth High RTT Links Using Layered Congestion Control Sumitha.
Advanced Computer Networks: RED 1 Random Early Detection Gateways for Congestion Avoidance * Sally Floyd and Van Jacobson, IEEE Transactions on Networking,
Explicit Congestion Notification ECN Tilo Hamann Technical University Hamburg-Harburg, Germany.
Bandwidth sharing: objectives and algorithms Jim Roberts France Télécom - CNET Laurent Massoulié Microsoft Research.
RAP: An End-to-End Rate-Based Congestion Control Mechanism for Realtime Streams in the Internet Reza Rejai, Mark Handley, Deborah Estrin U of Southern.
Week 9 TCP9-1 Week 9 TCP 3 outline r 3.5 Connection-oriented transport: TCP m segment structure m reliable data transfer m flow control m connection management.
RCS: A Rate Control Scheme for Real-Time Traffic in Networks with High B X Delay and High error rates J. Tang et al, Infocom 2001 Another streaming control.
1 TCP Transport Control Protocol Reliable In-order delivery Flow control Responds to congestion “Nice” Protocol.
1 Random Early Detection Gateways for Congestion Avoidance Sally Floyd and Van Jacobson, IEEE Transactions on Networking, Vol.1, No. 4, (Aug 1993), pp
Data Communication and Networks
ACN: Congestion Control1 Congestion Control and Resource Allocation.
Random Early Detection Gateways for Congestion Avoidance
Congestion Control for High Bandwidth-Delay Product Environments Dina Katabi Mark Handley Charlie Rohrs.
1 A State Feedback Control Approach to Stabilizing Queues for ECN- Enabled TCP Connections Yuan Gao and Jennifer Hou IEEE INFOCOM 2003, San Francisco,
Advanced Computer Networks : RED 1 Random Early Detection Gateways for Congestion Avoidance Sally Floyd and Van Jacobson, IEEE Transactions on Networking,
The Effects of Systemic Packets Loss on Aggregate TCP Flows Thomas J. Hacker May 8, 2002 Internet 2 Member Meeting.
3: Transport Layer3b-1 Principles of Congestion Control Congestion: r informally: “too many sources sending too much data too fast for network to handle”
Transport Layer3-1 Chapter 3 outline r 3.1 Transport-layer services r 3.2 Multiplexing and demultiplexing r 3.3 Connectionless transport: UDP r 3.4 Principles.
Network Technologies essentials Week 8: TCP congestion control Compilation made by Tim Moors, UNSW Australia Original slides by David Wetherall, University.
CS540/TE630 Computer Network Architecture Spring 2009 Tu/Th 10:30am-Noon Sue Moon.
Principles of Congestion Control Congestion: informally: “too many sources sending too much data too fast for network to handle” different from flow control!
CSE 461 University of Washington1 Topic How TCP implements AIMD, part 1 – “Slow start” is a component of the AI portion of AIMD Slow-start.
Link Scheduling & Queuing COS 461: Computer Networks
ACN: RED paper1 Random Early Detection Gateways for Congestion Avoidance Sally Floyd and Van Jacobson, IEEE Transactions on Networking, Vol.1, No. 4, (Aug.
Korea Advanced Institute of Science and Technology Network Systems Lab. 1 Dual-resource TCP/AQM for processing-constrained networks INFOCOM 2006, Barcelona,
27th, Nov 2001 GLOBECOM /16 Analysis of Dynamic Behaviors of Many TCP Connections Sharing Tail-Drop / RED Routers Go Hasegawa Osaka University, Japan.
High-speed TCP  FAST TCP: motivation, architecture, algorithms, performance (by Cheng Jin, David X. Wei and Steven H. Low)  Modifying TCP's Congestion.
Rate Adaptation Protocol for Real-time Streams Goal: develop an end-to-end TCP-friendly RAP for semi-reliable rate-based applications (e.g. playback of.
Congestion Control for High Bandwidth-Delay Product Networks D. Katabi (MIT), M. Handley (UCL), C. Rohrs (MIT) – SIGCOMM’02 Presented by Cheng.
TCP with Variance Control for Multihop IEEE Wireless Networks Jiwei Chen, Mario Gerla, Yeng-zhong Lee.
Efficient Cache Structures of IP Routers to Provide Policy-Based Services Graduate School of Engineering Osaka City University
Lecture 9 – More TCP & Congestion Control
CS640: Introduction to Computer Networks Aditya Akella Lecture 15 TCP – III Reliability and Implementation Issues.
Computer Networking Lecture 18 – More TCP & Congestion Control.
David Wetherall Professor of Computer Science & Engineering Introduction to Computer Networks Fairness of Bandwidth Allocation (§6.3.1)
The Macroscopic behavior of the TCP Congestion Avoidance Algorithm.
Receiver Driven Bandwidth Sharing for TCP Authors: Puneet Mehra, Avideh Zakor and Christophe De Vlesschouwer University of California Berkeley. Presented.
TCP transfers over high latency/bandwidth networks & Grid DT Measurements session PFLDnet February 3- 4, 2003 CERN, Geneva, Switzerland Sylvain Ravot
© Janice Regan, CMPT 128, CMPT 371 Data Communications and Networking Congestion Control 0.
ECEN 619, Internet Protocols and Modeling Prof. Xi Zhang Random Early Detection Gateways for Congestion Avoidance Sally Floyd and Van Jacobson, IEEE Transactions.
CS640: Introduction to Computer Networks Aditya Akella Lecture 15 TCP Congestion Control.
Analysis of the increase and Decrease Algorithms for Congestion in Computer Networks Portions of the slide/figures were adapted from :
1 Three ways to (ab)use Multipath Congestion Control Costin Raiciu University Politehnica of Bucharest.
1 Transport Bandwidth Allocation 3/29/2012. Admin. r Exam 1 m Max: 65 m Avg: 52 r Any questions on programming assignment 2 2.
HP Labs 1 CSC talk 06/13/2003 End-to-End Congestion Control for System Area Networks Renato Santos, Yoshio Turner, John Janakiraman Linux Systems, Networks.
Corelite Architecture: Achieving Rated Weight Fairness
Approaches towards congestion control
CS 268: Lecture 6 Scott Shenker and Ion Stoica
Congestion Control and Resource Allocation
Lecture 19 – TCP Performance
RAP: Rate Adaptation Protocol
TCP Congestion Control
Review of Internet Protocols Transport Layer
Presentation transcript:

HP Labs 1 IEEE Infocom 2003 End-to-End Congestion Control for InfiniBand Jose Renato Santos, Yoshio Turner, John Janakiraman HP Labs

2 IEEE Infocom 2003 Outline Motivation: Unique System Area Network (SAN) characteristics require new congestion control approach Proposed approach appropriate for SANs: –ECN packet marking –Source response: rate control with window limit Focus: Design of source response functions –New convergence conditions, design methodology –New functions: LIPD and FIMD Performance Evaluation: LIPD, FIMD, AIMD Conclusions

HP Labs 3 IEEE Infocom 2003 System Area Networks Characteristics InfiniBand example: Industry standard server interconnect – 2Gb/s(1x) to 24Gb/s(12x) links Characteristics: congestion control implications –No packet dropping  Need network support for detecting congestion – Low network latency (tens of ns cut-through switching)  Simple logic for hardware implementation – Low buffer capacity at switches (e.g., 2KB input buffer stores only four 512-byte packets)  TCP window mechanism inadequate (narrow operational range) – Input-buffered switches  Alternative congestion detection mechanisms

HP Labs 4 IEEE Infocom 2003 Problem: Congestion Spreading Link BW: 8 Gb/s (4x link) Packet Size: 2 KB Buffer Size: 4 packets/port (8 KB) Buffer Org.: Input port Flow not using congested link suffers performance degradation (victim flow) Simulation (R=L=10) Remote flows use only 30% of inter- switch link bandwidth Contention for root link  full buffer  prevents victim flow from using remaining inter- switch link bandwidth non-congested link local flows (L) SWITCH A SWITCH B congested link remote flows (R) victim flow Inter-switch Link

HP Labs 5 IEEE Infocom 2003 Our Congestion Control Approach Explicit Congestion Notification (ECN) for input-buffered switches Source adjusts packet injection according to network feedback encoded in ECN returned via ACK –Combines window and rate control –New source response functions more efficient than AIMD

HP Labs 6 IEEE Infocom 2003 Source Response: Rate Control with Window Limit Window Control + Self-clocked, bounds switch buffer utilization – Narrow operational range (window=2 uses all bandwidth in idle network) – Window=1 is too large if # flows > # buffer slots Rate Control + Low buffer util. possible (< 1 packet per flow) + Wide operational range – Not self-clocked Proposed Approach: Rate control with a fixed window limit (w=1)

HP Labs 7 IEEE Infocom 2003 Designing Rate Control Functions Definition: When source receives ACK Decrease rate on marked ACK: r new = f dec (r) Increase rate on unmarked ACK: r new = f inc (r) f dec (r) and f inc (r) should provide : –Congestion avoidance –High network bandwidth utilization –Fair allocation of bandwidth among flows Develop new sufficient conditions for f dec (r) & f inc (r) –Exploit differences in packet marking rates across flows to relax conditions Requires novel time-based formulation

HP Labs 8 IEEE Infocom 2003 Avoiding Congested State Steady state: flow rate oscillates around optimal value in alternating phases of rate decrease and increase Want to avoid time in congested state Magnitude of response to marked ACK is larger or equal to magnitude of response to unmarked ACK Congestion Avoidance Condition: f inc (f dec (r))  r

HP Labs 9 IEEE Infocom 2003 Fairness Convergence [Chiu/Jain 1989][Bansal/Balakrishnan 2001] developed convergence conditions assuming all flows receive feedback and adjust rates synchronously –Each increase/decrease cycle must improve fairness Observation: In congested state, the mean number of marked packets for a flow is proportional to the flow rate. –bias promotes flow rate fairness  Enables weaker fairness convergence condition  Benefit: fairness with faster rate recovery

HP Labs 10 IEEE Infocom 2003 Fairness Convergence Relax condition: rate decrease-increase cycles need only maintain fairness in the synchronous case –If two flows receive marks, lower rate flow should recover earlier than or in the same time as higher rate flow t F inc (t) time rate.. r f dec (r) T rec (r) decrease step Rate as function of time (in absence of marks) Fairness Convergence Condition: T rec (r1)  T rec (r2) for r1 < r2

HP Labs 11 IEEE Infocom 2003 Maximizing Bandwidth Utilization Goal: as flows depart, remaining flows should recover rate quickly to maximize utilization Fastest recovery: use limiting cases of conditions –Congestion Avoidance Condition f inc (f dec (r))  r Use f inc (f dec (r)) = r for minimum rate R min –Fairness Convergence Condition T rec (r1)  T rec (r2) Use T rec (r1) = T rec (r2) for higher rates Maximum Bandwidth Utilization Condition: T rec (r) = 1/ R min for all r

HP Labs 12 IEEE Infocom 2003 Design Methodology: Choose f dec (r), find f inc (r) satisfying conditions Use f dec (r) to derive F inc (t): F inc (t) = f dec (F inc (t + T rec )), T rec =1/R min Use F inc (t) to find f inc (r): f inc (r ) = F inc (t r +1/r) where F inc (t r ) = r t F inc (t) time rate r f dec (r) T rec =1/R min decrease step t F inc (t) time rate r f inc (r) 1/r increase step trtr

HP Labs 13 IEEE Infocom 2003 New Response Functions Fast Increase Multiplicative Decrease (FIMD): –Decrease function: f dec fimd (r) = r/m, constant m>1 (same as AIMD) –Increase function: f inc fimd (r) = r · m Rmin/r –Much faster rate recovery than AIMD Linear Inter-Packet Delay (LIPD): –Decrease function: increases inter-packet delay (ipd) by 1 packet transmission time r = R max /(ipd+1) –Increase function: f inc lipd (r) = r/(1- R min /R max ) –Large decreases at high rate, small decreases at low rate Simple Implementation: e.g., table lookup

HP Labs 14 IEEE Infocom 2003 Increase Behavior Over Time : FIMD, AIMD, LIPD K10K20K30K40K50K60K 65K normalized rate time (units of packet transmission time) FIMD (m=2) AIMD (m=2) LIPD F inc (t)

HP Labs 15 IEEE Infocom 2003 Performance: Source Response Functions root link (RL) inter-switch link (IL) local flows (LF) remote flows (RF) RLIL normalized rate buffer size (packets) LIPD normalized rate buffer size (packets) FIMD normalized rate buffer size (packets) AIMD

HP Labs 16 IEEE Infocom 2003 Conclusions Proposed/Evaluated congestion control approach appropriate for unique characteristics of SANs such as InfiniBand –ECN applicable to modern input-queued switches –Source response: rate control w/ window limit Derived new relaxed conditions for source response function convergence  functions with fast bandwidth reclamation –Based on observation of packet marking bias –Two examples: FIMD/LIPD outperform AIMD Future extensions: –Hybrid window-rate control (allow w > 1) –Evaluation with richer traffic patterns/topologies

HP Labs 17 IEEE Infocom 2003