Open Issues in Buffer Sizing. Amogh Dhamdhere, Constantine Dovrolis. College of Computing, Georgia Tech.

Outline Motivation and previous work The Stanford model for buffer sizing Important issues in buffer sizing Simulation results for the Stanford model Buffer sizing for bounded loss rate (Infocom’05)

Motivation Router buffers are crucial elements of packet networks Absorb rate variations of incoming traffic Prevent packet losses during traffic bursts Increasing the router buffer size: Can increase link utilization (especially with TCP traffic) Can decrease packet loss rate Can also increase queuing delays

Common operational practices A major router vendor recommends 500 ms of buffering Implication: buffer size increases proportionally to link capacity Why 500 ms? Bandwidth-Delay Product (BDP) rule: buffer size B = link capacity C × typical RTT T (B = C·T) What does “typical RTT” mean? Measurement studies showed that RTTs vary from 1 ms to 10 sec! How do different types of flows (TCP elephants vs. mice) affect the buffer requirement? Poor performance is often due to buffer size: under-buffered switches cause high loss rate and poor utilization; over-buffered DSL modems cause excessive queuing delay for interactive apps
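As a rough illustration of how quickly the BDP rule grows with capacity, the sketch below just evaluates B = C × T for a few assumed link speeds and RTTs (the specific values are illustrative, not from the slides):

def bdp_buffer_bytes(capacity_bps, rtt_s):
    # Bandwidth-delay product in bytes: B = C * T, with C in bits/s and T in seconds.
    return capacity_bps * rtt_s / 8.0

for capacity_bps, rtt_s in [(100e6, 0.25), (10e9, 0.25), (10e9, 0.5)]:
    b = bdp_buffer_bytes(capacity_bps, rtt_s)
    print(f"C = {capacity_bps/1e9:5.2f} Gb/s, T = {rtt_s*1e3:3.0f} ms -> "
          f"B = {b/1e6:7.1f} MB (~{b/1500:9.0f} 1500-byte packets)")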

Previous work Approaches based on queuing theory (e.g. M|M|1|B) Assume a certain input traffic model, service model and buffer size Loss probability for the M|M|1|B system: p = (1−ρ)ρ^B / (1−ρ^(B+1)), where ρ is the offered load But TCP is not open-loop; TCP flows react to congestion There is no universally accepted Internet traffic model Morris’ Flow Proportional Queuing (Infocom ’00) Proposed a buffer size proportional to the number of active TCP flows (B = 6*N) Did not specify which flows to count in N Objective: limit loss rate High loss rate causes unfairness and poor application performance
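A small sketch of the open-loop queueing-theory approach mentioned above: the standard M/M/1/B blocking probability as a function of buffer size. Here B counts the total number of packets the system can hold, and the offered load value is an assumption chosen for illustration.

def mm1b_loss(rho, b_pkts):
    # Blocking probability of an M/M/1/B queue with offered load rho = lambda/mu.
    if rho == 1.0:
        return 1.0 / (b_pkts + 1)
    return (1.0 - rho) * rho ** b_pkts / (1.0 - rho ** (b_pkts + 1))

for b in (10, 50, 100, 250):
    print(f"rho = 0.95, B = {b:3d} packets -> loss probability = {mm1b_loss(0.95, b):.2e}")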

TCP window dynamics for long flows TCP-aware buffer sizing must take into account TCP dynamics Saw-tooth behavior: the window increases until a packet loss; a single loss results in cwnd reduction by a factor of two Square-root TCP model: TCP throughput can be approximated by R ≈ (M/T)·sqrt(3/(2p)), where M is the segment size, T the RTT and p the loss rate Valid when the loss rate p is small (less than 2-5%) The average window size, sqrt(3/(2p)) segments, is independent of RTT
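To get a numerical feel for the square-root model, the sketch below evaluates R ≈ (M/T)·sqrt(3/(2p)) for a few loss rates; the segment size and RTT are illustrative assumptions.

from math import sqrt

def tcp_throughput_bps(mss_bytes, rtt_s, loss_rate):
    # Square-root model: throughput ~ (MSS / RTT) * sqrt(3 / (2 * p)).
    return (mss_bytes * 8.0 / rtt_s) * sqrt(1.5 / loss_rate)

for p in (0.001, 0.01, 0.05):
    r = tcp_throughput_bps(1500, 0.1, p)
    print(f"p = {p:5.3f}, RTT = 100 ms, MSS = 1500 B -> throughput ~ {r/1e6:6.2f} Mb/s")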

Origin of BDP rule Consider a single flow with RTT T Window follows TCP’s saw-tooth behavior Maximum window size = CT + B At this point packet loss occurs Window size after packet loss = (CT + B)/2 Key step: Even when window size is minimum, link should be fully utilized (CT + B)/2 ≥ CT which means B ≥ CT Known as the bandwidth delay product rule Same result for N homogeneous TCP connections
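The key step of this derivation is easy to check numerically: with buffer B the saw-tooth oscillates between (CT + B)/2 and CT + B, and the link stays busy after a loss only if the minimum window still covers the pipe. The sketch below simply tests that condition (the CT value is borrowed from the simulations later in the talk):

ct_pkts = 250  # bandwidth-delay product in packets (illustrative)
for buf_pkts in (100, 200, 250, 400):
    w_min = (ct_pkts + buf_pkts) / 2.0   # window right after a loss
    full = w_min >= ct_pkts              # BDP rule: true iff B >= C*T
    print(f"B = {buf_pkts:3d} packets -> min window = {w_min:6.1f} packets: "
          f"{'link stays fully utilized' if full else 'link goes idle after the loss'}")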

Outline Motivation and previous work The Stanford model for buffer sizing Important issues in buffer sizing Simulation results for the Stanford model Buffer sizing for bounded loss rate (BSCL)

Stanford Model - Appenzeller et al. Objective: Find the minimum buffer size to achieve full utilization of target link Assumption: Most traffic is from TCP flows If N is large, flows are independent and unsynchronized Aggregate window size distribution tends to normal Queue size distribution also tends to normal Flows in congestion avoidance (linear increase of window between successive packet drops) Buffer for full utilization is given by B = CT/sqrt(N), where N is the number of “long” flows at the link and CT is the bandwidth-delay product
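A quick sketch of what the Stanford rule B = CT/sqrt(N) implies numerically, compared with the plain BDP rule; the CT value matches the simulations discussed later, and the flow counts are illustrative.

from math import sqrt

def stanford_buffer_pkts(bdp_pkts, n_long_flows):
    # Appenzeller et al.: buffer for full utilization shrinks as 1/sqrt(N).
    return bdp_pkts / sqrt(n_long_flows)

bdp = 250  # packets
for n in (1, 50, 200, 400, 10000):
    print(f"N = {n:5d} long flows -> B = {stanford_buffer_pkts(bdp, n):7.1f} packets "
          f"(BDP rule would give {bdp})")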

Stanford Model (cont’) If link has only short flows, buffer size depends only on offered load and average flow size Flow size determines the size of bursts during slow start For a mix of short and long flows, buffer size is determined by number of long flows Small flows do not have a significant impact on buffer sizing Resulting buffer can achieve full utilization of target link Loss rate at target link is not taken into account

Outline Motivation and previous work The Stanford model for buffer sizing Important issues in buffer sizing Simulation results for the Stanford model Buffer sizing for bounded loss rate (BSCL)

What are the objectives? Network layer vs. application layer objectives Network’s perspective: Utilization, loss rate, queuing delay User’s perspective: Per-flow throughput, fairness, etc. Stanford Model: Focus on utilization & queuing delay Can lead to high loss rate (> 10% in some cases) BSCL: Both utilization and loss rate Can lead to large queuing delay Buffer sizing scheme that bounds queuing delay Can lead to high loss rate and low utilization A certain buffer size cannot meet all objectives Which problem should we try to solve?

Saturable/congestible links A link is saturable when the offered load is sufficient to fully utilize it, given a large enough buffer A link may not be saturable at all times Some links may never be saturable Advertised-window limitation, other bottlenecks, size-limited flows Small buffers are sufficient for non-saturable links Only needed to absorb short-term traffic bursts Stanford model applicable: when N is large Backbone links are usually not saturable due to over-provisioning Edge links are more likely to be saturable But N may not be large for such links

Which flows to count? N: Number of “long” flows at the link “Long” flows show TCP’s saw-tooth behavior “Short” flows do not exit slow start Does size matter? Size does not indicate slow start or congestion avoidance behavior If no congestion, even large flows do not exit slow start If highly congested, small flows can enter congestion avoidance Should the following flows be included in N? Flows limited by congestion at other links Flows limited by sender/receiver socket buffer size N varies with time. Which value should we use? Min? Max? Time average?

Which traffic model to use? The traffic model has major implications for buffer sizing Early work considered traffic as an exogenous process Not realistic: the offered load due to TCP flows depends on network conditions Stanford model considers mostly persistent connections No ambiguity about the number of “long” flows (N) N is time-invariant In practice, TCP connections have finite size and duration, and N varies with time Open-loop vs closed-loop flow arrivals

Traffic model (cont’) Open-loop TCP traffic: Flows arrive randomly with average size S and average arrival rate λ Offered load λS, link capacity C Offered load is independent of system state (delay, loss) The system is unstable if λS > C Closed-loop TCP traffic: Each user starts a new transfer only after the completion of the previous transfer Random think time between consecutive transfers Offered load depends on system state The system can never be unstable
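A minimal sketch of the difference between the two arrival models (not an ns-2 script; the rates, transfer times and think times below are assumptions): open-loop flows keep arriving regardless of how the network is doing, while a closed-loop user starts its next transfer only after the previous one finishes plus a think time.

import random

def open_loop_arrivals(rate_per_s, horizon_s):
    # Poisson flow arrivals: the offered load does not react to the network.
    t, arrivals = 0.0, []
    while True:
        t += random.expovariate(rate_per_s)
        if t >= horizon_s:
            return arrivals
        arrivals.append(t)

def closed_loop_arrivals(transfer_time_s, mean_think_s, horizon_s):
    # One user: the next transfer starts only after the previous one completes plus
    # a random think time.  (In reality transfer_time_s itself depends on network
    # state, which is exactly the feedback that makes this model self-limiting.)
    t, arrivals = 0.0, []
    while t < horizon_s:
        arrivals.append(t)
        t += transfer_time_s + random.expovariate(1.0 / mean_think_s)
    return arrivals

print(len(open_loop_arrivals(100.0, 60.0)), "open-loop arrivals in 60 s")
print(len(closed_loop_arrivals(0.5, 5.0, 60.0)), "transfers by one closed-loop user in 60 s")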

Outline Motivation and previous work The Stanford model for buffer sizing Important issues in buffer sizing Simulation results for the Stanford model Buffer sizing for bounded loss rate (BSCL)

Why worry about loss rate? The Stanford model gives a very small buffer if N is large E.g., CT=200 packets, N=400 flows: B=10 packets What is the loss rate with such a small buffer size? Per-flow throughput and transfer latency? Compare with BDP-based buffer sizing Distinguish between large and small flows Small flows that do not see losses are limited only by RTT: a flow of k segments finishes in roughly log2(k) round trips of slow start Large flows depend on both losses and RTT, through the square-root throughput model
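To make the latency comparison concrete, a rough sketch of transfer latency for a short flow in slow start versus a long flow governed by the square-root model; this is our own simplification (it ignores the handshake and assumes the short flow sees no losses), and all parameter values are illustrative.

from math import ceil, log2, sqrt

def short_flow_latency_s(k_segments, rtt_s):
    # A k-segment flow that stays in slow start needs roughly ceil(log2(k+1)) RTTs.
    return rtt_s * ceil(log2(k_segments + 1))

def long_flow_latency_s(size_bytes, mss_bytes, rtt_s, loss_rate):
    # Latency = size / square-root-model throughput; depends on both RTT and p.
    throughput_bps = (mss_bytes * 8.0 / rtt_s) * sqrt(1.5 / loss_rate)
    return size_bytes * 8.0 / throughput_bps

print(f"10-segment flow, RTT 100 ms -> ~{short_flow_latency_s(10, 0.1):.2f} s")
print(f"1 MB flow, RTT 100 ms, p = 1% -> ~{long_flow_latency_s(1e6, 1500, 0.1, 0.01):.2f} s")
print(f"1 MB flow, RTT 150 ms (bigger queue), p = 0.2% -> ~{long_flow_latency_s(1e6, 1500, 0.15, 0.002):.2f} s")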

Simulation setup Use ns-2 simulations to study the effect of buffer size on loss rate for different traffic models Heterogeneous RTTs (20ms to 530ms) TCP NewReno with SACK option BDP = 250 packets (1500 B) Model-1: persistent flows + mice 200 “infinite” connections – active for whole simulation duration mice flows - 5% of capacity, size between 3 and 25 packets, exponential inter-arrivals

Simulation setup (cont’) Flow size distribution for finite size flows: Sum of 3 exponential distributions: Small files (avg. 15 packets), medium files (avg. 50 packets) and large files (avg. 200 packets) 70% of total bytes come from the largest 30% of flows Model-2: Closed-loop traffic 675 source agents Think time exponentially distributed with average 5 s Time average of 200 flows in congestion avoidance Model-3: Open-loop traffic Exponentially distributed flow inter-arrival times Offered load is 95% of link capacity Time average of 200 flows in congestion avoidance
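For concreteness, a sketch of a flow-size generator matching the description above. We read “sum of 3 exponential distributions” as a mixture of small/medium/large file classes; the mixture weights are not given on the slide, so the ones below are assumptions picked only to illustrate the heavy-tailed byte share (the slides' target is roughly 70% of bytes from the largest 30% of flows).

import random

MEAN_PKTS = [15, 50, 200]        # small, medium, large file classes (from the slide)
WEIGHTS   = [0.6, 0.3, 0.1]      # assumed mixture weights (not given on the slide)

def flow_size_packets():
    mean = random.choices(MEAN_PKTS, weights=WEIGHTS, k=1)[0]
    return max(1, round(random.expovariate(1.0 / mean)))

sizes = sorted(flow_size_packets() for _ in range(100_000))
top30 = sum(sizes[int(0.7 * len(sizes)):])
print(f"mean flow size ~ {sum(sizes) / len(sizes):.1f} packets; "
      f"largest 30% of flows carry ~{100.0 * top30 / sum(sizes):.0f}% of the bytes")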

Simulation results – Loss rate CT=250 packets, N=200 for all traffic types The Stanford model gives a buffer of 18 packets High loss rate with the Stanford buffer: greater than 10% for open-loop traffic, 7-8% for persistent and closed-loop traffic Increasing the buffer to a BDP, or a small multiple of the BDP, can significantly decrease the loss rate (Figure: loss rate vs. buffer size, with the Stanford buffer marked)

Per-flow throughput Transfer latency = flow-size / flow-throughput Flow throughput depends on both loss rate and queuing delay Loss rate decreases with buffer size (good) Queuing delay increases with buffer size (bad) Major tradeoff: Should we have low loss rate or low queuing delay? Answer depends on various factors Which flows are considered: Long or short? Which traffic model is considered?

Persistent connections and mice Application layer throughput for B=18 (Stanford buffer) and larger buffer B=500 Two flow categories: Large (>100KB) and small (<100KB) Majority of large flows get better throughput with large buffer Large difference in loss rates Smaller variability of per-flow throughput with larger buffer Majority of short flows get better throughput with small buffer Lower RTT and smaller difference in loss rates

Closed-loop traffic Per-flow throughput for large flows is slightly better with larger buffer Majority of small flows see better throughput with smaller buffer Similar to persistent case Not a significant difference in per-flow loss rate Reason: Loss rate decreases slowly with buffer size

Open-loop traffic Both large and small flows get much better throughput with large buffer Significantly smaller per-flow loss rate with larger buffer Reason: Loss rate decreases very quickly with buffer size

Outline Motivation and previous work The Stanford model for buffer sizing Important issues in buffer sizing Simulation results for the Stanford model Buffer sizing for bounded loss rate (BSCL)

Our buffer sizing objectives Full utilization: The average utilization of the target link should be at least a given utilization target when the offered load is sufficiently high Bounded loss rate: The loss rate p should not exceed a given target, typically 1-2%, for a saturated link Minimum queuing delay and buffer requirement, given the previous two objectives: Large queuing delay causes higher transfer latencies and jitter A large buffer size increases router cost and power consumption So, we aim to determine the minimum buffer size that meets the given utilization and loss rate constraints

Why limit the loss rate? End-user perceived performance is very poor when the loss rate is more than 5-10% Particularly true for short and interactive flows A high loss rate is also detrimental for large TCP flows High variability in per-flow throughput Some “unlucky” flows suffer repeated losses and timeouts We aim to bound the packet loss rate to a target of 1-2%

Traffic classes Locally Bottlenecked Persistent (LBP) TCP flows Large TCP flows limited by losses at target link Loss rate p is equal to loss rate at target link Remotely Bottlenecked Persistent (RBP) TCP flows Large TCP flows limited by losses at other links Loss rate is greater than loss rate at target link Window Limited Persistent TCP flows Large TCP flows limited by advertised window, instead of congestion window Short TCP flows and non-TCP traffic

Scope of our model Key assumption: LBP flows account for most of the traffic at the target link (80-90 %) Reason: we ignore buffer requirement of non-LBP traffic Scope of our model: Congested links that mostly carry large TCP flows, bottlenecked at target link

Minimum buffer requirement for full utilization: homogeneous flows Consider a single LBP flow with RTT T Window follows TCP’s saw-tooth behavior Maximum window size = CT + B At this point packet loss occurs Window size after packet loss = (CT + B)/2 Key step: Even when window size is minimum, link should be fully utilized (CT + B)/2 ≥ CT which means B ≥ CT Known as the bandwidth delay product rule Same result for N homogeneous TCP connections

Minimum buffer requirement for full utilization: heterogeneous flows N_b heterogeneous LBP flows with RTTs {T_i} Initially, assume global loss synchronization: all flows decrease their windows simultaneously in response to a single congestion event We derive that B ≥ C·T_e, i.e., a bandwidth-delay product with an “effective RTT” T_e equal to the harmonic mean of the RTTs, T_e = N_b / Σ_i (1/T_i) Practical implication: a few connections with very large RTTs cannot significantly increase the buffer requirement, as long as most flows have small RTTs
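The harmonic-mean claim is easy to check numerically. In the sketch below (the RTT values are our own illustrative choice), a handful of very-large-RTT flows barely moves T_e, while it would inflate an arithmetic-mean RTT considerably.

def effective_rtt_s(rtts_s):
    # Harmonic mean: T_e = N / sum(1 / T_i).
    return len(rtts_s) / sum(1.0 / t for t in rtts_s)

rtts = [0.02] * 95 + [0.5] * 5          # 95 flows at 20 ms, 5 flows at 500 ms
print(f"harmonic-mean T_e   = {1e3 * effective_rtt_s(rtts):5.1f} ms")
print(f"arithmetic mean RTT = {1e3 * sum(rtts) / len(rtts):5.1f} ms")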

Minimum buffer requirement for full utilization (cont’) More realistic model: partial loss synchronization Loss burst length L(N_b): number of packets lost by the N_b flows during a single congestion event Assumption: the loss burst length increases almost linearly with N_b, i.e., L(N_b) = α·N_b α: synchronization factor (around … in our simulations) The resulting minimum buffer size depends on the fraction of flows that see losses in a congestion event and on M, the average segment size Partial loss synchronization reduces the buffer requirement

Validation (ns-2 simulations) Heterogeneous flows (RTTs vary between 20ms and 530ms) Partial synchronization model: accurate Global synchronization (deterministic) model overestimates the buffer requirement by a factor of 3-5

Relation between loss rate and N N_b homogeneous LBP flows at the target link Link capacity: C, flows’ RTT: T If the flows saturate the target link, then each flow’s throughput is C/N_b Equating this with the square-root model gives a loss rate p ≈ 1.5·(N_b·M/(C·T))², i.e., the loss rate is proportional to the square of N_b Hence, to keep the loss rate below a target we must limit the number of flows But this would require admission control (not deployed)
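The sketch below plugs illustrative link parameters into that relation (our own reading of the slide's derivation; the model is only meaningful while p stays small):

C_BPS, RTT_S, MSS_B = 100e6, 0.1, 1500   # illustrative link parameters

def implied_loss_rate(n_flows):
    # Per-flow window in packets is C*T / (N_b * MSS); the square-root model then
    # gives p ~ 1.5 / W^2, i.e. p grows with the square of N_b.
    w_pkts = C_BPS * RTT_S / (8.0 * MSS_B * n_flows)
    return 1.5 / w_pkts ** 2

for n in (50, 100, 200):
    print(f"N_b = {n:3d} -> implied loss rate ~ {100.0 * implied_loss_rate(n):5.2f}%")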

Flow Proportional Queuing (FPQ) First proposed by Morris (Infocom’00) Bound the loss rate by increasing the RTT proportionally to the number of flows: each flow’s window should be K_p packets, the per-flow window that keeps the loss rate at the target Solving for the required RTT gives T = K_p·N_b·M / C, where M is the segment size and T_p is the RTT’s propagation delay Setting the queuing delay T_q = B/C, so that T = T_p + T_q, and solving for B: B = K_p·N_b·M − C·T_p The window of each flow thus consists of packets in the target link buffer (B term) and packets “on the wire” (C·T_p term) Practically, K_p = 6 packets for a 2% loss rate, and K_p = 9 packets for a 1% loss rate
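A sketch of the FPQ-style buffer computation. The relation K_p ≈ sqrt(0.76/p_target) is our reading of Morris's loss-window relation (it reproduces the 6-packet and 9-packet figures on the slide) and should be treated as an assumption; the pipe size of 250 packets matches the BDP used in the simulations.

from math import sqrt

def fpq_buffer_pkts(n_flows, loss_target, pipe_pkts):
    # Per-flow window K_p needed for the loss target, times the number of flows,
    # minus the packets that fit "on the wire" (C * T_p).
    k_p = round(sqrt(0.76 / loss_target))   # ~6 packets for p=2%, ~9 packets for p=1%
    return max(0, k_p * n_flows - pipe_pkts)

print(fpq_buffer_pkts(200, 0.01, 250), "packets for N_b = 200 and a 1% loss target")
print(fpq_buffer_pkts(200, 0.02, 250), "packets for N_b = 200 and a 2% loss target")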

Buffer size requirement for both full utilization and bounded loss rate We previously showed separate results for full utilization and bounded loss rate To meet both goals, provide enough buffering to satisfy the more stringent of the two requirements The buffer requirement decreases with N_b for the full-utilization objective and increases with N_b for the loss-rate objective, so the two curves cross at some N_b Taking the maximum of the two requirements gives what we refer to as the BSCL formula
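Structurally, BSCL takes the more demanding of the two requirements, B = max(B_utilization, B_loss). In the sketch below B_utilization is passed in as a plain number (the paper's partial-synchronization expression is not reproduced here), and B_loss is the FPQ-style term from the previous slide; the example value of 120 packets is hypothetical.

from math import sqrt

def bscl_buffer_pkts(b_util_pkts, n_flows, loss_target, pipe_pkts):
    # Take whichever of the two requirements is larger.
    k_p = round(sqrt(0.76 / loss_target))
    b_loss = max(0, k_p * n_flows - pipe_pkts)
    return max(b_util_pkts, b_loss)

# Example: if the full-utilization requirement were 120 packets for N_b = 200,
# the 1% loss-rate requirement (1550 packets) would dominate:
print(bscl_buffer_pkts(120, 200, 0.01, 250), "packets")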

Model validation Heterogeneous flows, with the utilization and loss rate constraints set to their targets (Figure: required buffer size, showing the utilization constraint and the loss rate constraint)

Parameter estimation 1. Number of LBP flows: With LBP flows, all rate reductions occur due to packet losses at the target link RBP flows: some rate reductions are due to losses elsewhere 2. Effective RTT: Jiang et al. (2002): simple algorithms to measure TCP RTT from packet traces 3. Loss burst lengths or loss synchronization factor: Measure loss burst lengths from a packet loss trace, or use the approximation L(N_b) = α·N_b
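For item 3, a simple estimator sketch: group a packet-loss trace into congestion events with a time threshold, average the burst lengths, and divide by N_b to get α. The event-grouping threshold and the toy trace below are assumptions for illustration.

def estimate_alpha(loss_times_s, n_flows, event_gap_s=0.1):
    # Losses closer together than event_gap_s are counted as one congestion event.
    bursts, current = [], 1
    for prev, cur in zip(loss_times_s, loss_times_s[1:]):
        if cur - prev <= event_gap_s:
            current += 1
        else:
            bursts.append(current)
            current = 1
    bursts.append(current)
    return (sum(bursts) / len(bursts)) / n_flows

# Toy trace: three events of 3, 2 and 1 lost packets among 3 flows -> alpha ~ 0.67
print(estimate_alpha([1.00, 1.01, 1.02, 5.00, 5.03, 9.00], n_flows=3))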

Results: Bound loss rate to 1%

Per-flow throughput with BSCL BSCL can achieve network layer objectives of full utilization and bounded loss rate Can lead to large queuing delay due to larger buffer How does this affect application throughput ? BSCL loss rate target set to 1% BSCL buffer size is 1550 packets Compare with the buffer of 500 packets BSCL is able to bound the loss rate to 1% target for all traffic models

Persistent connections and mice BSCL buffer gives better throughput for large flows Also reduces variability of per-flow throughputs Loss rate decrease favors large flows in spite of larger queuing delay All smaller flows get worse throughput with the BSCL buffer Increase in queuing delay harms small flows

Closed-loop traffic Similar to persistent traffic case BSCL buffer improves throughput for large flows Also reduces variability of per-flow throughputs Loss rate decrease favors large flows in spite of larger queuing delay All smaller flows get worse throughput with the BSCL buffer Increase in queuing delay harms small flows

Open-loop traffic No significant difference between B=500 and B=1550 Reason: The loss rate for open-loop traffic decreases quickly with buffer size The loss rate for B=500 is already less than 1% A further increase in buffer reduces the loss rate to ≈ 0 The large buffer does not increase queuing delays significantly

Summary We derived a buffer sizing formula (BSCL) for congested links that mostly carry TCP traffic Objectives: Full utilization Bounded loss rate Minimum queuing delay, given previous two objectives BSCL formula is applicable for links with more than 80-90% of traffic coming from large and locally bottlenecked TCP flows BSCL accounts for the effects of heterogeneous RTTs and partial loss synchronization Validated BSCL through simulations