NET-REPLAY: A NEW NETWORK PRIMITIVE
Ashok Anand, Aditya Akella
University of Wisconsin, Madison

Network is a black box
[Figure: end hosts see the network as a black box; a packet is lost or delayed inside it.]
- There is no standardized way for the network to tell end hosts where and why a glitch occurred
  - This keeps the network simple and efficient
  - However, end hosts have to resort to complicated logic to infer the nature of glitches

Current tools for locating glitches
- Probing tools
  - Tulip [SOSP 2003] sends multiple probes to routers on the path and uses their responses to infer the nature of glitches
- Issues with this approach
  - Troubleshooting is out-of-band
  - Probe packets can be treated differently from regular traffic
  - Transient failures are hard to detect

What if the network could tell end hosts about glitches?
[Figure: the network tells end hosts where and why a glitch occurred.]
- End hosts can take better actions
  - No need to reduce the flow rate if a packet loss was not due to congestion
  - Route around glitches using alternate routes via multi-homing or recent routing protocol enhancements such as path splicing
- Emerging applications such as gaming and video streaming benefit from more robust network performance

Network-assisted troubleshooting
- Design requirements
  - Keep the network as simple as possible
  - While enabling end hosts to determine where and why glitches occurred
- Our design: Net-Replay
  - Routers remember the packets they have forwarded
  - Routers annotate packets (e.g., with their identifier) that they see for the first time
  - When a glitch occurs, the sender replays the packets that experienced the glitch
  - Based on the annotations, the receiver determines the nature of the glitch experienced by the original packet

Using Net-Replay to characterize loss
[Figure: Sender -> router A -> router B -> router C -> Receiver.]
- Routers remember the (green) packet and annotate it before forwarding
- The sender replays the lost packet
- The packet is already present at router A
- The replayed packet is seen for the first time at B
- The receiver infers that the packet was dropped on the A-B link
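
The loss-localization walk-through above can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Router`, `send`, and `locate_loss` names are hypothetical, each router keeps an unbounded set of packet identifiers, and a drop is modeled as happening on the link just after a named router.

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    name: str
    seen: set = field(default_factory=set)

    def forward(self, packet_id, annotations):
        """Annotate the packet only if this router sees it for the first time."""
        if packet_id not in self.seen:
            self.seen.add(packet_id)
            annotations.append(self.name)


def send(path, packet_id, drop_after=None):
    """Send a packet along the path of routers.

    If drop_after names a router, the packet is lost on the link right
    after that router and None is returned. Otherwise the list of
    annotations that reach the receiver is returned.
    """
    annotations = []
    for router in path:
        router.forward(packet_id, annotations)
        if router.name == drop_after:
            return None
    return annotations


def locate_loss(path_names, replay_annotations):
    """Infer the lossy link from the annotations on the replayed packet.

    The first router that annotates the replay is the first one that never
    saw the original, so the loss happened on the link leading into it.
    Returns None when no router annotated the replay.
    """
    if not replay_annotations:
        return None
    first_new = replay_annotations[0]
    idx = path_names.index(first_new)
    upstream = path_names[idx - 1] if idx > 0 else "sender"
    return (upstream, first_new)


path = [Router("A"), Router("B"), Router("C")]
names = [r.name for r in path]

original = send(path, "pkt-1", drop_after="A")  # lost on the A-B link
replay = send(path, "pkt-1")                    # sender replays the packet
loss_link = locate_loss(names, replay)
print(loss_link)
```

In this run router A has already seen the packet, so the replay is first annotated at B and the receiver blames the A-B link, matching the slide.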

Outline
- Supporting Net-Replay functionality in the network
- End hosts using Net-Replay
- Discussion

Basic support at router
[Figure: packet -> compute hash -> hash stored in the Hashstore, pointing to the packet.]
- Compute a hash of the packet
  - Exclude mutable fields (e.g., TTL)
- Find whether a new packet was already seen by the router
  - Look up the hash in the Hashstore
- Remember a new packet as seen
  - Store the hash, pointing to the packet, in the Hashstore
  - Evict the oldest packet if the Hashstore becomes full
- A simple hash table in DRAM is enough for speeds like 2.4 Gbps
  - SRAM is needed for higher speeds (40 Gbps)
  - 16 MB SRAM is currently available
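
The router-side steps above (hash without mutable fields, look up, remember, evict the oldest) might look roughly like the following. This is a sketch under assumptions: packets are modeled as plain dictionaries, the set of mutable fields and the use of SHA-1 are illustrative choices, and `Hashstore.seen_before` is a hypothetical name.

```python
import hashlib
from collections import OrderedDict

# Assumption: fields that change hop by hop and must not affect the hash.
MUTABLE_FIELDS = {"ttl", "checksum"}


def packet_hash(packet: dict) -> str:
    """Hash the packet over its immutable fields only."""
    stable = sorted((k, v) for k, v in packet.items() if k not in MUTABLE_FIELDS)
    return hashlib.sha1(repr(stable).encode()).hexdigest()


class Hashstore:
    """Fixed-capacity table of packet hashes with oldest-first eviction."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.entries = OrderedDict()  # hash -> packet, in insertion order

    def seen_before(self, packet: dict) -> bool:
        """Return True if the packet was seen; otherwise remember it."""
        key = packet_hash(packet)
        if key in self.entries:
            return True
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict the oldest packet
        self.entries[key] = packet
        return False


store = Hashstore(capacity=2)
p = {"src": "10.0.0.1", "dst": "10.0.0.2", "seq": 1, "ttl": 64}

first = store.seen_before(p)                 # new packet: remembered
replayed = store.seen_before(dict(p, ttl=63))  # same packet, TTL changed
```

Because the TTL is excluded from the hash, a replay that arrives with a different TTL still matches the stored entry.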

High-speed Hashstore implementation
- Use Bloom filters
- What about false positives?
  - The location of glitches can be reported probabilistically
- What about packet eviction?
  - Use two Bloom filters: primary and secondary
  - When the primary is half full, start using both
  - When the primary is full, copy the secondary to the primary and clear the secondary
- How much time's worth of packets can be stored in 16 MB of SRAM?
  - Up to 3 s at 40 Gbps with an average packet size of 600 bytes
  - More than 10 RTTs, assuming RTT < 250 ms
  - Sufficient for end applications to react
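
One way to read the primary/secondary eviction scheme above is sketched below. It is an interpretation, not the authors' code: "half full" and "full" are measured by the number of inserted keys against an assumed design `capacity`, "using both" is taken to mean inserting into both filters, and the class names and hashing scheme are hypothetical.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits: int, num_hashes: int):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)
        self.count = 0  # number of keys inserted so far

    def _positions(self, key: bytes):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: bytes):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)
        self.count += 1

    def __contains__(self, key: bytes) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


class RotatingBloomStore:
    """Primary/secondary Bloom filters that age out old packets."""

    def __init__(self, capacity: int, num_bits: int, num_hashes: int):
        self.capacity = capacity
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.primary = BloomFilter(num_bits, num_hashes)
        self.secondary = BloomFilter(num_bits, num_hashes)

    def remember(self, key: bytes):
        if self.primary.count >= self.capacity:
            # Primary is full: the secondary (newer half) replaces it,
            # and a fresh, empty secondary is started.
            self.primary = self.secondary
            self.secondary = BloomFilter(self.num_bits, self.num_hashes)
        self.primary.add(key)
        if self.primary.count > self.capacity // 2:
            # Past the half-way mark: insert into both filters.
            self.secondary.add(key)

    def seen(self, key: bytes) -> bool:
        return key in self.primary


store = RotatingBloomStore(capacity=4, num_bits=1 << 16, num_hashes=3)
keys = [f"pkt-{i}".encode() for i in range(5)]
for k in keys:
    store.remember(k)
```

After the fifth insert the primary has rotated, so only the most recent packets (here `pkt-2` through `pkt-4`) are still reported as seen; the oldest ones have aged out without any per-entry deletion, which a plain Bloom filter does not support.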

Outline
- Supporting Net-Replay functionality in the network
- End hosts using Net-Replay
- Deployment discussion, cheating issues, and new applications enabled by Net-Replay

How can end hosts use Net-Replay?
- Characterizing glitches
  - Packet loss
    - Replay the lost packet
  - Delay
    - Routers remember which packets were delayed
    - Replay the delayed packet
  - Reordering
    - If it happened due to route changes, the sender can learn the first router where the route changed
    - Replay the reordered packet

End host protocol stack
[Figure: the TCP layer reports the nature of glitches to the application or higher layer, which decides how to overcome them: MIRO/path splicing, or no action.]
- The higher layer should decide the policy for handling glitches
- The TCP layer can tell the higher layer the nature of glitches, e.g., loss
  - After a loss, the TCP layer retransmits the packet (as in current TCP protocols)
  - It uses the retransmitted packet to find the nature of the loss
  - The receiver sends the information about the loss back to the sender along with the ACK
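
The split of responsibilities above (TCP reports, the higher layer decides) can be illustrated with a toy example. Everything here is hypothetical: the slides do not specify a policy, so the rule "reroute once the same link has been blamed `reroute_after` times" is only an assumed placeholder, as are the class and method names.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class GlitchReport:
    kind: str                        # "loss", "delay", or "reordering"
    link: Optional[Tuple[str, str]]  # (upstream, downstream), None if unknown


class HigherLayerPolicy:
    """Assumed policy: reroute once a link has been blamed repeatedly."""

    def __init__(self, reroute_after: int = 3):
        self.reroute_after = reroute_after
        self.blame = Counter()

    def on_glitch(self, report: GlitchReport) -> str:
        if report.link is None:
            return "no action"
        self.blame[report.link] += 1
        if self.blame[report.link] >= self.reroute_after:
            return "reroute"  # e.g. via MIRO or path splicing
        return "no action"


class TcpLayer:
    """Stands in for the TCP layer: it only reports, the policy decides."""

    def __init__(self, policy: HigherLayerPolicy):
        self.policy = policy

    def on_ack_with_loss_info(self, link: Optional[Tuple[str, str]]) -> str:
        # Called when an ACK carries the receiver's loss information.
        return self.policy.on_glitch(GlitchReport("loss", link))


tcp = TcpLayer(HigherLayerPolicy(reroute_after=2))
actions = [tcp.on_ack_with_loss_info(("A", "B")) for _ in range(2)]
print(actions)
```

Keeping the policy object outside the TCP layer mirrors the slide's point that applications, not the transport, choose between rerouting and doing nothing.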

Outline
- Supporting Net-Replay functionality in the network
- End hosts using Net-Replay
- Discussion

Deployment discussion
- Partial deployment
  - Net-Replay can be deployed on a few routers and used to find the nature of glitches in path segments
  - Border routers of ISPs; information per domain
- Avoiding device modifications
  - Can be deployed in 2-port hardware switches as bumps in the wire
[Figure: Net-Replay-agnostic devices with Net-Replay-aware bumps in the wire.]

Other applications
- Network tomography uses complicated logic to infer link loss rates
  - With Net-Replay, the location of a loss can be determined precisely
  - This simplifies network tomography
- Packets can be moved from fast memory to disk in batches
  - Can be used for debugging distributed applications
  - Useful for network operators to examine performance at a fine-grained level

Conclusion
- Net-Replay helps applications perform in-band characterization of glitches
- Net-Replay requires simple support from the network infrastructure
- End hosts can get robust network performance using Net-Replay

Questions
- Thank you

Backup

Hashstore implementation
- A simple hash table in DRAM (50 ns latency) is good enough at 2.5 Gbps
  - Lookup and store: 100 ns per packet
  - 40-byte packets arrive every 128 ns at 2.5 Gbps
- However, DRAM latency cannot keep up at 40 Gbps; faster memory such as SRAM is required
  - Current SRAM goes up to 16 MB only, so a space-efficient data structure is needed
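
The timing argument above can be checked with a few lines of arithmetic. The sketch assumes, as the slide does, one lookup plus one store per packet at 50 ns of DRAM latency each, and back-to-back minimum-size (40-byte) packets; the function names are illustrative.

```python
DRAM_LATENCY_NS = 50
ACCESSES_PER_PACKET = 2  # one lookup plus one store


def interarrival_ns(packet_bytes: int, rate_gbps: float) -> float:
    """Time between back-to-back packets; 1 Gbps equals 1 bit per ns."""
    return packet_bytes * 8 / rate_gbps


def dram_keeps_up(packet_bytes: int, rate_gbps: float) -> bool:
    """True if per-packet DRAM work fits within the packet interarrival time."""
    budget = interarrival_ns(packet_bytes, rate_gbps)
    return DRAM_LATENCY_NS * ACCESSES_PER_PACKET <= budget


for rate in (2.5, 40):
    print(rate, "Gbps:", interarrival_ns(40, rate), "ns per packet,",
          "DRAM ok" if dram_keeps_up(40, rate) else "DRAM too slow")
```

At 2.5 Gbps the 100 ns of DRAM work fits inside the 128 ns gap between packets; at 40 Gbps the gap shrinks to 8 ns, which is why the slide turns to SRAM.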

Cheating issues
- An ISP inserts wrong annotations to ensure that it is never held accountable for glitches
  - Chances are that the ISP is caught
- An ISP modifies the ACK packet if it finds that its router is causing glitches
  - Use encrypted ACKs
- There are possibly other issues that need investigation