Optimizing UDP-based Protocol Implementations
Yunhong Gu and Robert L. Grossman
Presenter: Michal Sabala
National Center for Data Mining


Outline
- UDP Performance Characteristics and Optimizations
- Composable UDT: A Framework for UDP-based Protocol Implementations

Part I. UDP Performance Characteristics and Optimization Techniques

Introduction
- UDP-based protocols are needed
  – As a short-term solution to the lack of effective kernel-space transport protocols for high bandwidth-delay product networks
  – As application-specific data transfer libraries, e.g., for multimedia data transfer
- Implementing a new UDP-based protocol from scratch is not an easy task
  – And it may not be necessary!

UDP Performance
- Sending and receiving buffer size
- Packet size
- IO mode
  – Scattering/gathering (writev/readv)
  – Memory copy avoidance (e.g., overlapped IO in Windows Sockets 2)
- To reach the same data transfer rate, UDP needs slightly less CPU time than TCP and causes slightly less end-system delay
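The buffer-size and gathering points above can be sketched in a few lines of POSIX C++. This is a minimal illustration, not code from UDT: the function names `enlargeBuffers`, `sendGathered`, and `loopbackDemo` are hypothetical, the 1 MB buffer request is an arbitrary value, and `sendmsg` is used here as the gather call because it also carries the destination address for an unconnected UDP socket (the slide names `writev`/`readv`, which offer the same gather semantics on a connected socket).

```cpp
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/uio.h>
#include <netinet/in.h>
#include <unistd.h>
#include <cassert>
#include <cstring>
#include <string>

// Request larger socket buffers. Returns the receive buffer size the kernel
// actually granted (it may clamp or round the request), or -1 on error.
int enlargeBuffers(int fd, int bytes) {
    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes)) < 0) return -1;
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) < 0) return -1;
    int granted = 0;
    socklen_t len = sizeof(granted);
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &granted, &len) < 0) return -1;
    return granted;
}

// Gather send: the protocol header and the user payload stay in separate
// buffers and leave as one datagram, so no staging copy is needed.
ssize_t sendGathered(int fd, const sockaddr_in& to,
                     const void* hdr, size_t hdrLen,
                     const void* data, size_t dataLen) {
    assert(hdrLen + dataLen <= 65507);  // largest UDP payload over IPv4
    iovec iov[2];
    iov[0].iov_base = const_cast<void*>(hdr);
    iov[0].iov_len = hdrLen;
    iov[1].iov_base = const_cast<void*>(data);
    iov[1].iov_len = dataLen;
    msghdr msg;
    std::memset(&msg, 0, sizeof(msg));
    msg.msg_name = const_cast<sockaddr_in*>(&to);
    msg.msg_namelen = sizeof(to);
    msg.msg_iov = iov;
    msg.msg_iovlen = 2;
    return sendmsg(fd, &msg, 0);
}

// Hypothetical demo: send header + payload over loopback, return what arrived.
std::string loopbackDemo() {
    int rx = socket(AF_INET, SOCK_DGRAM, 0);
    int tx = socket(AF_INET, SOCK_DGRAM, 0);
    if (rx < 0 || tx < 0) return std::string();
    enlargeBuffers(rx, 1 << 20);
    enlargeBuffers(tx, 1 << 20);

    sockaddr_in addr;
    std::memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;  // let the kernel pick a free port
    socklen_t len = sizeof(addr);
    bind(rx, reinterpret_cast<sockaddr*>(&addr), sizeof(addr));
    getsockname(rx, reinterpret_cast<sockaddr*>(&addr), &len);

    timeval tv = {1, 0};  // do not block forever if the datagram is lost
    setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

    sendGathered(tx, addr, "HDR:", 4, "payload", 7);

    char buf[64];
    ssize_t n = recv(rx, buf, sizeof(buf), 0);
    close(rx);
    close(tx);
    return n > 0 ? std::string(buf, static_cast<size_t>(n)) : std::string();
}
```

Calling `loopbackDemo()` should return the header and payload joined into a single datagram. The granted buffer size reported by `enlargeBuffers` is worth logging in practice, since the operating system may silently cap the request.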

UDP Performance: Impact of Buffer Size

UDP Performance: Impact of Packet Size
[Figure panels: Throughput; CPU Util.]

UDP-based Protocol Performance
- Additional overhead
  – Additional memory copy
  – Additional packet processing
  – Additional context switches

Optimization Guidelines
- Avoid additional memory copy
- Reduce the number of packets
  – Control packets, esp. acknowledgements
- Reduce overall processing time
  – Simpler mechanism is better
- Avoid bursts in processing time
  – CPU may be too busy to process incoming packets

Optimization Guidelines
- Memory copy avoidance
  – UDP IO
  – API semantics
- Acknowledgements
  – Timer-based acknowledging
  – Light ACK
  – Loss processing
- Timing, rate control, and self-clocking
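Timer-based acknowledging, listed above, can be sketched as follows. This is an illustration under simplifying assumptions rather than UDT's actual logic: the class `TimerAck` and the helper `countAcks` are hypothetical names, packets are assumed to arrive in order without loss, sequence numbers do not wrap, and the interval value is arbitrary.

```cpp
#include <cassert>
#include <cstdint>

// Sends at most one cumulative ACK per timer interval instead of one per packet.
class TimerAck {
public:
    explicit TimerAck(int64_t intervalUs)
        : m_interval(intervalUs), m_lastAckTime(0),
          m_largestSeq(-1), m_lastAcked(-1) {
        assert(intervalUs > 0);
    }

    // Record an arriving data packet. No ACK decision happens here,
    // so the per-packet cost stays small and constant.
    void onData(int32_t seq) {
        if (seq > m_largestSeq) m_largestSeq = seq;
    }

    // Poll from the receive loop. Returns the sequence number to acknowledge,
    // or -1 if the interval has not elapsed or nothing new has arrived.
    int32_t poll(int64_t nowUs) {
        if (nowUs - m_lastAckTime < m_interval) return -1;
        m_lastAckTime = nowUs;
        if (m_largestSeq == m_lastAcked) return -1;
        m_lastAcked = m_largestSeq;
        return m_largestSeq;
    }

private:
    int64_t m_interval;     // ACK timer period in microseconds
    int64_t m_lastAckTime;  // time the timer last fired
    int32_t m_largestSeq;   // highest sequence number received so far
    int32_t m_lastAcked;    // highest sequence number already acknowledged
};

// Count ACKs generated for `packets` arrivals spaced `spacingUs` apart.
int countAcks(int packets, int64_t spacingUs, int64_t intervalUs) {
    TimerAck ack(intervalUs);
    int sent = 0;
    for (int i = 0; i < packets; ++i) {
        ack.onData(i);
        if (ack.poll(static_cast<int64_t>(i) * spacingUs) >= 0) ++sent;
    }
    return sent;
}
```

With 10,000 packets arriving 10 microseconds apart and a 10,000 microsecond timer, `countAcks(10000, 10, 10000)` yields 9 acknowledgements, where per-packet acknowledging would generate 10,000.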

Optimization Guidelines
- Disk IO
  – sendfile/recvfile
- Threading
  – Synchronization cost
- Code optimization
  – Sending/receiving loop
- Profiling

Part II. Composable UDT: A Framework for UDP-based Protocol Implementations

Composable UDT
- Based on the UDT (UDP-based Data Transfer library) implementation
- Integrates the optimization techniques described in this paper

Objectives
- Rapid development of UDP-based transport protocols and application-specific data transfer libraries
- Easy evaluation of new congestion control algorithms
- Non-objectives
  – Replace kernel-space protocol implementations
  – User-level TCP implementation

Current Status
- UDT/CCC: Configurable congestion control
- In future
  – Data reliability configuration
  – Message boundary support

Configurable Congestion Control
- Packet sending control
  – Rate-based, window-based, hybrid
- Redefinition of control event handlers
  – Loss, ACK, Time Out, etc.
- Access to internal protocol parameters
  – RTT, RTO, Loss Rate, etc.
- User-customized packet formats
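One way to read "rate-based, window-based, hybrid" is as a single send gate controlled by two parameters: an inter-packet period and a congestion window. The sketch below is an assumption about how such a gate could look, not the UDT implementation; `SendState` and `maySend` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sender state; field names are illustrative only.
struct SendState {
    double  pktSndPeriodUs;  // minimum spacing between packets; 0 disables pacing
    double  cwndPackets;     // maximum unacknowledged packets; a huge value disables the window
    int64_t lastSendUs;      // time the previous packet was sent
    int     inFlight;        // packets sent but not yet acknowledged
};

// A packet may leave only if both the window and the pacing period allow it.
// Period = 0 gives pure window-based control, a very large window gives pure
// rate-based control, and anything in between is the hybrid mode.
bool maySend(const SendState& s, int64_t nowUs) {
    const bool windowOk = s.inFlight < s.cwndPackets;
    const bool rateOk = static_cast<double>(nowUs - s.lastSendUs) >= s.pktSndPeriodUs;
    return windowOk && rateOk;
}
```

For example, a state with period 0 and a window of 2 blocks as soon as two packets are in flight, while a state with a 100 microsecond period and an effectively unlimited window blocks only until 100 microseconds have passed since the last send.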

Implementation
- C++ class inheritance
  – CCC: base class for control event handling
- Callbacks
- Performance monitoring
  – Internal protocol parameters
  – Performance statistics

Implementation

Example: Simplified TCP

    class CTCP: public CCC
    {
    public:
       virtual void init()
       {
          m_dPktSndPeriod = 0.0;
          m_dCWndSize = 2.0;
          setACKInterval(2);
       }

       virtual void onACK(const int&)
       {
          m_dCWndSize += 1.0/m_dCWndSize;
       }

       virtual void onLoss(const int*, const int&)
       {
          m_dCWndSize *= 0.5;
       }
    };
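To see how the simplified TCP example behaves without the UDT library, it can be driven against a small stand-in base class. The `CCC` class below is a hypothetical mock that provides only the members the slide's code touches; it is not the real UDT header, and its default values, the `cwnd()` accessor, and the `traceWindow` helper are assumptions made for this illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the UDT/CCC base class (NOT the real UDT header).
class CCC {
public:
    virtual ~CCC() {}
    virtual void init() {}
    virtual void onACK(const int&) {}
    virtual void onLoss(const int*, const int&) {}
    double cwnd() const { return m_dCWndSize; }  // accessor added for the demo

protected:
    void setACKInterval(int n) { m_iACKInterval = n; }
    double m_dPktSndPeriod = 1.0;  // packet sending period; 0 means no pacing
    double m_dCWndSize = 16.0;     // congestion window, in packets
    int m_iACKInterval = 0;        // acknowledge every n packets
};

// The simplified TCP controller from the slide.
class CTCP : public CCC {
public:
    virtual void init() {
        m_dPktSndPeriod = 0.0;  // window-based control only
        m_dCWndSize = 2.0;
        setACKInterval(2);
    }
    // Additive increase: roughly one packet per window of ACKs.
    virtual void onACK(const int&) { m_dCWndSize += 1.0 / m_dCWndSize; }
    // Multiplicative decrease: halve the window on loss.
    virtual void onLoss(const int*, const int&) { m_dCWndSize *= 0.5; }
};

// Record the window after init, after each of `acks` ACKs, and after one loss.
std::vector<double> traceWindow(int acks) {
    CTCP cc;
    cc.init();
    std::vector<double> w;
    w.push_back(cc.cwnd());
    for (int i = 0; i < acks; ++i) {
        cc.onACK(i);
        w.push_back(cc.cwnd());
    }
    const int lost[1] = {acks};
    cc.onLoss(lost, 1);
    w.push_back(cc.cwnd());
    return w;
}
```

Starting from a window of 2 packets, the first ACK raises it to 2.5, each further ACK adds the reciprocal of the current window, and a loss event halves whatever value has been reached. The example omits slow start, timeouts, and any window bounds.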

Configurable Congestion Control

Future Work
- Continue to improve the UDT/CCC library
- More experimental evaluation of the UDT/CCC library
  – Compare k-TCP and u-TCP in more network environments
  – Implement more TCP variants
- More pre-implemented congestion control algorithms

Conclusion
- UDP-based protocols are one of the solutions to bulk data transfer in high-BDP networks
- Some optimization principles and techniques are discussed in this paper
- We further propose a composable framework to make it much easier to implement UDP-based protocols

Thank you! For more information, please visit UDT Project: NCDM:

Backup Slides

UDP Performance: Experiment Setup

Name      CPU                      Memory  NIC     OS
onno      Dual Itanium2 1.5GHz     8 GB    10 GbE  Linux
sara77    Dual Xeon 2.4GHz         2 GB    1 GbE   Linux
ncdm171   Dual PowerPC G4 1GHz     2 GB    1 GbE   Mac OS X
win91     Dual Xeon 2.4GHz         2 GB    1 GbE   Windows XP Professional
ncdm87    Dual Opteron 2.4GHz      4 GB    1 GbE   Linux 2.6.8

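UDP Performance: CPU Utilization
[Table: columns Name, UDP Sending, UDP Receiving, TCP Sending, TCP Receiving; one row per host (onno, sara77, ncdm171, win91, ncdm87). Values not preserved in this transcript.]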

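UDP Performance: End System Delay
[Table: columns Name, UDP Delay (ms), TCP Delay (ms); one row per host (onno, sara77, ncdm171, win91, ncdm87). Values not preserved in this transcript.]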

UDT Profiling: Modules

UDT Profiling: Functionalities

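CPU Utilization: K-TCP vs U-TCP
[Table: columns Machines, Sender K-TCP, Sender U-TCP, Receiver K-TCP, Receiver U-TCP; one row per host (onno, sara77, ncdm171, win91, ncdm87). Values not preserved in this transcript.]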