Development of network-aware operating systems Tom Dunigan

UT-BATTELLE U.S. Department of Energy Oak Ridge National Laboratory

Net100: developing network-aware operating systems
New (9/01) DOE-funded (Office of Science) project ($1M/yr, 3 yrs)
Principal investigators
– Matt Mathis, PSC
– Brian Tierney, LBNL
– Tom Dunigan, ORNL
Objective:
– measure and understand end-to-end network and application performance
– tune network applications (grid and bulk transfer)
Components (leverage Web100)
– Network Tool Analysis Framework (NTAF)
  tool design and analysis
  active network probes and passive sensors
  network metrics database
– transport protocol analysis
– tuning daemon (WAD) to tune network flows based on network metrics

Net100: applied Web100
Web100
– Linux 2.4 kernel mods
– 100+ TCP variables per flow
Net100
– add Web100 to iperf/ttcp
– monitoring/tuning daemon
Java applet bandwidth/client tester
– fake WWW server provides html and applet
– applet connects to bwserver
  3 sockets (control, bwin, bwout)
  server reports Web100 variables to applet (window sizes, losses, RTT)
– Try it

Net100 concept (year 1)
Path characterization (NTAF)
– both active and passive measurement
– database of measurement data
Application tuning (tuning daemon, WAD)
– work around network problems
– daemon tunes application at startup
  static tuning information
  query database and calculate optimum TCP parameters
– dynamically tune application (Web100 feedback)
  recalculate parameters during flow
  split optimum among parallel flows
Transport protocol optimizations
– what to tune?
– is it fair? stable?

Net100: tuning
Work-around Daemon (WAD), version 0
– tune unknowing sender/receiver at startup
– config file with static tuning data {src, srcport, dst, dstport, window}
– LBL has python version
  expression-based tuning
To be done
– “applying” measurement info
– tune more than window size?
– communicating WADs
– dynamic tuning
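A minimal sketch of how the version-0 static config might be represented and looked up at connection start. The record layout follows the slide's {src, srcport, dst, dstport, window} tuple; the hostnames, the wildcard-port convention, and the helper name are illustrative, not the actual WAD config syntax (the 6 MB window matches the ORNL/NERSC figure elsewhere in the talk).

```python
# Hypothetical static tuning table, one rule per flow pattern.
# Fields mirror the slide's {src, srcport, dst, dstport, window} tuple;
# "*" is an assumed wildcard meaning "any port".
WAD_CONFIG = [
    {"src": "ornl.gov", "srcport": "*",
     "dst": "nersc.gov", "dstport": "*", "window": 6_000_000},
]

def lookup_window(src, srcport, dst, dstport, config=WAD_CONFIG, default=65536):
    """Return the tuned window (bytes) for a flow, or the OS default
    if no rule matches -- i.e., leave unlisted flows alone."""
    for rule in config:
        if (rule["src"] == src and rule["dst"] == dst
                and rule["srcport"] in ("*", srcport)
                and rule["dstport"] in ("*", dstport)):
            return rule["window"]
    return default
```

At startup the daemon would consult this table for each new flow and set the socket buffers of the otherwise "unknowing" sender/receiver.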

Example WAD Usage
Ability to manipulate Web100 variables based on other Web100 variables:
  RcvbufSet = ((PktsRetrans + PktsOut - (0.1 * PktsOut)) / (PktsOut + 1)) * Rcvbuf
  buffer_size = MinRwinRcvd
Ability to generate and log derived events:
  derived_event: BW = (DataBytesOut*8) / (SndLimTimeRwin + SndLimTimeCwnd + SndLimTimeSender)
– uses NetLogger to send events to archive or for real-time analysis
Ability to tune parallel streams (make them fairer?)
– buffer size per stream = optimal buffer size for 1 stream / number of parallel streams
WAD-to-WAD control channel
– receiver WAD sends tuning data to transmitter WAD
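The tuning expressions above can be checked numerically; this sketch transcribes them into Python. It assumes the Web100 SndLimTime* counters are in microseconds, so the derived BW comes out in bits/µs, i.e. Mb/s; the sample inputs in the test are made up.

```python
def rcvbuf_set(pkts_retrans, pkts_out, rcvbuf):
    """The slide's RcvbufSet rule: shrink the receive buffer in
    proportion to the observed retransmission rate (with a 10%
    allowance and +1 to avoid division by zero)."""
    return ((pkts_retrans + pkts_out - 0.1 * pkts_out) / (pkts_out + 1)) * rcvbuf

def derived_bw(data_bytes_out, t_rwin_us, t_cwnd_us, t_sender_us):
    """The derived BW event: bytes sent over the total time the sender
    was limited by rwin, cwnd, or itself.  With microsecond counters
    the result is bits/us == Mb/s."""
    return (data_bytes_out * 8) / (t_rwin_us + t_cwnd_us + t_sender_us)

def per_stream_buffer(optimal_one_stream, nstreams):
    """Parallel-stream tuning: split the single-stream optimum evenly."""
    return optimal_one_stream // nstreams
```

A daemon evaluating rules like these per flow is what "expression-based tuning" in the LBL python WAD refers to.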

TCP 101
TCP robust over 20 years
– reliable/stable/fair
– needs window = bandwidth * delay
  ORNL/NERSC (80 ms, OC12) needs 6 MB
Changing: bandwidths
– 9.6 Kb/s … 1.5 Mb/s … 45 … 100 … 1000 … ? Mb/s
Unchanging:
– speed of light (RTT)
– MTU (still 1500 bytes)
– TCP congestion avoidance
TCP is lossy by design!
– 2x overshoot at startup, sawtooth
– recovery after a loss can be very slow on today’s high delay/bandwidth links
– recovery proportional to MSS/RTT²
  linear recovery at 0.5 Mb/s!
[Plot: instantaneous vs. average bandwidth over a transfer, showing early startup losses]
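The window = bandwidth * delay rule can be worked out for the ORNL/NERSC path cited above (OC-12 at 622.08 Mb/s, 80 ms RTT), along with the slow linear recovery it implies: after a loss, standard congestion avoidance halves the window and regains only one MSS per RTT.

```python
OC12_BPS = 622.08e6   # OC-12 line rate, bits/s
RTT = 0.080           # ORNL <-> NERSC round-trip time, seconds
MSS = 1500            # payload per segment with a 1500-byte MTU, bytes

# Window needed to keep the pipe full: bandwidth * delay, in bytes.
bdp_bytes = OC12_BPS * RTT / 8          # ~6.2 MB, the slide's "6 MB"

# After a loss the window is halved; linear recovery regains one MSS
# of window per RTT, so getting the lost half back takes:
rtts_to_recover = (bdp_bytes / 2) / MSS
secs_to_recover = rtts_to_recover * RTT  # minutes, not seconds
```

Roughly 2000 RTTs (about 2.75 minutes) to recover from a single loss is why the slide calls recovery "very slow" on high delay/bandwidth links.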

Net100 tuning
Avoid losses
– use “optimal” buffer sizes determined from network measurements
– ECN-capable routers/hosts
– TCP Vegas
– reduce bursts
Faster recovery
– bigger MSS (jumbo frames)
– speculative recovery (D-SACK)
– modified congestion avoidance (AIMD)
– TCP Westwood
Autotune (WAD variables)
– buffer sizes
– dupthresh (reordering resilience)
– delayed ACK, Nagle
– aggressive AIMD
– virtual MSS
– initial window, ssthresh
– apply only to designated flows/paths
Non-TCP solutions (rate-based, ?)
– tests with TCP-over-UDP (atou), NERSC to ORNL
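A toy model of why "aggressive AIMD" is on the knob list above: making the additive increase larger and the multiplicative decrease milder shortens loss recovery on a long path. The tuned values here are illustrative, not actual Net100 settings.

```python
def rtts_to_refill(target_mss, add=1.0, mult_dec=0.5):
    """Count RTTs for AIMD to climb back to a window of `target_mss`
    segments after a single loss event.  Defaults model stock TCP:
    additive increase of 1 MSS per RTT, multiplicative decrease of 0.5."""
    cwnd = target_mss * (1.0 - mult_dec)   # window right after the decrease
    rtts = 0
    while cwnd < target_mss:
        cwnd += add                        # one additive-increase step per RTT
        rtts += 1
    return rtts

# A ~6 MB window with 1500-byte segments is ~4000 MSS:
standard = rtts_to_refill(4000)                        # stock AIMD (+1, x0.5)
tuned = rtts_to_refill(4000, add=8.0, mult_dec=0.125)  # hypothetical WAD setting
```

The fairness/stability question on the earlier "concept" slide is exactly about how far such knobs can be pushed before the flow stops sharing well.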

Net100 status
Completed
– network probes at ORNL, PSC, NCAR, LBL, NERSC
– preliminary schema for network data
– initial Web100 sensor daemon and tuning daemons
– integration of DRS and Web100 (proof of principle)
In progress
– TCP tuning extensions to Linux/Web100 kernel
– analysis of TCP tuning options
– deriving tuning info from network measurements
– tuning parallel flows and gridFTP
Future
– interactions with other network measurement sources
– multipath/parallel path selection/tuning

Net100 and ESnet
GigE jumbo-frame experiments
ECN experiments
– supported by Linux
– instrumented by Web100
Drop-tail vs. RED experiments
SNMP path data
– where are losses occurring?
– what kind of losses?
– SNMP mirrors (MRTG)