Building a connection-oriented internet

Presentation transcript:

1 Building a connection-oriented internet
Outline
–What are we doing? (cheetah)
–Research problems
–Engineering problems
–Why are we doing this? (vision/motivation)
–Hasn't this been attempted before?
Malathi Veeraraghavan, Univ. of Virginia
Talk at Georgia Tech, March 30, 2005

2 What are we doing?
Building a wide-area network called CHEETAH: Circuit-switched High-speed End-to-End Transport ArcHitecture
Writing software to run on Linux end hosts to use this network
Applications
–File transfers
–Remote visualization
–Web downloads

3 NSF-funded project
Participants:
–Malathi Veeraraghavan, UVA
–Nagi Rao, Bill Wing, Tony Mezzacappa, ORNL
–Ibrahim Habib, CUNY
–John Blondin, NCSU
$3.5M project for three years
Acknowledgment: NSF EIN grant ANI

4 What’s the cheetah network?
End hosts with two Ethernet NICs each
–Primary NIC connected to the enterprise LAN/Internet
–Secondary NIC connected to an MSPP
Network nodes are MSPPs
–An MSPP is an Ethernet-SONET gateway
Solution leverages the “king of LANs” (Ethernet) and the “king of MANs/WANs” (SONET)
Key aspect: dynamic bandwidth sharing

5 Multi-Service Provisioning Platform (MSPP)
[Figure: MSPP block diagram with 10/100M Ethernet and 1Gbps Ethernet cards, a crossconnect (VT1.5 or STS1), an OC12/OC48/OC192 SONET card and a control card; labels for PC and WAN access]
An OC1 rate SONET crossconnect with optional Ethernet interface cards
Ethernet cards implement GFP to map Ethernet frames into SONET frames (EoS)
Sycamore’s SN16000 implements GMPLS protocols

6 GMPLS protocols
Triumvirate to build large-scale networks in “plug-and-play” mode
–LMP to discover neighbors
–OSPF-TE for routing
–RSVP-TE for signaling
Should be able to create distributed networks with “minimal” admin support

7 Signaling to set up/release Ethernet-EoS-Ethernet circuits
Gateways are available that can dynamically crossconnect a Gigabit Ethernet port to an equivalent-rate time-division or wavelength-division multiplexed signal
[Figure: hosts attach through Gigabit Ethernet interfaces to circuit-based gateways; each gateway has a control card, a Gigabit Ethernet interface card, a time-division or wavelength-division multiplexing optical interface card, and a signaling engine for dynamic call setup/release]
Sequence: set up connection (make reservation), transfer file, release connection (release resources)

8 Holding times of circuits
100ms to a few seconds/minutes
–an upper limit should be imposed for increased sharing; its value depends on the bandwidth requested
–if 1Gbps, upper limit can be a few minutes
–if 100Mbps, upper limit can be a couple of hours
Apps
–File transfer; e.g. 100MB on 1Gbps: 800ms
–Remote viz. session: one or two hours at 100Mbps
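The holding-time figures above follow from simple emission-delay arithmetic. A minimal sketch, assuming decimal units (1MB = 10^6 bytes) and ignoring setup and propagation delay; the function name is illustrative and not part of the CHEETAH software:

```python
def transfer_time_s(file_bytes: float, rate_bps: float) -> float:
    """Emission (transmission) delay of a file on a dedicated circuit, in seconds."""
    return file_bytes * 8 / rate_bps

# Decimal units are an assumption that matches the figures quoted on the slides.
MB, GB, TB = 1e6, 1e9, 1e12

print(transfer_time_s(100 * MB, 1e9))        # 100MB on a 1Gbps circuit, seconds
print(transfer_time_s(1 * GB, 1e9))          # 1GB on a 1Gbps circuit, seconds
print(transfer_time_s(1 * TB, 1e9) / 3600)   # 1TB on a 1Gbps circuit, hours
```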

9 Cheetah software on end hosts
DNS query (to check if the far-end host is also on cheetah)
Routing decision on whether to use the TCP/IP path or a cheetah circuit
Signaling client to request a circuit
Fixed Rate Transport Protocol (FRTP), designed for circuits

10 Demo #1 (at SC2004): Web Application
Web server (MVSTU2), web client (MVSUT3)
At the web server side
–Hyperlink to file is a CGI script (download.cgi); filename embedded in hyperlink
–download.cgi is started automatically at the server when the user clicks the hyperlink, which triggers the CHEETAH FT sender
–CHEETAH FT sender initiates CHEETAH circuit setup by calling the RSVP-TE client
–CHEETAH FT sender starts data transfer using dual paths: FRTP/circuit and TCP/IP
At the web client side
–An RSVP-TE client runs as a daemon to accept the circuit setup request
–A CHEETAH FT receiver runs as a daemon to receive the user data
[Figure: the web browser (e.g. Mozilla) sends the URL to the web server (e.g. Apache) and receives the response; download.cgi starts the CHEETAH FT sender; sender and receiver each sit on an RSVP-TE client, FRTP and TCP; RSVP-TE messages and the data transfer flow between the two hosts]

11 File transfers on circuits
Seems like a good app. for high bandwidth
–Can absorb “any” bandwidth you can allocate to the transfer (subject to PC limitations)
–No intrinsic burstiness: just move bits from one disk to another

12 Better or worse than file transfers on TCP/IP?
General thinking:
–Circuits good for “large” files
–eScience apps create large files
–Emission delay of 2.2 hours for a 1TB file on a 1Gbps circuit
Not a scalable network solution if used exclusively for such very large files

13 Cheetah solution: leverage presence of Internet path
Use second NICs at hosts for circuit connectivity, leaving primary NIC for Internet access
[Figure: End host I and End host II are connected both through the connectionless Internet and through the circuit-switched network: two paths available]
Attempt circuit setup
–If rejected, fall back to using TCP/IP
Should we attempt a circuit setup for ALL file transfers?
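The attempt-then-fall-back behaviour can be sketched as below. This is only an illustration of the control flow: the four callables are hypothetical stand-ins for the signaling client, the circuit transport and the TCP/IP path, and are not the actual CHEETAH interfaces.

```python
from typing import Callable


def transfer_file(
    request_circuit: Callable[[], bool],
    send_over_circuit: Callable[[], None],
    send_over_tcp: Callable[[], None],
    release_circuit: Callable[[], None],
) -> str:
    """Try the circuit path first; if setup is rejected, use the Internet path.

    All four callables are hypothetical placeholders supplied by the caller.
    Returns the name of the path that carried the transfer.
    """
    if request_circuit():          # signaling: was the setup accepted?
        try:
            send_over_circuit()    # e.g. a fixed-rate transport on the secondary NIC
        finally:
            release_circuit()      # always free the reserved resources
        return "circuit"
    send_over_tcp()                # primary NIC, TCP/IP path
    return "tcp"
```

The `finally` clause is a design choice for the sketch: a reserved circuit that is never released would reduce sharing for everyone else.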

14 Expected delay on TCP/IP path
Main factors:
–Round-trip time (RTT), mainly T_prop
–Prob. of packet loss on IP path, p
–Bottleneck link rate
Throughput B(p): approximately the reciprocal of expected delay
Other terms:
–W_max: receiver window size
–b = 2 (ACK every other segment)
–T_0: initial time-out
J. Padhye, V. Firoiu, D. Towsley, and J. Kurose, “Modeling TCP Throughput: A Simple Model and its Empirical Validation,” Proc. of ACM SIGCOMM 98, Aug Sep. 4, Vancouver Canada, pp
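The throughput expression itself did not survive in the transcript. The sketch below uses the widely quoted approximate form of the model from the cited paper, B(p) ≈ min(W_max/RTT, 1/(RTT·sqrt(2bp/3) + T_0·min(1, 3·sqrt(3bp/8))·p·(1 + 32p²))), in segments per second. The function names, the handling of p = 0 and the omission of the bottleneck link rate are assumptions of this sketch, not something taken from the slides.

```python
import math


def pftk_throughput(p: float, rtt: float, t0: float, w_max: float, b: int = 2) -> float:
    """Approximate steady-state TCP throughput in segments per second.

    p: packet loss probability, rtt: round-trip time (s),
    t0: initial time-out (s), w_max: receiver window (segments),
    b: segments acknowledged per ACK.
    """
    if p <= 0:
        return w_max / rtt                      # no loss: window-limited
    loss_term = rtt * math.sqrt(2 * b * p / 3)
    timeout_term = t0 * min(1.0, 3 * math.sqrt(3 * b * p / 8)) * p * (1 + 32 * p ** 2)
    return min(w_max / rtt, 1.0 / (loss_term + timeout_term))


def expected_delay_s(file_bytes: float, mss_bytes: float, **tcp_params: float) -> float:
    """Delay taken as file size in segments divided by B(p), i.e. delay ~ 1/B(p)."""
    return (file_bytes / mss_bytes) / pftk_throughput(**tcp_params)
```

For example, `expected_delay_s(1e9, 1460, p=0.01, rtt=0.1, t0=1.0, w_max=1000)` gives a rough delay for a 1GB file under those illustrative parameters.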

15 Mean TCP delays
[Table: input parameters (including loss and round-trip prop. delay) plus the time to transfer a 1GB file and a 1TB file]
Observations
–Impact of propagation delay
–Low impact of bottleneck link rate in wide-area networks
–Impact of packet loss rate

16 Delays incurred in using an end-to-end circuit
Circuit setup delay + file transfer delay
–m_sig: signaling message length; r_s: signaling link rate
–Loads ρ_sig and ρ_sp: signaling link and signaling processor
–T_sp: signaling protocol processing delay
–k: number of switches; T_prop: round-trip prop. delay
–f: file size; r_c: circuit rate
Acknowledgment: NSF ANI grant for hardware signaling

17 Should the application attempt a circuit setup or not?
Mean delay if a circuit setup is attempted
–P_b: call blocking probability in the circuit-switched network
–If circuit setup fails, fall back to Internet path
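The slide's own expression is missing from the transcript. One plausible form, stated here as an assumption, weights the two outcomes by the blocking probability: with probability 1 − P_b the transfer pays the setup delay plus the circuit transfer delay, and with probability P_b it pays the delay of the failed setup attempt plus the TCP/IP delay. All parameter names are illustrative.

```python
def mean_delay_with_circuit_attempt(
    p_block: float,
    t_setup: float,
    t_circuit_transfer: float,
    t_fail: float,
    t_tcp: float,
) -> float:
    """Mean delay (s) when a circuit setup is attempted first.

    Assumed model: success costs setup + circuit transfer,
    blocking costs the failed attempt + the TCP/IP transfer.
    """
    success = t_setup + t_circuit_transfer
    failure = t_fail + t_tcp
    return (1 - p_block) * success + p_block * failure


def should_attempt_circuit(
    p_block: float,
    t_setup: float,
    t_circuit_transfer: float,
    t_fail: float,
    t_tcp: float,
) -> bool:
    """Routing decision: attempt the circuit only if it lowers the mean delay."""
    attempt = mean_delay_with_circuit_attempt(
        p_block, t_setup, t_circuit_transfer, t_fail, t_tcp
    )
    return attempt < t_tcp
```

Under this model, small files on short paths tend to stay on TCP/IP because the setup delay dominates, which is consistent with the idea of a crossover file size.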

18 Routing decision

19 Numerical results
[Plots: link rate = 1Gbps; panels for T_prop = 0.1ms and T_prop = 50ms]

20 Crossover file sizes
[Plots: r_c = 100Mbps with T_prop = 0.1ms; r_c = 1Gbps with T_prop = 0.1ms]

21 Utilization considerations
Example: in the 50ms scenario, if we transfer a 100KB file over a 100Mbps path, transfer time is only 8ms. Circuit utilization is 8/(50+8) = 13.8%
Two opposing factors if the crossover file size (beyond which circuit setup is attempted) is increased:
–per-circuit utilization increases
–traffic load decreases (Pareto distribution of file sizes), which means aggregate utilization decreases
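The per-circuit utilization example can be checked with a few lines. A minimal sketch, assuming the 50ms is the time the circuit is held without carrying data and that 100KB means 10^5 bytes; the function name is illustrative.

```python
def circuit_utilization(transfer_s: float, overhead_s: float) -> float:
    """Fraction of the circuit holding time spent actually moving data."""
    return transfer_s / (overhead_s + transfer_s)


transfer_s = 100e3 * 8 / 100e6          # 100KB on a 100Mbps circuit
print(transfer_s)                        # seconds
print(round(100 * circuit_utilization(transfer_s, 0.050), 1))  # percent
```

Raising the crossover file size increases `transfer_s` relative to the fixed overhead, which is why per-circuit utilization goes up.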

22 Aggregate utilization u_a
–ρ: traffic load
–m: number of circuits
–P_b: call blocking probability
[Table: ρ, m and u_a for a 1% call blocking probability; surviving u_a entries: 58.2%, 84.6%]
Assuming file size follows a Pareto distribution
–Define fractional offered load ρ′ (fraction of ρ)
–Crossover file size 40KB: 81%; 330KB: 71%; 80MB: 51%
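The slide does not show how u_a is computed. A common way to relate ρ, m and P_b is the Erlang B formula, with aggregate utilization taken as carried load per circuit, u_a = ρ(1 − P_b)/m. Both the use of Erlang B and this definition of u_a are assumptions of the sketch below, so its numbers need not match the table entries above.

```python
def erlang_b(m: int, rho: float) -> float:
    """Blocking probability for offered load rho (Erlangs) on m circuits."""
    b = 1.0
    for n in range(1, m + 1):
        b = rho * b / (n + rho * b)      # standard stable recursion
    return b


def max_load_for_blocking(m: int, p_b: float, tol: float = 1e-9) -> float:
    """Largest offered load whose Erlang B blocking stays below p_b (bisection)."""
    lo, hi = 0.0, float(m)
    while erlang_b(m, hi) < p_b:         # widen the bracket if needed
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if erlang_b(m, mid) < p_b:
            lo = mid
        else:
            hi = mid
    return lo


def aggregate_utilization(m: int, p_b: float) -> float:
    """Carried load per circuit at the load that yields blocking p_b (m >= 1)."""
    rho = max_load_for_blocking(m, p_b)
    return rho * (1 - p_b) / m
```

The sketch shows the usual trunking effect: at a fixed blocking probability, utilization rises with the number of circuits m.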

23 Plot of utilization u with r_c = 100Mbps, k = 20
[Plots: P_b = 0.3 and P_b = 0.01]

24 Cheetah network deployment
[Figure: deployment map. Circuit-based gateways (control card, OC192 cards, GbE/10GbE card) at ORNL, Atlanta (G. Tech, SOX/SLR) and NC (MCNC/NLR); OC192 (10 Gbps) links over NLR, GaTech and ORNL WDM; GbE/10GbE Ethernet switches lead to the Cray at ORNL and to a cluster computer at NCSU; a link leads to DC (Dragon)]

25 Connecting Cheetah to Dragon and Ultrascience networks
[Figure: Dragon, Cheetah and the DOE Ultrascience network (ORNL)]
Acknowledgment: DOE grant

26 All this is fun, but
What are the research problems?
–Bandwidth sharing modes
  Low load performance
  Scheduled vs. immediate-request
  Fairness
–Mismatch between multitasking end hosts and TDM circuits

27 Fixing the bandwidth for the transfer could be a bad thing: low load problem
Packet switch (capacity C, N transfers): each transfer gets C/N capacity; the lone remaining transfer enjoys full capacity C
Circuit switch (capacity C, N transfers): each transfer is allocated C/N capacity; the lone remaining transfer continues with capacity allocation C/N
Varying bandwidth list scheduling algorithm
–uses knowledge of file size to make varying bandwidth allocations for transfer
–catch: requires circuit switches to be reprogrammed multiple times within the lifetime of a transfer (circuit)
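The low-load problem can be made concrete by comparing finish times under the two sharing modes. A minimal sketch, assuming all N transfers start together, the packet switch shares capacity equally among active transfers, and the circuit switch keeps every transfer at C/N for its whole lifetime. This illustrates the problem only; it is not the varying-bandwidth list scheduling algorithm.

```python
from typing import List


def finish_times_fixed(sizes: List[float], capacity: float) -> List[float]:
    """Circuit switch: each transfer keeps its C/N allocation until it ends."""
    n = len(sizes)
    return [s * n / capacity for s in sizes]


def finish_times_shared(sizes: List[float], capacity: float) -> List[float]:
    """Packet switch: active transfers split C equally, so survivors speed up."""
    order = sorted(range(len(sizes)), key=lambda i: sizes[i])
    finish = [0.0] * len(sizes)
    now = 0.0          # current time
    done = 0.0         # amount already sent by every still-active transfer
    remaining = len(sizes)
    for i in order:
        # Each active transfer runs at capacity/remaining until transfer i ends.
        now += (sizes[i] - done) * remaining / capacity
        done = sizes[i]
        finish[i] = now
        remaining -= 1
    return finish


print(finish_times_fixed([1.0, 2.0, 3.0], 1.0))
print(finish_times_shared([1.0, 2.0, 3.0], 1.0))
```

The smallest transfer finishes at the same time in both modes; the difference shows up for the transfers that outlive it.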

28 Scheduled vs. immediate-request calls
Session-type requests:
–long holding times (2 hours)
–specific rate
–remote visualizations
–scientists participate in sessions
–best served with an advance reservation
File transfer requests:
–file sizes provided, not holding times
–max rate specified, but any rate can be allocated
–scientists not involved; just computers
Small files (e.g. 1 GB on 1 Gbps takes 8 sec) should be handled in immediate-request mode
Large files (e.g. 1 TB on 1 Gbps takes 2.2 hours) should be handled in scheduled mode
–should we allocate 10Gbps and finish in 800 sec?
–immediate-request? or scheduled?
–depends on m, the number of 10Gbps circuits

29 Fairness
Call admission algorithms
–Use Markov Decision Process (MDP) tools to balance fairness and overall throughput
–Long-path and short-path calls
–Large-file (high-BW) and small-file (low-BW) calls
–Multi-level answer rather than binary accept/reject
Both with fixed bandwidth and varying bandwidth

30 Multi-level problem
Perhaps a new problem?
–Real-time (interactive) audio-video applications generate data at a certain rate (constant or variable)
  implication: application requests the required bandwidth from the network, and the answer is binary (accept or reject); multiple classes
–File transfers: “any” bandwidth that the network can provide could be acceptable
  implication: application requests a MAX bandwidth, but the answer can be multi-level
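The contrast between a binary and a multi-level answer can be sketched as follows. The discrete set of rate levels and both function names are hypothetical; the slides do not specify how levels are chosen, and this is not the MDP-based admission algorithm.

```python
from typing import Optional, Sequence


def admit_binary(required_rate: float, available: float) -> bool:
    """Real-time style request: the exact rate is granted or the call is rejected."""
    return required_rate <= available


def admit_multilevel(
    max_rate: float, available: float, levels: Sequence[float]
) -> Optional[float]:
    """File-transfer style request: grant the highest level not exceeding
    either the requested maximum or what is available; None means reject."""
    ceiling = min(max_rate, available)
    usable = [rate for rate in levels if rate <= ceiling]
    return max(usable) if usable else None


levels_mbps = [100, 300, 600, 1000]      # illustrative rate levels
print(admit_binary(1000, 400))
print(admit_multilevel(1000, 400, levels_mbps))
```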

31 Mismatch between multitasking end hosts and TDM circuits
Variability in sender:
–other processes (e.g. matlab) + disk access (disk head location)
Variability in receiver:
–if buffer not emptied out, data loss occurs
[Figure: two end hosts connected by the circuit-switched network; on each host, file transfer and Matlab processes in user space sit above the filesystem and network protocols in the kernel, which drive the network card]

32 Effects of mismatch in nature of circuits and nature of hosts
Choose a high circuit rate and the receive buffer can overflow, causing losses
–impacts delay + utilization (retransmissions)
Choose a low circuit rate and delay can be high
If the sending rate is not matched exactly with the circuit rate
–circuit lies idle; utilization impacted

33 Fixed Rate Transport Protocol (FRTP)
Set up a circuit at a carefully chosen rate
Send data at that rate
–hard to meter out data at a fixed rate from a multitasking sender when that rate is high (Linux system time granularity: 10ms)
No changes of sending rate
–i.e., no flow control or congestion control
Packet losses recovered through retransmissions
–no timers needed, just negative ACKs because of in-sequence delivery
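A quick calculation shows why a 10ms timer granularity makes fixed-rate sending hard at high rates. A minimal sketch, assuming 1500-byte packets (an assumption, not stated on the slide); the function names are illustrative.

```python
def inter_packet_gap_s(rate_bps: float, packet_bytes: int) -> float:
    """Spacing between packet starts needed to hold an exact sending rate."""
    return packet_bytes * 8 / rate_bps


def packets_per_tick(rate_bps: float, packet_bytes: int, tick_s: float = 0.010) -> float:
    """Packets that must leave per timer tick if the sender can only
    wake up once every tick_s seconds."""
    return rate_bps * tick_s / (packet_bytes * 8)


print(inter_packet_gap_s(1e9, 1500))     # required gap at 1Gbps, seconds
print(packets_per_tick(1e9, 1500))       # burst size per 10ms tick at 1Gbps
print(packets_per_tick(100e6, 1500))     # burst size per 10ms tick at 100Mbps
```

At 1Gbps the required gap is in the microsecond range, far below the timer granularity, so a sender that wakes on timer ticks emits bursts of hundreds of packets instead of an even stream.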

34 Experimental results
[Plot: relative transfer delay and circuit utilization (%) versus circuit rate (Mbps)]

35 Current work
Experimenting with RT schedulers to schedule the file transfer task in a set rhythm
Experimenting with file systems to characterize file write time; the data collected will then be used to determine circuit rate and receive buffer size

36 Engineering problems
Need to use VLAN based switches between end hosts and MSPPs
–Costly otherwise
VLSR: Virtual Label Switch Router
–External GMPLS controller for Ethernet (VLAN) switches
Understood need for making it a connection-oriented internetwork
Acknowledgment: DOE grant

37 Connection-oriented networks
Circuit switched
–Time Division Multiplexed (SONET)
  Equipment vendors: Sycamore, Ciena
  Networks: Cheetah, UltraScience Net, CA*net 4
–Wavelength Division Multiplexed (WDM)
  Equipment vendors: Movaz, Calient, LambdaOptical
  Networks: Dragon, OMNInet, Internet2 HOPI

38 Connection-oriented networks
Packet switched
–Multiprotocol Label Switching (MPLS)
  Equipment vendors: Cisco, Juniper
  Networks: Internet2, ESnet
–Virtual Local Area Network (VLAN)
  Equipment vendors: Dell, Intel, Foundry, Extreme
  Networks: Enterprise local area networks
Just need to “enable” connection-oriented networking through already deployed boxes

39 Bandwidth sharing problem in heterogeneous network
Problem:
–Tradeoff of fairness and utilization becomes more difficult when these crossconnect granularities are considered
[Figure: example network with nodes a to f; link labels 2, 30Mbps; 5, 100Mbps; 1, 150Mbps; 1, 10Mbps; 1, 50Mbps; 1, 500Mbps; 2, 50Mbps; 1, 50Mbps; a request for a 30Mbps connection; switch granularities of 1Mbps, 51Mbps and 10Gbps]
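The effect of crossconnect granularity on a single request can be sketched by rounding the request up to a whole number of units. The rounding rule is an assumption about how a switch would serve the request; the three granularities and the 30Mbps request are the values shown on the slide.

```python
import math


def allocated_rate_mbps(request_mbps: float, granularity_mbps: float) -> float:
    """Smallest multiple of the switch granularity that covers the request."""
    return math.ceil(request_mbps / granularity_mbps) * granularity_mbps


def allocation_efficiency(request_mbps: float, granularity_mbps: float) -> float:
    """Fraction of the allocated bandwidth that the request actually needs."""
    return request_mbps / allocated_rate_mbps(request_mbps, granularity_mbps)


for granularity in (1, 51, 10_000):      # 1Mbps, 51Mbps, 10Gbps
    print(granularity,
          allocated_rate_mbps(30, granularity),
          round(allocation_efficiency(30, granularity), 3))
```

A path that crosses switches of different granularity is constrained by the coarsest one, which is part of what makes the fairness and utilization tradeoff harder.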

40 Interconnecting these networks
Tricky business! Involves many levels of interworking protocols
–User (data) plane
–Signaling protocols (for connection setup/release)
–Routing protocols (for reachability, topology, loading data dissemination)

41 But
We need to solve this internetworking problem for a true connection-oriented service to flourish!
Acknowledgment: DOE grant

42 Why do this?
Two simple views
–Purpose of a communication link, and by extension, a communication network
–Analogy with transportation modes

43 Why do this?
View 1: Purpose of a communication link, and by extension, a communication network
–To provide connectivity between a data sending entity and a data receiving entity
–To quantify connectivity, bandwidth is a primary measure
–Shouldn’t we have a network that provides users specific bandwidth levels as requested, and when requested, on a dime?

44 Why do this?
View 2: Analogy with people/goods transportation modes
–unreserved travel: roadways
–reserved travel: airline seat
So why not at least two such networks for moving data?

45 Would anyone use it?
Don’t know
Depends on the business case
–What’s the cost of building this network?
–What’s the market?
–Can the service price be set to turn a profit, i.e., to let companies survive?

46 Hasn't this been attempted before?
ATM-to-the-desktop
–Goal: to enable an end-to-end connection-oriented service
–It was a homogeneous network: all ATM switches
–Recognition of need to interwork with IP: LANE, MPOA
Soon morphed into ATM networks offering connection-oriented service to interconnect routers, NOT end hosts
–Application focus: mostly multimedia, delay-sensitive but “low” bandwidth; could be supported with simple priority queueing added to connectionless packet switches
First difference: aiming for a heterogeneous internet using already deployed switches and gateways

47 Second difference
Ipsilon’s IP switch
–Flow classification at “airport” to trigger connection setup
–Questions of scalability: notion of having to hold “state” information for millions of flows
No, state is held only for the flows that requested bandwidth
–Call to make a reservation (if only for part of the distance: airport-to-airport)
[Figure: a CL network leading to an “airport” at the edge of a CO network]

48 Summary
Rich new set of research problems
Experimental challenges aplenty!
Real opportunity to deploy a CO internetwork
Web site: