1 ESnet - Connecting the USA DOE Labs to the World of Science
Eli Dart, Network Engineer, Network Engineering Group
Chinese American Network Symposium, Indianapolis, Indiana, October 20, 2008
Energy Sciences Network, Lawrence Berkeley National Laboratory
Networking for the Future of Science

2 Overview
– ESnet and the DOE Office of Science: the need for high-performance networks
– ESnet4 architecture
– The network as a tool for science – performance
– Enabling Chinese-American science collaborations

3 DOE Office of Science and ESnet – the ESnet Mission
ESnet’s primary mission is to enable the large-scale science that is the mission of the Office of Science (SC) and that depends on:
– Sharing of massive amounts of data
– Supporting thousands of collaborators world-wide
– Distributed data processing
– Distributed data management
– Distributed simulation, visualization, and computational steering
– Collaboration with the US and international Research and Education community
ESnet provides network and collaboration services to Office of Science laboratories and many other DOE programs in order to accomplish its mission
ESnet is the sole provider of high-speed connectivity to most DOE national laboratories

4 The “New Era” of Scientific Data
Modern science is completely dependent on high-speed networking
– As the instruments of science get larger and more sophisticated, the cost goes up to the point where only a very few are built (e.g. one LHC, one ITER, one James Webb Space Telescope, etc.)
The volume of data generated by these instruments is going up exponentially
– These instruments are mostly based on solid-state sensors and so follow the same Moore’s Law as computer CPUs, though the technology refresh cycle for instruments is measured in years rather than the 1.5 years typical of CPUs
– The data volume is at the point where modern computing and storage technology are at their limits trying to manage the data
It takes world-wide collaborations of large numbers of scientists to conduct the science and analyze the data from a single instrument, and so the data must be distributed all over the world
– The volume of data generated by such instruments has reached the level of many petabytes/year – the point where dedicated 10–100 Gb/s national and international networks are required to distribute the data

5 Networks for The “New Era” of Scientific Data
Designing and building networks and providing suitable network services to support science data movement has pushed R&E networks to the forefront of network technology:
There are currently no commercial networks that handle the size of the individual data flows generated by modern science
– The aggregate of small flows in commercial networks is, of course, much larger – but not by as much as one might think – the Google networks only transport about 1000x the amount of data that ESnet transports
What do the modern systems of science look like?
– They are highly distributed and bandwidth-intensive

6 The LHC will be the largest scientific experiment and will generate more data than the scientific community has ever tried to manage.
The data management model involves a world-wide collection of data centers that store, manage, and analyze the data, and that are integrated through network connections with typical speeds in the 10+ Gbps range.
These are closely coordinated and interdependent distributed systems that must have predictable intercommunication for effective functioning.
CMS is one of two major experiments – each generates comparable amounts of data.

7 The “new era” of science data will likely tax network technology
Individual Labs now fill 10G links
– Fermilab (an LHC Tier 1 Data Center) has 5 X 10Gb/s links to ESnet hubs in Chicago and can easily fill one or more of them for sustained periods of time
The “casual” increases in overall network capacity are less likely to easily meet future needs
[Figure: “Experiment Generated Data, Bytes” – historical and estimated data volumes, with the 1 Petabyte and 1 Exabyte levels marked. Data courtesy of Harvey Newman, Caltech, and Richard Mount, SLAC.]

8 Planning the Future Network
1) Data characteristics of instruments and facilities
– What data will be generated by instruments coming on-line over the next 5-10 years (including supercomputers)?
2) Examining the future process of science
– How and where will the new data be analyzed and used – that is, how will the process of doing science change over 5-10 years?
3) Observing traffic patterns
– What do the trends in network patterns predict for future network needs?

9 Motivation for Overall Capacity: ESnet Traffic has Increased by 10X Every 47 Months, on Average, Since 1990
[Figure: log plot of ESnet monthly accepted traffic, in terabytes/month, January 1990 – January 2008. Labeled data points mark successive order-of-magnitude increases – from GBy/mo through TBy/mo to PBy/mo – at intervals of 38 to 57 months.]
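A worked example of what that trend implies for capacity planning – a minimal Python sketch that uses only the 47-month figure from the slide above; the annual factor and doubling time are derived from it:

    import math

    # "Traffic grows 10x every 47 months" expressed as an annual rate and a doubling time.
    growth_period_months = 47
    annual_factor = 10 ** (12 / growth_period_months)        # ~1.8x per year
    doubling_months = growth_period_months * math.log10(2)   # ~14 months to double

    print(f"Implied annual growth factor: {annual_factor:.2f}x")
    print(f"Implied doubling time: {doubling_months:.1f} months")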

10 The International Collaborators of DOE’s Office of Science Drive ESnet Design for International Connectivity
Currently most of ESnet’s traffic (>85%) goes to and comes from outside of ESnet. This reflects the highly collaborative nature of large-scale science (which is one of the main focuses of DOE’s Office of Science).
[Map legend: markers show the R&E source or destination of each of ESnet’s top 100 sites (all R&E); the DOE Lab destination or source of each flow is not shown.]

11 A small number of large data flows now dominate the network traffic – this motivates virtual circuits as a key network service
[Figure: ESnet total traffic, TBy/mo, January 1990 – April 2008]
[Figure: outbound traffic of FNAL, an LHC Tier 1 site (courtesy Phil DeMar, Fermilab)]

12 Requirements from Scientific Instruments and Facilities
Bandwidth
– Adequate network capacity to ensure timely movement of data produced by the facilities
Connectivity
– Geographic reach sufficient to connect users and analysis systems to SC facilities
Services
– Guaranteed bandwidth, traffic isolation, end-to-end monitoring
– Network service delivery architecture: Service Oriented Architecture / Grid / “Systems of Systems”

13 ESnet Architecture - ESnet4
ESnet4 was built to address specific Office of Science program requirements. The result is a much more complex and much higher capacity network.
ESnet to 2005:
– A routed IP network with sites singly attached to a national core ring
– Very little peering redundancy
ESnet4 in 2008:
– The new Science Data Network (shown in blue on the map) is a switched network providing guaranteed bandwidth for large data movement
– All large science sites are dually connected on metro area rings or dually connected directly to the core ring for reliability
– Rich topology increases the reliability of the network

14 ESnet4 – IP and SDN
ESnet4 is one network with two “sides”
– The IP network is a high-capacity (10G) best-effort routed infrastructure
  - Rich commodity peering infrastructure ensures global connectivity
  - Diverse R&E peering infrastructure provides full global high-bandwidth connectivity for scientific collaboration
  - High performance – 10G of bandwidth is adequate for many scientific collaborations
  - Services such as native IPv6 and multicast
– The Science Data Network (SDN) is a virtual circuit infrastructure with bandwidth guarantees and traffic engineering capabilities
  - Highly scalable – just add more physical circuits as demand increases
  - Interoperable – compatible with virtual circuit infrastructures deployed by Internet2, CANARIE, GEANT, and others
  - Guaranteed bandwidth
  - The interdomain demarcation is a VLAN tag – virtual circuits can be delivered to sites or other networks even when end-to-end reservations are not possible

15 ESnet4 Backbone Projected for December 2008
[Map: the ESnet IP core and Science Data Network core, SDN core links on NLR (existing), lab-supplied links, MAN links, and international IP connections (including LHC/CERN and USLHC). The legend distinguishes layer 1 optical nodes (eventual ESnet Points of Presence), ESnet IP switch-only hubs, ESnet IP switch/router hubs, ESnet SDN switch hubs, layer 1 optical nodes not currently in ESnet plans, lab sites, and lab sites with independent dual connections. Hubs include Seattle, Portland, Boise, Sunnyvale, Las Vegas, LA, San Diego, SLC, Denver, Albuquerque, El Paso, Tulsa, KC, Houston, Baton Rouge, Chicago (StarLight), Nashville, Atlanta, Jacksonville, Raleigh, Cleveland, Pittsburgh, Philadelphia, NYC (MAN LAN, AofA), Boston, and Washington DC, with a 20G label near the StarLight/MAN LAN exchange points; labeled sites include SDSC, GA, LLNL, LANL, ORNL, FNAL, BNL, and PNNL.]

16 ESnet4 As Planned for 2010
[Map: the same hubs, sites, legend, and link types as the December 2008 view, with core link capacities increased – segments labeled 30G, 40G, and 50G across the IP and SDN cores, and a 50G label near the StarLight/MAN LAN exchange points.]

17 Traffic Engineering on SDN – OSCARS
ESnet On-demand Secure Circuits and Advance Reservation System (OSCARS)
Provides edge-to-edge layer 2 or layer 3 virtual circuits across ESnet
– Guaranteed bandwidth
– Advance reservation
Interoperates with many other virtual circuit infrastructures to provide end-to-end guaranteed bandwidth service for geographically dispersed scientific collaborations (see next slide)
– Interoperability is critical, since science traffic flows cross many administrative domains in the general case
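To make the service model concrete, here is a minimal sketch of the information an advance-reservation virtual circuit carries – endpoints, a guaranteed rate, a VLAN handoff, and a time window. This is an illustrative data structure only, not the actual OSCARS API; all field and port names are hypothetical.

    from dataclasses import dataclass
    from datetime import datetime, timedelta

    @dataclass
    class CircuitRequest:
        """Illustrative fields for an advance-reservation virtual circuit (not the OSCARS API)."""
        src_endpoint: str      # ingress edge port (hypothetical name)
        dst_endpoint: str      # egress edge port at the site or peer network
        vlan_tag: int          # layer 2 handoff - the interdomain demarc is a VLAN tag
        bandwidth_mbps: int    # guaranteed bandwidth for the reservation
        start: datetime        # advance-reservation window
        end: datetime

    # Example: a 5 Gb/s circuit reserved for a 12-hour transfer window
    start = datetime(2008, 10, 21, 0, 0)
    request = CircuitRequest(
        src_endpoint="chicago-sdn-hub:xe-1/0/0",   # hypothetical
        dst_endpoint="fnal-site-edge:xe-0/1/0",    # hypothetical
        vlan_tag=3012,
        bandwidth_mbps=5000,
        start=start,
        end=start + timedelta(hours=12),
    )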

18 OSCARS Interdomain Collaborative Efforts
– Terapaths: inter-domain interoperability for layer 3 virtual circuits demonstrated (3Q06); inter-domain interoperability for layer 2 virtual circuits demonstrated at SC07 (4Q07)
– LambdaStation: inter-domain interoperability for layer 2 virtual circuits demonstrated at SC07 (4Q07)
– Internet2 DCN/DRAGON: inter-domain exchange of control messages demonstrated (1Q07); integration of OSCARS and DRAGON has been successful (1Q07)
– GEANT2 AutoBAHN: inter-domain reservation demonstrated at SC07 (4Q07)
– DICE: first draft of topology exchange schema formalized (in collaboration with NMWG) (2Q07); interoperability test demonstrated 3Q07; initial implementation of reservation and signaling messages demonstrated at SC07 (4Q07)
– Nortel: topology exchange demonstrated successfully 3Q07; inter-domain interoperability for layer 2 virtual circuits demonstrated at SC07 (4Q07)
– UVA: demonstrated token-based authorization concept with OSCARS at SC07 (4Q07)
– OGF NML-WG: actively working to combine work from NMWG and NDL; documents and UML diagram for base concepts have been drafted (2Q08)
– GLIF GNI-API WG: in process of designing a common API and reference middleware implementation

19 The Network As A Tool For Science
Science is becoming much more data intensive
– Data movement is one of the great challenges facing many scientific collaborations
– Getting the data to the right place is important
– Scientific productivity follows data locality
Therefore, a high-performance network that enables high-speed data movement as a functional service is a tool for enhancing scientific productivity and enabling new scientific paradigms
Users often do not know how to use the network effectively without help – in order to be successful, networks must provide usable services to scientists

20 Some user groups need more help than others
Collaborations with a small number of scientists typically do not have network tuning expertise
– They rely on their local system and network admins (or grad students)
– They often don’t have much data to move (typically <1TB)
– Therefore, they avoid using the network for data transfer if possible
Mid-sized collaborations have a lot more data, but similar expertise limitations
– More scientists per collaboration, much larger data sets (10s to 100s of terabytes)
– Most mid-sized collaborations still rely on local system and networking staff, or supercomputer center system and networking staff
Large collaborations (HEP, NP) are big enough to have their own internal software shops
– Dedicated people for networking, performance tuning, etc.
– Typically need much less help
– Often held up (erroneously) as an example to smaller collaborations
These groupings are arbitrary and approximate, but this taxonomy illustrates some points of leverage (e.g. data sources, supercomputer centers)

21 Rough user grouping by collaboration data set size
[Figure: collaborations plotted by approximate data set size, scientists per collaboration, and number of collaborations (each axis running low to high). Categories shown: small-data instrument science (Light Source users, Nanoscience Centers, Microscopy); supercomputer simulations (Climate, Fusion, Bioinformatics); large-data instrument science (HEP, NP). Annotation: a few large collaborations have internal software and networking groups.]

22 Bandwidth necessary to transfer Y bytes in X time

Data size   1 Hour          8 Hours        24 Hours       7 Days        30 Days
10 PB       25,020.0 Gbps   3,127.5 Gbps   1,042.5 Gbps   148.9 Gbps    34.7 Gbps
1 PB         2,502.0 Gbps     312.7 Gbps     104.2 Gbps    14.9 Gbps     3.5 Gbps
100 TB         244.3 Gbps      30.5 Gbps      10.2 Gbps     1.5 Gbps   339.4 Mbps
10 TB           24.4 Gbps       3.1 Gbps       1.0 Gbps   145.4 Mbps    33.9 Mbps
1 TB             2.4 Gbps     305.4 Mbps     101.8 Mbps    14.5 Mbps     3.4 Mbps
100 GB         238.6 Mbps      29.8 Mbps       9.9 Mbps     1.4 Mbps   331.4 Kbps
10 GB           23.9 Mbps       3.0 Mbps     994.2 Kbps   142.0 Kbps    33.1 Kbps
1 GB             2.4 Mbps     298.3 Kbps      99.4 Kbps    14.2 Kbps     3.3 Kbps
100 MB         233.0 Kbps      29.1 Kbps       9.7 Kbps     1.4 Kbps     0.3 Kbps
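The entries above can be reproduced with a few lines of arithmetic. A minimal Python sketch, assuming binary prefixes (1 TB = 2^40 bytes), which is what the table uses:

    # Bandwidth needed to move a data set within a deadline (binary prefixes, as in the table).

    def required_gbps(size_bytes: float, hours: float) -> float:
        """Sustained rate, in Gb/s, needed to move size_bytes within the given number of hours."""
        return size_bytes * 8 / (hours * 3600) / 1e9

    TiB = 2 ** 40
    PiB = 2 ** 50

    print(f"100 TB in 24 hours: {required_gbps(100 * TiB, 24):.1f} Gb/s")   # ~10.2 Gb/s
    print(f"1 PB in 7 days:     {required_gbps(PiB, 7 * 24):.1f} Gb/s")     # ~14.9 Gb/s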

23 How Can Networks Enable Science?
Build the network infrastructure with throughput in mind
– Cheap switches often have tiny internal buffers and cannot reliably carry high-speed flows over long distances (see the sketch following this slide)
– Fan-in is a significant problem that must be accounted for
– Every device in the path matters – routers, switches, firewalls, whatever
– Firewalls often cause problems that are hard to diagnose (in many cases, routers can provide equivalent security without degrading performance)
Provide visibility into the network
– Test and measurement hosts are critical
– Many test points in the network mean better problem isolation
– If possible, buy routers that can count packets reliably, because sometimes this is the only way to find the problem
– perfSONAR is being widely deployed for end-to-end network monitoring
Work with the science community
– Don’t wait for users to figure it out on their own
– Work with major resources to help tune data movement services between dedicated hosts
– Remember that data transfer infrastructures are systems of systems – success usually requires collaboration between LAN, WAN, storage, and security
– Provide information to help users – e.g.
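Why long, fast paths are so unforgiving comes down to the bandwidth-delay product: the amount of data in flight at full rate, which end hosts must buffer in TCP and which shallow-buffered devices in the path cannot absorb during bursts. A minimal sketch (the 70 ms coast-to-coast RTT is an illustrative assumption):

    # Bandwidth-delay product (BDP): bytes in flight for a given rate and round-trip time.

    def bdp_bytes(rate_gbps: float, rtt_ms: float) -> float:
        """Data in flight, in bytes, at the given rate over the given RTT."""
        return rate_gbps * 1e9 / 8 * (rtt_ms / 1e3)

    # A single 10 Gb/s flow across the US (~70 ms RTT, assumed) keeps ~88 MB in flight,
    # so TCP buffers and device queues sized for LAN traffic fall far short.
    print(f"{bdp_bytes(10, 70) / 1e6:.0f} MB in flight")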

24 Enabling Chinese-American Science Collaborations
There are several current collaborations between US DOE laboratories and Chinese institutions
– LHC/CMS requires data movement between IHEP and Fermilab
– The Daya Bay Neutrino Experiment requires data movement between detectors at Daya Bay and NERSC at Lawrence Berkeley National Laboratory and Brookhaven National Laboratory
– EAST Tokamak – collaboration with US Fusion Energy Sciences sites such as General Atomics
– Others to come, I’m sure
Getting data across the Pacific can be difficult (250 millisecond round trip times are common)
However, we know this can be done because others have succeeded
– 1 Gbps host-to-host network throughput between Brookhaven and KISTI in South Korea – this is expected to be 3-5 hosts wide in production
– 60 MB/sec per data mover from Brookhaven to CCJ in Japan (typically 6 hosts wide, for a total of 360 MB/sec or 2.8 Gbps)
We look forward to working together to enable the scientific collaborations of our constituents!
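As a rough check on what those 250 ms paths demand, the sketch below computes the TCP window a single host needs to sustain 1 Gb/s, and the aggregate rate of the Brookhaven-to-CCJ data movers; the rates and host counts come from the slide above, the rest is arithmetic.

    # What a 250 ms trans-Pacific path demands of a single TCP stream,
    # and the aggregate rate of several parallel data movers.

    rtt_s = 0.250                  # round-trip time across the Pacific (from the slide)
    target_bps = 1e9               # 1 Gb/s per host, as achieved Brookhaven <-> KISTI

    window_bytes = target_bps / 8 * rtt_s
    print(f"TCP window needed per host: {window_bytes / 2**20:.1f} MiB")   # ~29.8 MiB

    movers = 6                     # Brookhaven -> CCJ: typically 6 hosts wide
    per_mover_MBps = 60            # 60 MB/sec per data mover
    aggregate_gbps = movers * per_mover_MBps * 8 / 1000
    print(f"Aggregate data-mover rate: {aggregate_gbps:.1f} Gb/s")         # ~2.9 Gb/s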

25 Questions?