1 Networking for the Future of Science
ESnet Status Update
William E. Johnston, ESnet Department Head and Senior Scientist
Energy Sciences Network, Lawrence Berkeley National Laboratory
ESCC, January 23, 2008 (Aloha!)
This talk is available at

2 DOE Office of Science and ESnet – the ESnet Mission
ESnet's primary mission is to enable the large-scale science that is the mission of the Office of Science (SC) and that depends on:
– Sharing of massive amounts of data
– Supporting thousands of collaborators world-wide
– Distributed data processing
– Distributed data management
– Distributed simulation, visualization, and computational steering
– Collaboration with the US and international research and education community
ESnet provides network and collaboration services to Office of Science laboratories and many other DOE programs in order to accomplish its mission

3 ESnet Stakeholders and their Role in ESnet
DOE Office of Science oversight ("SC") of ESnet
– The SC provides high-level oversight through the budgeting process
– Near-term input is provided by weekly teleconferences between SC and ESnet
– Indirect long-term input is through the process of ESnet observing and projecting network utilization of its large-scale users
– Direct long-term input is through the SC Program Offices Requirements Workshops (more later)
SC Labs input to ESnet
– Short-term input through many daily (mostly) interactions
– Long-term input through the ESCC

4 ESnet Stakeholders and their Role in ESnet
SC science collaborators' input
– Through numerous meetings, primarily with the networks that serve the science collaborators

5 Talk Outline
I. Building ESnet4
  Ia. Network Infrastructure
  Ib. Network Services
  Ic. Network Monitoring
II. Requirements
III. Science Collaboration Services
  IIIa. Federated Trust
  IIIb. Audio, Video, Data Teleconferencing

6 Ia. Building ESnet4 - Starting Point: ESnet 3 with Sites and Peers (Early 2007)
[Network map: ESnet IP core (packet over SONET optical ring and hubs) and Science Data Network (SDN) core connecting 42 end user sites – Office of Science sponsored (22), NNSA sponsored (12), joint sponsored (3), laboratory sponsored (6), other sponsored (NSF LIGO, NOAA). Commercial peering points (MAE-E, PAIX-PA, Equinix, etc.), high-speed peering points with Internet2/Abilene, CERN (USLHCnet, DOE+CERN funded), GÉANT (France, Germany, Italy, UK, etc.), and international peers (SINet, CA*net4, GLORIAD, Kreonet2, AARNet, TANet2/ASCC, SingAREN, AMPATH, NSF/IRNC-funded links). Link legend: international (high speed), 10 Gb/s SDN core, 10 Gb/s IP core, 2.5 Gb/s IP core, MAN rings (≥ 10 Gb/s), lab-supplied links, OC12 ATM (622 Mb/s), OC12/GigEthernet, OC3 (155 Mb/s), 45 Mb/s and less.]

7 ESnet 3 Backbone as of January 1, 2007
[Backbone map: hubs at Seattle, Sunnyvale, San Diego, Albuquerque, El Paso, Chicago, New York City, Washington DC, and Atlanta; future ESnet hubs also shown. Legend: 10 Gb/s SDN core (NLR), 10/2.5 Gb/s IP core (QWEST), MAN rings (≥ 10 Gb/s), lab-supplied links.]

8 ESnet 4 Backbone as of April 15, 2007
[Backbone map: hubs shown at Cleveland, Boston, Sunnyvale, Seattle, San Diego, Albuquerque, El Paso, Chicago, New York City, Washington DC, and Atlanta. Legend: 10 Gb/s SDN core (NLR), 10/2.5 Gb/s IP core (QWEST), 10 Gb/s IP core (Level3), 10 Gb/s SDN core (Level3), MAN rings (≥ 10 Gb/s), lab-supplied links.]

9 ESnet 4 Backbone as of May 15, 2007
[Backbone map: hubs shown at Cleveland, Boston, Sunnyvale, Seattle, San Diego, Albuquerque, El Paso, Chicago, New York City, Washington DC, and Atlanta. Legend: 10 Gb/s SDN core (NLR), 10/2.5 Gb/s IP core (QWEST), 10 Gb/s IP core (Level3), 10 Gb/s SDN core (Level3), MAN rings (≥ 10 Gb/s), lab-supplied links.]

10 ESnet 4 Backbone as of June 20, 2007
[Backbone map: hubs shown at Cleveland, Boston, Houston, Kansas City, Denver, Sunnyvale, Seattle, San Diego, Albuquerque, El Paso, Chicago, New York City, Washington DC, and Atlanta. Legend: 10 Gb/s SDN core (NLR), 10/2.5 Gb/s IP core (QWEST), 10 Gb/s IP core (Level3), 10 Gb/s SDN core (Level3), MAN rings (≥ 10 Gb/s), lab-supplied links.]

11 ESnet 4 Backbone August 1, 2007 (Last JT meeting at FNAL)
[Backbone map: hubs shown at Cleveland, Boston, Houston, Kansas City, Denver, Los Angeles, Sunnyvale, Seattle, San Diego, Albuquerque, El Paso, Chicago, New York City, Washington DC, and Atlanta. Legend: 10 Gb/s SDN core (NLR), 10/2.5 Gb/s IP core (QWEST), 10 Gb/s IP core (Level3), 10 Gb/s SDN core (Level3), MAN rings (≥ 10 Gb/s), lab-supplied links.]

12 ESnet 4 Backbone September 30, 2007
[Backbone map: hubs shown at Cleveland, Boston, Houston, Kansas City, Denver, Boise, Los Angeles, Sunnyvale, Seattle, San Diego, Albuquerque, El Paso, Chicago, New York City, Washington DC, and Atlanta. Legend: 10 Gb/s SDN core (NLR), 10/2.5 Gb/s IP core (QWEST), 10 Gb/s IP core (Level3), 10 Gb/s SDN core (Level3), MAN rings (≥ 10 Gb/s), lab-supplied links.]

13 ESnet 4 Backbone December 2007
[Backbone map: hubs shown at Cleveland, Boston, Houston, Kansas City, Denver, Boise, Nashville, Los Angeles, Sunnyvale, Seattle, San Diego, Albuquerque, El Paso, Chicago, New York City, Washington DC, and Atlanta. Legend: 10 Gb/s SDN core (NLR), 2.5 Gb/s IP tail (QWEST), 10 Gb/s IP core (Level3), 10 Gb/s SDN core (Level3), MAN rings (≥ 10 Gb/s), lab-supplied links.]

14 ESnet 4 Backbone Projected for December, 2008
[Backbone map: hubs shown at Cleveland, Boston, Houston, Kansas City, Denver, Nashville, Los Angeles, Sunnyvale, Seattle, San Diego, Albuquerque, El Paso, Chicago, New York City, Washington DC, and Atlanta. Legend: 10 Gb/s SDN core (NLR), 10/2.5 Gb/s IP core (QWEST), 10 Gb/s IP core (Level3), 10 Gb/s SDN core (Level3), MAN rings (≥ 10 Gb/s), lab-supplied links.]

ESnet Provides Global High-Speed Internet Connectivity for DOE Facilities and Collaborators (12/2007)
[Network map (geography is only representational): ~45 end user sites – Office of Science sponsored (22), NNSA sponsored (13+), joint sponsored (3), laboratory sponsored (6), other sponsored (NSF LIGO, NOAA). Commercial peering points (PAIX-PA, Equinix, etc.) and R&E peers: Internet2/Abilene, NLR/NLR-Packetnet, CERN (USLHCnet, DOE+CERN funded), GÉANT (France, Germany, Italy, UK, etc.), NYSERNet, MAN LAN, PacWave, MAX GigaPoP, MREN, StarLight/StarTap, plus international networks (SINet, CA*net4, GLORIAD, Kreonet2, AARNet, TANet2/ASCC, SingAREN, KAREN/REANNZ, ODN Japan Telecom America, AMPATH, NSF/IRNC-funded links). Link legend: international (1-10 Gb/s), 10 Gb/s SDN core (Internet2, NLR), 10 Gb/s IP core, MAN rings (≥ 10 Gb/s), lab-supplied links, OC12/GigEthernet, OC3 (155 Mb/s), 45 Mb/s and less.]

ESnet4 End-Game
[Projected topology map: core network capacity grows in stages, first on 10 Gb/s circuits and later on 100 Gb/s circuits. Production IP core (10 Gbps), Science Data Network core, MANs (20-60 Gbps) or backbone loops for site access, and international connections (CERN at 30+ Gbps, CANARIE, GÉANT, Asia-Pacific, GLORIAD, AMPATH/South America, Australia). Hubs include Seattle, Sunnyvale, LA, San Diego, Boise, Denver, Albuquerque, Tulsa, Kansas City, Houston, Chicago, Cleveland, Nashville, Atlanta, Jacksonville, New York, Boston, and Washington DC. Core network fiber path is ~14,000 miles / 24,000 km; example segments 1625 miles / 2545 km and 2700 miles / 4300 km. Map marks IP core hubs, SDN hubs, primary DOE labs, high-speed cross-connects with Internet2/Abilene, and possible hubs.]

17 A Tale of Two ESnet4 Hubs
[Photos: Sunnyvale, CA hub (MX960 switch, T320 router) and Chicago hub (6509 switch, T320 routers).]
ESnet's SDN backbone is implemented with Layer 2 switches, Cisco 6509s and Juniper MX960s; each presents its own unique challenges.

18 ESnet 4 Factoids as of January 21, 2008
ESnet4 installation to date:
– 32 new 10 Gb/s backbone circuits (over 3 times the number from the last JT meeting)
– 20,284 10 Gb/s backbone route miles (more than doubled from the last JT meeting)
– 10 new hubs (since the last meeting: Seattle, Sunnyvale, Nashville)
– 7 new routers, 4 new switches
– Chicago MAN now connected to the Level3 POP: 2 x 10GE to ANL, 2 x 10GE to FNAL, 3 x 10GE to Starlight

19 ESnet Traffic Continues to Exceed 2 Petabytes/Month
[Monthly traffic history chart; 2.7 PBytes in July 2007.]
ESnet traffic historically has increased 10x every 47 months. Overall traffic tracks the very large science use of the network.
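As a quick check of what the historical 10x-per-47-months figure implies for annual growth, the short calculation below (a sketch, not from the original slides) converts it to a compound annual rate.

```python
# Convert ESnet's historical growth figure (10x every 47 months)
# into an equivalent compound annual growth rate.
monthly_factor = 10 ** (1 / 47)          # growth multiplier per month
annual_factor = monthly_factor ** 12     # growth multiplier per year

print(f"per-month growth: {monthly_factor:.4f}x")
print(f"per-year growth:  {annual_factor:.2f}x (~{(annual_factor - 1) * 100:.0f}% per year)")
# roughly 1.8x per year, i.e. traffic nearly doubles annually
```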

FNAL Outbound Traffic
When a few large data sources/sinks dominate traffic, it is not surprising that overall network usage follows the patterns of the very large users. This trend will reverse in the next few weeks as the next round of LHC data challenges kicks off.

FNAL Traffic is Representative of all CMS Traffic Accumulated data (Terabytes) received by CMS Data Centers (“tier1” sites) and many analysis centers (“tier2” sites) during the past 12 months (15 petabytes of data) [LHC/CMS]

22 ESnet Continues to be Highly Reliable, Even During the Transition
[Site availability chart: "5 nines" (>99.995%), "4 nines" (>99.95%), "3 nines" (>99.5%); dually connected sites noted.]
Note: These availability measures are only for ESnet infrastructure; they do not include site-related problems. Some sites, e.g. PNNL and LANL, provide circuits from the site to an ESnet hub, and therefore the ESnet-site demarc is at the ESnet hub (there is no ESnet equipment at the site). In this case, circuit outages between the ESnet equipment and the site are considered site issues and are not included in the ESnet availability metric.
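For context, the sketch below (not from the original slides) converts those availability bands into the maximum downtime each allows per year.

```python
# Maximum yearly downtime implied by each availability band.
MINUTES_PER_YEAR = 365 * 24 * 60

for label, availability in [("5 nines", 0.99995),
                            ("4 nines", 0.9995),
                            ("3 nines", 0.995)]:
    downtime_min = (1 - availability) * MINUTES_PER_YEAR
    print(f"{label} ({availability:.3%}): up to {downtime_min:.0f} minutes of outage per year")
# 99.995% -> ~26 min/yr, 99.95% -> ~263 min/yr, 99.5% -> ~2628 min/yr
```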

Ib. Network Services for Large-Scale Science
Large-scale science uses distributed systems in order to:
– Couple existing pockets of code, data, and expertise into a "system of systems"
– Break up the task of massive data analysis into elements that are physically located where the data, compute, and storage resources are located; these elements are combined into a system using a "Service Oriented Architecture" approach
Such systems
– are data intensive and high-performance, typically moving terabytes a day for months at a time
– are high duty-cycle, operating most of the day for months at a time in order to meet the requirements for data movement
– are widely distributed, typically spread over continental or inter-continental distances
– depend on network performance and availability, but these characteristics cannot be taken for granted, even in well-run networks, when the multi-domain network path is considered
The system elements must be able to get guarantees from the network that there is adequate bandwidth to accomplish the task at hand.
The systems must be able to get information from the network that allows graceful failure and auto-recovery and adaptation to unexpected network conditions that are short of outright failure.
See, e.g., [ICFA SCIC]

24 Enabling Large-Scale Science
These requirements are generally true for systems with widely distributed components that must be reliable and consistent in performing the sustained, complex tasks of large-scale science.
Networks must provide communication capability as a service that can participate in a Service Oriented Architecture and that is: configurable, schedulable, predictable, reliable, and informative; and the network and its services must be scalable and geographically comprehensive.

Networks Must Provide Communication Capability that is Service-Oriented
Configurable – must be able to provide multiple, specific "paths" (specified by the user as end points) with specific characteristics
Schedulable – premium service such as guaranteed bandwidth will be a scarce resource that is not always freely available; therefore time slots obtained through a resource allocation process must be schedulable
Predictable – a committed time slot should be provided by a network service that is not brittle; reroute in the face of network failures is important
Reliable – reroutes should be largely transparent to the user
Informative – when users do system planning they should be able to see average path characteristics, including capacity; when things do go wrong, the network should report back to the user in ways that are meaningful to the user so that informed decisions can be made about alternative approaches
Scalable – the underlying network should be able to manage its resources to provide the appearance of scalability to the user
Geographically comprehensive – the R&E network community must act in a coordinated fashion to provide this environment end-to-end

26 The ESnet Approach
Provide configurability, schedulability, predictability, and reliability with a flexible virtual circuit service – OSCARS
– The user* specifies end points, bandwidth, and schedule
– OSCARS can do fast reroute of the underlying MPLS paths
Provide useful, comprehensive, and meaningful information on the state of the paths, or potential paths, to the user
– perfSONAR, and associated tools, provide real-time information in a form that is useful to the user (via appropriate network abstractions) and that is delivered through standard interfaces that can be incorporated into SOA-type applications
– Techniques need to be developed to monitor virtual circuits based on the approaches of the various R&E nets – e.g. MPLS in ESnet, VLANs, TDM/grooming devices (e.g. Ciena Core Directors), etc. – and then integrate this into a perfSONAR framework
* User = human or system component (process)
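To make the "user specifies end points, bandwidth, and schedule" idea concrete, here is a minimal sketch of what a client-side reservation request for a guaranteed-bandwidth circuit might look like. It is illustrative only: the service URL, field names, and submit_reservation helper are hypothetical and do not reflect the actual OSCARS API.

```python
# Hypothetical sketch of an OSCARS-style virtual circuit reservation request.
# The service URL, field names, and response format are illustrative only,
# not the real OSCARS interface.
import json
from datetime import datetime, timedelta
from urllib import request

def submit_reservation(service_url, src, dst, bandwidth_mbps, start, end):
    """Ask the reservation service for a guaranteed-bandwidth path."""
    payload = {
        "source": src,                      # ingress end point (e.g. a site edge router port)
        "destination": dst,                 # egress end point
        "bandwidth_mbps": bandwidth_mbps,   # requested guaranteed bandwidth
        "start_time": start.isoformat(),    # schedule: when the circuit should come up
        "end_time": end.isoformat(),        # ... and when it can be torn down
        "layer": "L2",                      # layer 2 (Ethernet VLAN) or layer 3 (IP) circuit
    }
    req = request.Request(service_url, data=json.dumps(payload).encode(),
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.load(resp)              # e.g. a reservation ID and the computed path

if __name__ == "__main__":
    start = datetime(2008, 2, 1, 0, 0)
    result = submit_reservation(
        "https://oscars.example.net/reserve",      # hypothetical endpoint
        src="fnal-mr1:xe-0/1/0", dst="bnl-mr2:xe-1/0/0",
        bandwidth_mbps=2000, start=start, end=start + timedelta(hours=12))
    print(result)
```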

27 The ESnet Approach
Scalability will be provided by new network services that, e.g., provide dynamic wave allocation at the optical layer of the network
– Currently an R&D project
Geographic ubiquity of the services can only be accomplished through active collaborations in the global R&E network community so that all sites of interest to the science community can provide compatible services for forming end-to-end virtual circuits
– Active and productive collaborations exist among numerous R&E networks: ESnet, Internet2, CANARIE, DANTE/GÉANT, some European NRENs, some US regionals, etc.

28 OSCARS Overview
OSCARS: On-demand Secure Circuits and Advance Reservation System – guaranteed bandwidth virtual circuit services.
[Block diagram: Path Computation (topology, reachability, constraints); Scheduling (AAA, availability); Provisioning (signaling, security, resiliency/redundancy).]

29 OSCARS Status Update
ESnet-centric deployment
– Prototype layer 3 (IP) guaranteed bandwidth virtual circuit service deployed in ESnet (1Q05)
– Prototype layer 2 (Ethernet VLAN) virtual circuit service deployed in ESnet (3Q07)
Inter-domain collaborative efforts
– TeraPaths (BNL): inter-domain interoperability for layer 3 virtual circuits demonstrated (3Q06); inter-domain interoperability for layer 2 virtual circuits demonstrated at SC07 (4Q07)
– LambdaStation (FNAL): inter-domain interoperability for layer 2 virtual circuits demonstrated at SC07 (4Q07)
– HOPI/DRAGON: inter-domain exchange of control messages demonstrated (1Q07); integration of OSCARS and DRAGON has been successful (1Q07)
– DICE: first draft of topology exchange schema has been formalized (in collaboration with NMWG) (2Q07), interoperability test demonstrated 3Q07; initial implementation of reservation and signaling messages demonstrated at SC07 (4Q07)
– UVA: integration of token-based authorization in OSCARS is under testing
– Nortel: topology exchange demonstrated successfully 3Q07; inter-domain interoperability for layer 2 virtual circuits demonstrated at SC07 (4Q07)

Ic. 30 Network Measurement Update
Deploy network test platforms at all hubs and major sites
– About 1/3 of the 10GE bandwidth test platforms and 1/2 of the latency test platforms for ESnet 4 have been deployed
– 10GE test systems are being used extensively for acceptance testing and debugging
– Structured and ad-hoc external testing capabilities have not been enabled yet
– Clocking issues at a couple of POPs are not resolved
Work is progressing on revamping the ESnet statistics collection, management, and publication systems
– ESxSNMP, TSDB, and the perfSONAR Measurement Archive (MA)
– perfSONAR TS and the OSCARS topology DB
– NetInfo being restructured to be perfSONAR based

31 Network Measurement Update
perfSONAR provides a service-element-oriented approach to monitoring that has the potential to integrate into SOA – see Joe Metzger's talk.
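As an illustration of how an application might consume such monitoring data, the sketch below polls a measurement archive over HTTP and flags links whose utilization exceeds a threshold. The archive URL and the JSON record fields are hypothetical; the real perfSONAR measurement archive interfaces differ.

```python
# Hypothetical sketch: poll a measurement archive and flag heavily used links.
# The archive URL and record fields are illustrative, not real perfSONAR APIs.
import json
from urllib import request

ARCHIVE_URL = "https://ma.example.net/utilization"   # hypothetical measurement archive
THRESHOLD = 0.80                                     # flag links above 80% utilization

def fetch_link_utilization(url):
    """Return a list of {link, capacity_bps, in_bps, out_bps} records."""
    with request.urlopen(url) as resp:
        return json.load(resp)

def busy_links(records, threshold):
    flagged = []
    for rec in records:
        used = max(rec["in_bps"], rec["out_bps"])
        if used / rec["capacity_bps"] >= threshold:
            flagged.append((rec["link"], used / rec["capacity_bps"]))
    return flagged

if __name__ == "__main__":
    for link, frac in busy_links(fetch_link_utilization(ARCHIVE_URL), THRESHOLD):
        print(f"{link}: {frac:.0%} utilized")
```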

II. 32 SC Program Network Requirements Workshops
The workshops are part of DOE's governance of ESnet
– The ASCR Program Office owns the requirements workshops, not ESnet
– The workshops replaced the ESnet Steering Committee
– The workshops are fully controlled by DOE; all that ESnet does is support DOE in putting on the workshops
The content and logistics of the workshops are determined by an SC Program Manager from the Program Office that is the subject of each workshop
– The SC Program Office sets the timing, location (almost always Washington so that DOE Program Office people can attend), and participants

33 Network Requirements Workshops
Collect requirements from two DOE/SC program offices per year
DOE/SC Program Office workshops held in 2007
– Basic Energy Sciences (BES) – June 2007
– Biological and Environmental Research (BER) – July 2007
Workshops to be held in 2008
– Fusion Energy Sciences (FES) – coming in March 2008
– Nuclear Physics (NP) – TBD 2008
Future workshops
– HEP and ASCR in 2009
– BES and BER in 2010
– And so on…

34 Network Requirements Workshops - Findings
Virtual circuit services (traffic isolation, bandwidth guarantees, etc.) continue to be requested by scientists
– The OSCARS service directly addresses these needs; it is successfully deployed in early production today, and ESnet will continue to develop and deploy OSCARS
Some user communities have significant difficulties using the network for bulk data transfer
– fasterdata.es.net, a web site devoted to bulk data transfer, host tuning, etc., has been established
– NERSC and ORNL have made significant progress on improving data transfer performance between supercomputer centers
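Host tuning for bulk transfer largely comes down to sizing TCP buffers to the bandwidth-delay product of the path; the short sketch below (an illustration, not material from fasterdata.es.net, with made-up example paths) computes that figure.

```python
# Bandwidth-delay product: the TCP buffer size needed to keep a path full.
# Example rates and round-trip times are illustrative, not actual ESnet measurements.
def bdp_bytes(rate_gbps, rtt_ms):
    """Return the bandwidth-delay product in bytes."""
    return rate_gbps * 1e9 / 8 * (rtt_ms / 1000)

for name, rate_gbps, rtt_ms in [("cross-country path", 10, 70),
                                ("transatlantic path", 10, 120)]:
    bdp = bdp_bytes(rate_gbps, rtt_ms)
    print(f"{name}: ~{bdp / 2**20:.0f} MiB of TCP buffer to fill a {rate_gbps} Gb/s link")
```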

35 Network Requirements Workshops - Findings
Some data rate requirements are unknown at this time
– Drivers are instrument upgrades that are subject to review, qualification, and other decisions that are 6-12 months away
– These will be revisited in the appropriate timeframe

36 BES Workshop Bandwidth Matrix as of June 2007
Project | Primary Site | Primary Partner Sites | Primary ESnet Hub | 2007 Bandwidth | 2012 Bandwidth
ALS | LBNL | Distributed | Sunnyvale | 3 Gbps | 10 Gbps
APS, CNM, SAMM, ARM | ANL | FNAL, BNL, UCLA, and CERN | Chicago | 10 Gbps | 20 Gbps
Nano Center | BNL | Distributed | NYC | 1 Gbps | 5 Gbps
CRF | SNL/CA | NERSC, ORNL | Sunnyvale | 5 Gbps | 10 Gbps
Molecular Foundry | LBNL | Distributed | Sunnyvale | 1 Gbps | 5 Gbps
NCEM | LBNL | Distributed | Sunnyvale | 1 Gbps | 5 Gbps
LCLS | SLAC | Distributed | Sunnyvale | 2 Gbps | 4 Gbps
NSLS | BNL | Distributed | NYC | 1 Gbps | 5 Gbps
SNS | ORNL | LANL, NIST, ANL, U. Indiana | Nashville | 1 Gbps | 10 Gbps
Total | | | | 25 Gbps | 74 Gbps

37 BER Workshop Bandwidth Matrix as of Dec 2007
Project | Primary Site | Primary Partner Sites | Primary ESnet Hub | 2007 Bandwidth | 2012 Bandwidth
ARM | BNL, ORNL, PNNL | NOAA, NASA, ECMWF (Europe), Climate Science | NYC, Nashville, Seattle | 1 Gbps | 5 Gbps
Bioinformatics | PNNL | Distributed | Seattle | 0.5 Gbps | 3 Gbps
EMSL | PNNL | Distributed | Seattle | 10 Gbps | 50 Gbps
Climate | LLNL, NCAR, ORNL | NCAR, LANL, NERSC, LLNL, International | Sunnyvale, Denver, Nashville | 1 Gbps | 5 Gbps
JGI | | NERSC | Sunnyvale | 1 Gbps | 5 Gbps
Total | | | | 13.5 Gbps | 68 Gbps
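The totals in both matrices are simple sums of the per-project requirements; a small aggregation like the sketch below (illustrative, using the numbers from the two tables above) is how such workshop inputs feed into capacity planning.

```python
# Sum the per-project bandwidth requirements from the BES and BER matrices above.
bes_2007, bes_2012 = [3, 10, 1, 5, 1, 1, 2, 1, 1], [10, 20, 5, 10, 5, 5, 4, 5, 10]
ber_2007, ber_2012 = [1, 0.5, 10, 1, 1], [5, 3, 50, 5, 5]

print(f"BES: {sum(bes_2007)} Gbps in 2007 -> {sum(bes_2012)} Gbps in 2012")
print(f"BER: {sum(ber_2007)} Gbps in 2007 -> {sum(ber_2012)} Gbps in 2012")
# Matches the Total rows: 25 -> 74 Gbps (BES) and 13.5 -> 68 Gbps (BER)
```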

38 ESnet Site Network Requirements Surveys
Surveys were given to ESnet sites through the ESCC; many sites responded, many did not
The survey was lacking in several key areas
– It did not provide sufficient focus to enable consistent data collection
– Sites vary widely in network usage, size, science/business, etc., so it is very difficult to make one survey fit all
– In many cases the data provided were not quantitative enough (this appears to be primarily due to the way in which the questions were asked)
The surveys were successful in some key ways
– It is clear that there are many significant projects/programs that cannot be captured in the DOE/SC Program Office workshops – DP, industry, other non-SC projects – and a better process is needed to capture this information
A new model for site requirements collection needs to be developed

IIIa. 39 Federated Trust Services
Remote, multi-institutional identity authentication is critical for distributed, collaborative science in order to permit sharing of widely distributed computing and data resources, and other Grid services
Public Key Infrastructure (PKI) is used to formalize the existing web of trust within science collaborations and to extend that trust into cyber space
– The function, form, and policy of the ESnet trust services are driven entirely by the requirements of the science community and by direct input from the science community
– International-scope trust agreements that encompass many organizations are crucial for large-scale collaborations
The service (and community) has matured to the point where it is revisiting old practices and updating and formalizing them

40 DOEGrids CA Audit
"Request" by the EUGridPMA
– The EUGridPMA is auditing all "old" CAs
OGF Audit Framework
– Developed from WebTrust for CAs et al.
– Partial review of NIST
Audit day: 11 Dec 2007
– Auditors: Robert Cowles (SLAC), Dan Peterson (ESnet), Mary Thompson (ex-LBL), John Volmer (ANL), Scott Rea (HEBCA*) (observer)
* Higher Education Bridge Certification Authority. The goal of the Higher Education Bridge Certification Authority (HEBCA) is to facilitate trusted electronic communications within and between institutions of higher education as well as with federal and state governments.

41 DOEGrids CA Audit – Results
Final report in progress
Generally good – many documentation errors need to be addressed
EUGridPMA is satisfied
– EUGridPMA has agreed to recognize US research science ID verification as acceptable for initial issuance of certificates – this is a BIG step forward
The ESnet CA projects have begun a year-long effort to converge security documents and controls with NIST

42 DOEGrids CA Audit – Issues
ID verification – no face-to-face / ID document check
– We have collectively agreed to drop this issue – US science culture is what it is, and has a method for verifying identity
Renewals – we must address the need to re-verify our subscribers after 5 years
Auditors recommend we update the format of our Certification Practices Statement (for interoperability and understandability)
Continue efforts to improve reliability and disaster recovery
We need to update our certificate formats again (minor errors)
There are many undocumented or incompletely documented security practices (a problem both in the CPS and NIST)
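The renewal issue is essentially a bookkeeping problem: knowing when a subscriber's certificate was first issued and when it expires. Below is a minimal sketch of such a check using the Python cryptography package on a PEM file; the file name is hypothetical and the 5-year re-verification window is taken from the slide above.

```python
# Sketch: check a subscriber certificate's age and expiry from a PEM file.
# The file name is hypothetical; the 5-year re-verification window is from the slide.
from datetime import datetime, timedelta
from cryptography import x509
from cryptography.hazmat.backends import default_backend

REVERIFY_AFTER = timedelta(days=5 * 365)

with open("subscriber-cert.pem", "rb") as f:
    cert = x509.load_pem_x509_certificate(f.read(), default_backend())

now = datetime.utcnow()
print("Subject:       ", cert.subject.rfc4514_string())
print("Issued:        ", cert.not_valid_before)
print("Expires:       ", cert.not_valid_after)
print("Needs renewal: ", now > cert.not_valid_after)
print("Needs identity re-verification:", now - cert.not_valid_before > REVERIFY_AFTER)
```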

43 DOEGrids CA (one of several CAs) Usage Statistics
User certificates: 6549
Host & service certificates: 14545
Total no. of requests: 25470
Total no. of certificates issued: 21095
Total no. of revoked certificates: 1776
Total no. of expired certificates: 11797
Total no. of active certificates: 7547
ESnet SSL Server CA certificates: 49
FusionGRID CA certificates: 113
* Report as of Jan 17, 2008

44 DOEGrids CA (Active Certificates) Usage Statistics
[Chart of active certificates over time; US LHC ATLAS project adopts ESnet CA service.]
* Report as of Jan 17, 2008

45 DOEGrids CA Usage - Virtual Organization Breakdown
[Chart of certificate usage by virtual organization.]
* DOE-NSF collab. & auto renewals
** OSG includes BNL, CDF, CIGI, CMS, CompBioGrid, DES, DOSAR, DZero, Engage, Fermilab, fMRI, GADU, geant4, GLOW, GPN, GRASE, GridEx, GROW, GUGrid, i2u2, ILC, iVDGL, JLAB, LIGO, mariachi, MIS, nanoHUB, NWICG, NYGrid, OSG, OSGEDU, SBGrid, SDSS, SLAC, STAR & USATLAS

46 DOEGrids CA Usage - Virtual Organization Breakdown
[Chart of certificate usage by virtual organization as of Feb. 2005.]
* DOE-NSF collab.

47 DOEGrids Disaster Recovery
Recent upgrades and electrical incidents showed some unexpected vulnerabilities
Remedies:
– Update ESnet battery backup control to protect ESnet PKI servers better
– "Clone" CAs and distribute copies around the country (a lot of engineering, a lot of security work and risk assessment, a lot of politics)
– Clone and distribute CRL distribution machines

48 Policy Management Authority
DOEGrids PMA needs re-vitalization
– Audit finding
– Will transition to (t)wiki format web site
– Unclear how to re-energize
ESnet owns the IGTF domains, and now the TAGPMA.org domain – 2 of the important domains in research science Grids
TAGPMA.org
– CANARIE needed to give up ownership; currently finishing the transfer
– Developing Twiki for PMA use
IGTF.NET
– Acquired in 2007
– Will replace "gridpma.org" as the home domain for IGTF
– Will focus on the wiki foundation used in TAGPMA, when it stabilizes

49 Possible Use of Grid Certs. for Wiki Access
Experimenting with wiki and client certificate authentication
– Motivation: no manual registration, large community, make PKI more useful
Twiki – popular in science; upload of documents; many modules; some modest access control
– Hasn't behaved well with client certs; the interaction of Apache, Twiki, and the TLS client is very difficult
Some alternatives:
– GridSite (but uses MediaWiki)
– OpenID

50 Possible Use of Federation for ECS Authentication
The Federated Trust / DOEGrids approach to managing authentication has successfully scaled to about 8000 users
– This is possible because of the Registration Agent approach, which puts initial authorization and certificate issuance in the hands of community representatives rather than ESnet
– Such an approach, in theory, could also work for ECS authentication and maybe first-level problems (e.g. "I have forgotten my password")
The upcoming ECS technology refresh includes authentication and authorization improvements.

51 Possible Use of Federation for ECS Authentication
Exploring:
– Full integration with DOEGrids – use its registration directly, and its credentials
– Service Provider in a federation architecture (Shibboleth, maybe OpenID)
– Indico – this conference/room scheduler has become popular; authentication/authorization services support is needed
– Some initial discussions with Tom at U Chicago (Internet2) on federation approaches took place in December, more to come soon
Questions to Mike Helm and Stan Kluz

IIIb. 52 ESnet Conferencing Service (ECS)
An ESnet Science Service that provides audio, video, and data teleconferencing service to support human collaboration of DOE science
– Seamless voice, video, and data teleconferencing is important for geographically dispersed scientific collaborators
– Provides the central scheduling essential for global collaborations
– ECS serves about 1600 DOE researchers and collaborators worldwide at 260 institutions
  Videoconferences – about 3500 port hours per month
  Audio conferencing – about 2300 port hours per month
  Data conferencing – about 220 port hours per month
– Web-based, automated registration and scheduling for all of these services

53 ESnet Collaboration Services (ECS)

54 ECS Video Collaboration Service
High-quality videoconferencing over IP and ISDN
Reliable, appliance-based architecture
Ad-hoc H.323 and H.320 multipoint meeting creation
Web streaming options on 3 Codian MCUs using QuickTime or Real
3 Codian MCUs with web conferencing options
120 total ports of video conferencing (40 ports per MCU)
384k access for video conferencing systems using the ISDN protocol
Access to the audio portion of video conferences through the Codian ISDN gateway

55 ECS Voice and Data Collaboration
144 usable ports – actual conference ports readily available on the system
144 overbook ports – number of ports reserved to allow for scheduling beyond the number of conference ports readily available on the system
108 floater ports – designated for unexpected port needs; floater ports can float between meetings, taking up the slack when an extra person attends a meeting that is already full and when ports that can be scheduled in advance are not available
Audio conferencing and data collaboration using Cisco MeetingPlace
Data collaboration = WebEx-style desktop sharing and remote viewing of content
Web-based user registration
Web-based scheduling of audio / data conferences
Notifications of conferences and conference changes
650+ users registered to schedule meetings (not including guests)

56 ECS Futures
ESnet is still on track to replicate the teleconferencing hardware currently located at LBNL in a central US or eastern US location
– We have just about come to the conclusion that the ESnet hub in NYC is not the right place to site the new equipment
The new equipment is intended to provide at least comparable service to the current (upgraded) ECS system
– It is also intended to provide some level of backup to the current system
– A new web-based registration and scheduling portal may also come out of this

57 ECS Service Level
The ESnet Operations Center is open for service 24x7x365. A trouble ticket is opened within 15 to 30 minutes and assigned to the appropriate group for investigation. The trouble ticket is closed when the problem is resolved.
ECS support is provided Monday to Friday, 8 AM to 5 PM Pacific Time, excluding LBNL holidays
– Reported problems are addressed within 1 hour of receiving a trouble ticket during the ECS support period
– ESnet does NOT provide a real-time (during-conference) support service

58 Real Time ECS Support
A number of user groups have requested "real-time" conference support (monitoring of conferences while in session)
Limited human and financial resources currently prohibit ESnet from:
A) Making real-time information on system status (network, ECS, etc.) available to the public – this information is available only on some systems, and only to our support personnel
B) Providing 24x7x365 real-time support
C) Addressing simultaneous trouble calls as in a real-time support environment – this would require several people addressing multiple problems simultaneously

59 Real Time ECS Support
Solution: a fee-for-service arrangement for real-time conference support
– Available from TKO Video Communications, ESnet's ECS service contractor
– The service offering could provide:
  Testing and configuration assistance prior to your conference
  Creation and scheduling of your conferences on ECS hardware
  Preferred port reservations on ECS video and voice systems
  Connection assistance and coordination with participants
  Endpoint troubleshooting
  Live phone support during conferences
  Seasoned staff with years of experience in the video conferencing industry
  ESnet community pricing

60 ECS Impact from LBNL Power Outage, January 9th, 2008
Heavy rains caused one of the two 12 kV busses at the LBNL substation to fail
– 50% of LBNL lost power
– LBNL estimated 48 hours before power would be restored
– ESnet lost power to its data center
– The backup generator for the ESnet data center failed to start due to a failed starter battery
– ESnet staff kept the MAN router functioning by swapping batteries in the UPS
– ESnet services (ECS, PKI, etc.) were shut down to protect systems and reduce the heat load in the room
– The internal ESnet router lost UPS power and shut down
After ~25 minutes the generator was started by "jump" starting
– The ESnet site router returned to service
– There was no A/C in the data center while running on the generator
– Mission-critical services were brought back on line
After ~2 hours house power was restored
– Power reliability was still questionable – LBNL strapped bus one to feed bus two
After 24 hours the remaining services were restored to normal operation
Customer impact: ~2 hours of instability of ESnet services to customers

61 Power Outage Lessons Learned
As of Jan 22, 2008, the normal building power feed has still not been restored
EPA rules restrict operation of the generator in non-emergency mode
– However, monthly running of the generator will resume
The current critical systems list is to be evaluated and priorities adjusted
The internal ESnet router will be relocated to a bigger UPS or removed from the ESnet services critical path
ESnet staff need more flashlights!

62 Summary
The transition to ESnet4 is going smoothly
– New network services to support large-scale science are progressing
– The measurement infrastructure is rapidly becoming widely enough deployed to be very useful
New ECS hardware and the new service contract are working well
– Plans to deploy a replicated service are on track
Federated trust – PKI policy and Certification Authorities
– The service continues to pick up users at a pretty steady rate
– The service, and PKI use in the science community generally, is maturing

63 References
[OSCARS] For more information contact Chin Guok.
[LHC/CMS] CMS data transfer statistics (accumulated data received by CMS tier1 and tier2 sites).
[ICFA SCIC] "Networking for High Energy Physics." International Committee for Future Accelerators (ICFA), Standing Committee on Inter-Regional Connectivity (SCIC), Professor Harvey Newman, Caltech, Chairperson.
[E2EMON] GÉANT2 E2E Monitoring System, developed and operated by JRA4/WI3, with implementation done at DFN.
[TrViz] ESnet perfSONAR Traceroute Visualizer.