The Energy Sciences Network BESAC August 2004


1 The Energy Sciences Network BESAC August 2004
Mary Anne Scott, Program Manager, Advanced Scientific Computing Research, Office of Science, Department of Energy
William E. Johnston, ESnet Dept. Head and Senior Scientist
R. P. Singh, Federal Project Manager
Michael S. Collins, Stan Kluz, Joseph Burrescia, and James V. Gagliardi, ESnet Leads
Gizella Kapus, Resource Manager
and the ESnet Team
Lawrence Berkeley National Laboratory

2 What is ESnet?
Mission: Provide interoperable, effective, and reliable communications infrastructure and leading-edge network services that support the missions of the Department of Energy, especially the Office of Science.
Vision: Provide seamless and ubiquitous access, via shared collaborative information and computational environments, to the facilities, data, and colleagues needed to accomplish their goals.
Role: A component of the Office of Science infrastructure critical to the success of its research programs (program funded through ASCR/MICS; managed and operated by ESnet staff at LBNL).

3 Why is ESnet important?
Enables thousands of DOE, university, and industry scientists and collaborators worldwide to make effective use of unique DOE research facilities and computing resources, independent of time and geographic location
Direct connections to all major DOE sites
Access to the global Internet (managing 150,000 routes at 10 commercial peering points)
User demand has grown by a factor of more than 10,000 since its inception in the mid-1980s, a 100 percent increase every year since 1990
Capabilities not available through commercial networks
Architected to move huge amounts of data between a small number of sites
High-bandwidth peering to provide access to US, European, Asia-Pacific, and other research and education networks
The way science is done has changed dramatically in the past 10 years. Some people describe it as “Science is a team sport.”
These changes lead to an increase in scientific productivity: shorter turnaround time for disseminating, assimilating, and testing new ideas
Basic services: e-mail, file transfer, remote login, distributed file systems
Teleconferencing
Remote access to unique facilities (experiments, supercomputers, databases, installed codes)
Objective: Support scientific research by providing seamless and ubiquitous access to the facilities, data, and colleagues

4 How is ESnet Managed?
A community endeavor
Strategic guidance from the OSC programs
Energy Science Network Steering Committee (ESSC)
BES represented by Nestor Zaluzec, ANL, and Jeff Nichols, ORNL
Network operation is a shared activity with the community
ESnet Site Coordinators Committee
Ensures the right operational “sociology” for success
Complex and specialized, both in the network engineering and the network management, in order to provide its services to the laboratories in an integrated support environment
Extremely reliable in several dimensions
Taken together, these points make ESnet a unique facility supporting DOE science that is quite different from a commercial ISP or university network

5 …what now???
VISION: A scalable, secure, integrated network environment for ultra-scale distributed science is being developed to make it possible to combine resources and expertise to address complex questions that no single institution could manage alone.
Network Strategy
Production network: base TCP/IP services; +99.9% reliable
High-impact network: increments of 10 Gbps; switched lambdas (other solutions); 99% reliable
Research network: interfaces with production, high-impact, and other research networks; start electronic and advance towards optical switching; very flexible [UltraScience Net]
Revisit governance model
SC-wide coordination
Advisory Committee involvement
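The reliability targets in the network strategy are easier to compare as allowed downtime. The following is a minimal sketch that assumes "reliable" means availability measured over a 365-day year; the function name is illustrative and does not come from the slides.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years (assumption)


def downtime_hours_per_year(availability_percent: float) -> float:
    """Hours per year a service may be unavailable at the given availability."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


# Targets quoted on the slide for two of the three network tiers.
for tier, pct in [("production", 99.9), ("high-impact", 99.0)]:
    print(f"{tier}: {pct}% available -> {downtime_hours_per_year(pct):.2f} h/yr of downtime")
```

Under these assumptions the production target allows under nine hours of outage per year, while the high-impact target allows roughly ten times as much.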

6 Where do you come in?
Early identification of requirements
Evolving programs
New facilities
Participation in management activities
Interaction with BES representatives on ESSC
Next ESSC meeting on Oct in DC area

7 What Does ESnet Provide?
A network connecting DOE Labs and their collaborators that is critical to the future process of science
An architecture tailored to accommodate DOE’s large-scale science: move huge amounts of data between a small number of sites
High-bandwidth access to DOE’s primary science collaborators: research and education institutions in the US, Europe, Asia Pacific, and elsewhere
Full access to the global Internet for DOE Labs
Comprehensive user support, including “owning” all trouble tickets involving ESnet users (including problems at the far end of an ESnet connection) until they are resolved; 24x7 coverage
Grid middleware and collaboration services supporting collaborative science: trust, persistence, and science-oriented policy

8 What is ESnet Today? Essentially all of the national data traffic supporting US science is carried by two networks – ESnet and Internet-2 / Abilene (which plays a similar role for the university community)

9 How Do Networks Work?
Accessing a service, Grid or otherwise, such as a Web server, FTP server, etc., from a client computer and client application (e.g. a Web browser) involves
Target host names
Host addresses
Service identification
Routing
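The first three items on this slide (host name, host address, service identification) can be sketched with the Python standard library. This is an illustration only: the URLs and the `DEFAULT_PORTS` table are assumptions, and `resolve` needs working DNS, so it is defined but not called here. Routing, the fourth item, happens in the network and is not visible to the client.

```python
import socket
from urllib.parse import urlsplit

# Well-known default ports for a few URL schemes (illustrative subset).
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def describe_target(url: str) -> dict:
    """Extract the target host name and the service (port) from a URL."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return {"host_name": parts.hostname, "service_port": port}


def resolve(host_name: str, port: int) -> list:
    """Map a host name to its addresses via the system resolver (requires DNS)."""
    infos = socket.getaddrinfo(host_name, port, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


print(describe_target("http://www.example.org/index.html"))
```

A client would then call `resolve(...)` on the host name and open a connection to one of the returned addresses at the service port.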

10 How Do Networks Work?
[Diagram labels: LBNL, ESnet (core network), Big ISP (e.g. SprintLink), Google, Inc., DNS; gateway, border, core, and peering routers]
Core routers focus on high-speed packet forwarding
Peering routers exchange reachability information (“routes”), implement/enforce routing policy for each provider, and provide cyberdefense
Border/gateway routers implement separate site and network provider policy (including site firewall policy)

11 ESnet Core is a High-Speed Optical Network
[Diagram labels: ESnet site, site LAN, site IP router (RTR), site-ESnet network policy demarcation (“DMZ”), 10GE links, ESnet hub, ESnet IP router (RTR), ESnet core optical fiber ring]
Wave division multiplexing: today typically 64 x 10 Gb/s optical channels per fiber; channels (referred to as “lambdas”) are usually used in bi-directional pairs
Lambda channels are converted to electrical channels: usually SONET data framing or Ethernet data framing; can be clear digital channels (no framing, e.g. for digital HDTV)
A ring topology network is inherently reliable: all single point failures are mitigated by routing traffic in the other direction around the ring.
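The ring-reliability claim can be checked with a small graph search. In this sketch the hub abbreviations are borrowed from a later slide, but their order around the ring is an assumption made for illustration, not the actual fiber layout.

```python
from collections import deque

# Hypothetical ordering of hubs around the ring (assumption).
RING = ["SNV", "CHI", "AOA", "DC", "ATL", "ELP"]


def ring_links(nodes):
    """Undirected links joining each node to its neighbor, closing the ring."""
    return {frozenset((nodes[i], nodes[(i + 1) % len(nodes)])) for i in range(len(nodes))}


def shortest_path(links, src, dst):
    """Breadth-first search over undirected links; returns a node list or None."""
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == dst:
            return path
        for link in links:
            if node in link:
                (other,) = link - {node}
                if other not in seen:
                    seen.add(other)
                    queue.append(path + [other])
    return None


links = ring_links(RING)
print(shortest_path(links, "SNV", "CHI"))             # direct link
failed = frozenset(("SNV", "CHI"))
print(shortest_path(links - {failed}, "SNV", "CHI"))  # the long way around the ring
```

With one link removed every hub is still reachable; with two links removed a hub can be cut off, which is why the slide speaks of single point failures only.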

12 ESnet core: Packet over SONET Optical Ring and Hubs
ESnet Provides Full Internet Service to DOE Facilities and Collaborators with High-Speed Access to all Major Science Collaborators
[Network map: ESnet core, Packet over SONET Optical Ring and Hubs]
International connections shown: Australia, CA*net4, Taiwan (TANet2), Singaren, KDDI (Japan), France, Switzerland, GEANT (Germany, France, Italy, UK, etc.), SInet (Japan), Japan-Russia (BINP), MREN, Netherlands, Russia, StarTap, Taiwan (ASCC), CERN
Hubs: SEA, SNV, CHI, NYC, DC, ATL, ALB, ELP
Sites shown: LIGO, PNNL, MIT, TWC, INEEL, BNL, SNLL, ANL-DC, INEEL-DC, ORAU-DC, LLNL/LANL-DC, JGI, FNAL, LLNL, AMES, PPPL, ANL, LBNL, NERSC, 4xLAB-DC, SLAC, GTN&NNSA, BECHTEL, NREL, KCP, JLAB, YUCCA MT, ORNL, ORAU, SDSC, LANL, OSTI, ARM, SNLA, NOAA, SRS, GA, Allied Signal, PANTEX, DOE-ALB
Exchange and peering points shown: PNWG, MAN LAN, Starlight, Chi NAP, NY-NAP, MAE-E, MAE-W, PAIX-E, PAIX-W, Fix-W, Equinix, QWEST ATM, Abilene
42 end user sites: Office of Science sponsored (22), NNSA sponsored (12), Joint sponsored (3), Other sponsored (NSF LIGO, NOAA), Laboratory sponsored (6)
Link legend: International (high speed), OC192 (10 Gb/s optical), OC48 (2.5 Gb/s optical), Gigabit Ethernet (1 Gb/s), OC12 ATM (622 Mb/s), OC12, OC3 (155 Mb/s), T3 (45 Mb/s), T1-T3, T1 (1 Mb/s)
Also marked: peering points, Abilene hubs, high-speed peering points

13 ESnet’s Peering Infrastructure Connects the DOE Community With its Collaborators
[Peering map: ESnet Peering (connections to other networks), categorized as University, International, and Commercial]
International and R&E networks shown: CA*net4, CERN, MREN, Netherlands, Russia, StarTap, Taiwan (ASCC), Australia, Taiwan (TANet2), Singaren, GEANT (Germany, France, Italy, UK, etc.), SInet (Japan), KEK, Japan-Russia (BINP), KDDI (Japan), France, Abilene, CalREN2, CENIC, TECHnet
Hubs, sites, and exchange points shown: SEA HUB, SNV HUB, CHI NAP, NYC HUBS, ATL HUB, PNW-GPOP, Distributed 6TAP, STARLIGHT, NY-NAP, MAE-E, MAE-W, PAIX-E, PAIX-W, MAX GPOP, EQX-ASH, EQX-SJ, FIX-W, LBNL, GA, LANL, SDSC
Peer counts labeled on the map range from 1 to 39 per exchange point; one label reads "Abilene + 7 Universities"
ESnet provides access to all of the Internet by managing the full complement of Global Internet routes (about 150,000) at 10 general/commercial peering points, plus high-speed peerings with Abilene and the international R&E networks. This is a lot of work, and is very visible, but provides full access for DOE.

14 What is Peering?
Peering points exchange routing information that says “which packets I can get closer to their destination”
ESnet daily peering report (top 20 of about 100)
AS      routes  peer
1239    63384   SPRINTLINK
701     51685   UUNET-ALTERNET
209     47063   QWEST
3356    41440   LEVEL3
3561    35980   CABLE-WIRELESS
7018    28728   ATT-WORLDNET
2914    19723   VERIO
3549    17369   GLOBALCENTER
5511    8190    OPENTRANSIT
174     5492    COGENTCO
6461    5032    ABOVENET
7473    4429    SINGTEL
3491    3529    CAIS
11537   3327    ABILENE
5400    3321    BT
4323    2774    TWTELECOM
4200    2475    ALERON
6395    2408    BROADWING
2828    2383    XO
7132    1961    SBC
This is a lot of work
Callout: peering with this outfit is not random; it carries routes that ESnet needs (e.g. to the Russian Backbone Net)
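The idea that each peer advertises "which packets I can get closer to their destination" can be illustrated with longest-prefix matching, the rule IP routers use to pick among overlapping routes. The sketch below is not ESnet's routing configuration: the prefixes are documentation address ranges and their assignment to peers from the report is hypothetical.

```python
import ipaddress

# Hypothetical routing table: advertised prefix -> peer that advertised it.
ROUTES = {
    "203.0.113.0/24": "SPRINTLINK",
    "198.51.100.0/24": "ABILENE",
    "198.51.100.128/25": "CAIS",
}


def best_route(dest: str, routes=ROUTES):
    """Return (prefix, peer) for the most specific prefix containing dest, or None."""
    addr = ipaddress.ip_address(dest)
    matches = [ipaddress.ip_network(p) for p in routes if addr in ipaddress.ip_network(p)]
    if not matches:
        return None  # no peer advertised a route covering this destination
    best = max(matches, key=lambda net: net.prefixlen)
    return str(best), routes[str(best)]


print(best_route("198.51.100.200"))  # covered by both the /24 and the /25
print(best_route("192.0.2.1"))       # covered by no advertised prefix
```

A destination with no matching prefix is unreachable, which is the practical reason for carrying routes from many peers.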

15 What is Peering?
Why so many routes? So that when I want to get to someplace out of the ordinary, I can get there. For example: (Technological Design Institute of Applied Microelectronics, Novosibirsk, Russia)
Peering routers
Start: snv-lbl-oc48.es.net    ESnet core
snvrt1-ge0-snvcr1.es.net    ESnet peering at Sunnyvale
pos3-0.cr01.sjo01.pccwbtn.net    AS3491 CAIS Internet
pos5-1.cr01.chc01.pccwbtn.net    “ “
pos6-1.cr01.vna01.pccwbtn.net
pos5-3.cr02.nyc02.pccwbtn.net
pos6-0.cr01.ldn01.pccwbtn.net
rbnet.pos4-1.cr01.ldn01.pccwbtn.net    AS3491->AS5568 (Russian Backbone Network) peering point
MSK-M9-RBNet-5.RBNet.ru    Russian Backbone Network
MSK-M9-RBNet-1.RBNet.ru
NSK-RBNet-2.RBNet.ru
Finish: Novosibirsk-NSC-RBNet.nsc.ru    RBN to AS 5387 (NSCNET-2)

16 Predictive Drivers for the Evolution of ESnet
August 13-15, 2002
Organized by Office of Science: Mary Anne Scott (Chair), Dave Bader, Steve Eckstrand, Marvin Frazier, Dale Koelling, Vicky White
Workshop Panel Chairs: Ray Bair and Deb Agarwal; Bill Johnston and Mike Wilde; Rick Stevens; Ian Foster and Dennis Gannon; Linda Winkler and Brian Tierney; Sandy Merola and Charlie Catlett
The network is needed for:
long term (final stage) data analysis
“control loop” data analysis (influence an experiment in progress)
distributed, multidisciplinary simulation
The network and middleware requirements to support DOE science were developed by the OSC science community representing major DOE science disciplines:
Climate
Spallation Neutron Source
Macromolecular Crystallography
High Energy Physics
Magnetic Fusion Energy Sciences
Chemical Sciences
Bioinformatics
Available at

17 The Analysis was Driven by the Evolving Process of Science
Table columns: Discipline (feature); Vision for the Future Process of Science; Characteristics that Motivate High Speed Nets; Requirements (Networking, Middleware)

Climate (near term)
Vision: Analysis of model data by selected communities that have high speed networking (e.g. NCAR and NERSC)
Characteristics: A few data repositories, many distributed computing sites; NCAR - 20 TBy, NERSC - 40 TBy, ORNL - 40 TBy
Networking: Authenticated data streams for easier site access through firewalls
Middleware: Server side data processing (computing and cache embedded in the net); information servers for global data catalogues

Climate (5 yr)
Vision: Enable the analysis of model data by all of the collaborating community
Characteristics: Add many simulation elements/components as understanding increases; 100 TBy / 100 yr generated simulation data, 1-5 PBy / yr (just at NCAR); distribute large chunks of data to major users for post-simulation analysis
Networking: Robust access to large quantities of data
Middleware: Reliable data/file transfer (across system / network failures)

Climate (5-10 yr)
Vision: Integrated climate simulation that includes all high-impact factors
Characteristics: 5-10 PBy/yr (at NCAR); add many diverse simulation elements/components, including from other disciplines - this must be done with distributed, multidisciplinary simulation; virtualized data to reduce storage load
Networking: Robust networks supporting distributed simulation - adequate bandwidth and latency for remote analysis and visualization of massive datasets; quality of service guarantees for distributed simulations
Middleware: Virtual data catalogues and work planners for reconstituting the data on demand

18 Evolving Quantitative Science Requirements for Networks
Table columns, in order: Science Area; Today End2End Throughput; 5 years End2End Throughput; 5-10 Years End2End Throughput; Remarks
High Energy Physics; 0.5 Gb/s; 100 Gb/s; 1000 Gb/s; high bulk throughput
Climate (Data & Computation); Gb/s; N x 1000 Gb/s
SNS NanoScience; Not yet started; 1 Gb/s; 1000 Gb/s + QoS for control channel; remote control and time critical throughput
Fusion Energy; 0.066 Gb/s (500 MB/s burst); 0.198 Gb/s (500 MB / 20 sec. burst); time critical throughput
Astrophysics; 0.013 Gb/s (1 TBy/week); N*N multicast; computational steering and collaborations
Genomics Data & Computation; 0.091 Gb/s (1 TBy/day); 100s of users; high throughput and steering
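Several entries in this table pair a sustained rate in Gb/s with a data volume per period. The conversion can be sketched as follows, assuming decimal units (1 TBy = 10^12 bytes, 1 MB = 10^6 bytes); the slide does not state which convention it used, so the results agree with its figures only approximately.

```python
def gbps(num_bytes: float, seconds: float) -> float:
    """Average rate in gigabits per second for num_bytes moved in seconds."""
    return num_bytes * 8 / seconds / 1e9


TB = 1e12        # decimal terabyte (assumption)
MB = 1e6         # decimal megabyte (assumption)
DAY = 86400      # seconds per day
WEEK = 7 * DAY   # seconds per week

print(f"1 TBy/week      = {gbps(1 * TB, WEEK):.4f} Gb/s")
print(f"1 TBy/day       = {gbps(1 * TB, DAY):.4f} Gb/s")
print(f"500 MB in 20 s  = {gbps(500 * MB, 20):.4f} Gb/s")
```

The first result rounds to the 0.013 Gb/s quoted for Astrophysics; the other two land within a few percent of the 0.091 Gb/s and 0.198 Gb/s entries.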

19 Observed Drivers for ESnet Evolution
Are we seeing the predictions of two years ago come true? Yes!

20 OSC Traffic Increases by 1.9-2.0 X Annually
ESnet is currently transporting about 250 terabytes/mo. (250,000,000 MBy/mo.)
[Chart: ESnet Monthly Accepted Traffic, TBytes/Month]
Annual growth in the past five years has increased from 1.7x annually to just over 2.0x annually.
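The growth factors on this slide translate directly into doubling times and simple projections. The sketch below assumes the annual factor stays constant, which is an extrapolation for illustration and not a forecast taken from the presentation.

```python
import math


def project(monthly_tb: float, annual_factor: float, years: float) -> float:
    """Monthly traffic after `years` of compound growth at `annual_factor` per year."""
    return monthly_tb * annual_factor ** years


def doubling_time_years(annual_factor: float) -> float:
    """Years needed for traffic to double at a constant annual growth factor."""
    return math.log(2) / math.log(annual_factor)


for factor in (1.7, 2.0):
    print(f"{factor}x/yr: doubles every {doubling_time_years(factor):.2f} yr")

# Starting from the 250 TBy/mo. quoted on the slide.
print(f"250 TBy/mo. at 2.0x/yr for 5 yr -> {project(250, 2.0, 5):,.0f} TBy/mo.")
```

At 2.0x per year, five years of growth multiplies traffic by 32.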

21 ESnet Top 20 Data Flows, 24 hr. avg., 2004-04-20
ESnet is Engineered to Move a Lot of Data
A small number of science users account for a significant fraction of all ESnet traffic
[Chart: ESnet Top 20 Data Flows, 24 hr. avg., 2004-04-20; chart label: 1 Terabyte/day]
Flows labeled on the chart:
SLAC (US) -> IN2P3 (FR)
Fermilab (US) -> CERN
SLAC (US) -> INFN Padua (IT)
Fermilab (US) -> U. Chicago (US)
U. Toronto (CA) -> Fermilab (US)
CEBAF (US) -> IN2P3 (FR)
Helmholtz-Karlsruhe (DE) -> SLAC (US)
INFN Padua (IT) -> SLAC (US)
DOE Lab -> DOE Lab
SLAC (US) -> JANET (UK)
Fermilab (US) -> JANET (UK)
DOE Lab -> DOE Lab
Argonne (US) -> Level3 (US)
Argonne -> SURFnet (NL)
IN2P3 (FR) -> SLAC (US)
Fermilab (US) -> INFN Padua (IT)
Since BaBar data analysis started, the top 20 ESnet flows have consistently accounted for > 50% of ESnet’s monthly total traffic (~130 of 250 TBy/mo)

22 ESnet Top 10 Data Flows, 1 week avg., 2004-07-01
The traffic is not transient: daily and weekly averages are about the same.
SLAC is a prototype for what will happen when Climate, Fusion, SNS, Astrophysics, etc., start to ramp up the next generation science
ESnet Top 10 Data Flows, 1 week avg., 2004-07-01
SLAC (US) -> INFN Padua (IT)    5.9 Terabytes
SLAC (US) -> IN2P3 (FR)    5.3 Terabytes
FNAL (US) -> IN2P3 (FR)    2.2 Terabytes
FNAL (US) -> U. Nijmegen (NL)    1.0 Terabytes
SLAC (US) -> Helmholtz-Karlsruhe (DE)    0.9 Terabytes
CERN -> FNAL (US)    1.3 Terabytes
U. Toronto (CA) -> Fermilab (US)    0.9 Terabytes
FNAL (US) -> Helmholtz-Karlsruhe (DE)    0.6 Terabytes
FNAL (US) -> SDSC (US)    0.6 Terabytes
U. Wisc. (US) -> FNAL (US)    0.6 Terabytes

23 ESnet is a Critical Element of Large-Scale Science
ESnet is a critical part of the large-scale science infrastructure of high energy physics experiments, climate modeling, magnetic fusion experiments, astrophysics data analysis, etc. As other large-scale facilities – such as SNS – turn on, this will be true across DOE

24 Science Mission Critical Infrastructure
ESnet is a visible and critical piece of general DOE science infrastructure: if ESnet fails, tens of thousands of DOE and university users know it within minutes if not seconds
Requires high reliability and high operational security in the network operations and in ESnet infrastructure support (the systems that support the operation and management of the network and services)
Secure and redundant mail and Web systems are central to the operation and security of ESnet:
trouble tickets are by e-mail
engineering communication is by e-mail
engineering database interface is via Web
Secure network access to Hub equipment
Backup secure telephony access to all routers
24x7 help desk (joint w/ NERSC) and 24x7 on-call network engineers

25 Automated, real-time monitoring of traffic levels and operating state of some 4400 network entities is the primary network operational and diagnosis tool
[Views shown: Performance; Network Configuration; OSPF Metrics (routing and connectivity); Hardware Configuration; SecureNet; IBGP Mesh (routing and connectivity)]
Physical Topology documents devices and their connections, including interface names and addresses.
Core map shows our connections to Qwest.
SecureNet map shows the PVPs in use between SecureNet sites and encapsulation points.
OSPF map shows how we have manually set OSPF metrics to optimize routing.
IBGP map shows where we are using full meshing and where we are using route reflection.
LANWAN system is another interface into all the site diagrams, showing equipment and interconnections at each site.

26 ESnet’s Physical Infrastructure
Equipment rack detail at NYC Hub, 32 Avenue of the Americas (one of ESnet’s core optical ring sites) Picture detail

27 Typical Equipment of an ESnet Core Network Hub
[Equipment rack diagram: 32 AofA HUB, NYC, NY (~$1.8M, list); labels: ESnet core, Qwest]
Qwest DS3 DCX
Sentry power 48v 30/60 amp panel ($3900 list)
AOA Performance Tester ($4800 list)
Sentry power 48v 10/25 amp panel ($3350 list)
DC / AC Converter ($2200 list)
Cisco 7206 AOA-AR1 (low speed links to MIT & PPPL) ($38,150 list)
Lightwave Secure Terminal Server ($4800 list)
Juniper T320 AOA-CR1 (core router) ($1,133,000 list)
Juniper OC192 Optical Ring Interface (the AOA end of the OC192 to CHI) ($195,000 list)
Juniper OC48 Optical Ring Interface (the AOA end of the OC48 to DC-HUB) ($65,000 list)
Juniper M20 AOA-PR1 (peering RTR) ($353,000 list)
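As a quick check, the itemized list prices on this slide can be summed and compared with the "~$1.8M, list" figure quoted for the hub. The short item labels below are abbreviations chosen for this sketch; the prices are the ones shown on the slide, and the Qwest DS3 DCX is omitted because no price is given for it.

```python
# List prices in US dollars, as shown on the slide.
HUB_EQUIPMENT = [
    ("Sentry power 48v 30/60 amp panel", 3_900),
    ("AOA Performance Tester", 4_800),
    ("Sentry power 48v 10/25 amp panel", 3_350),
    ("DC / AC Converter", 2_200),
    ("Cisco 7206 AOA-AR1", 38_150),
    ("Lightwave Secure Terminal Server", 4_800),
    ("Juniper T320 AOA-CR1 (core router)", 1_133_000),
    ("Juniper OC192 Optical Ring Interface", 195_000),
    ("Juniper OC48 Optical Ring Interface", 65_000),
    ("Juniper M20 AOA-PR1 (peering router)", 353_000),
]

total = sum(price for _, price in HUB_EQUIPMENT)
core_router_share = dict(HUB_EQUIPMENT)["Juniper T320 AOA-CR1 (core router)"] / total

print(f"total list price: ${total:,}")
print(f"core router share: {core_router_share:.1%}")
```

The itemized prices add up to about $1.8M, and the core router alone accounts for well over half of that.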

28 Disaster Recovery and Stability
The network must be kept available even if, e.g., the West Coast is disabled by a massive earthquake, etc.
Reliable operation of the network involves
remote NOCs
replicated support infrastructure
generator backed UPS power at all critical network and infrastructure locations
non-interruptible core: the ESnet core operated without interruption through the N. Calif. power blackout of 2000, the 9/11/2001 attacks, and the Aug., 2003 NE States power blackout
[Diagram labels: SEA HUB, SNV HUB, CHI HUB, NYC HUBS, DC HUB, ATL HUB, ALB HUB, ELP HUB, LBNL, AMES, BNL, PPPL, TWC, DNS; Remote Engineer; Remote Engineer, partial duplicate infrastructure; Duplicate Infrastructure]
Engineers, 24x7 Network Operations Center, generator backed power
Spectrum (net mgmt system)
DNS (name - IP address translation)
Eng database
Load database
Config database
Public and private Web (server and archive)
PKI cert. repository and revocation lists
collaboratory authorization service
Duplicate Infrastructure: currently deploying full replication of the NOC databases and servers and Science Services databases in the NYC Qwest carrier hub

29 ESnet WAN Security and Cyberattack Defense
Cyber defense is a new dimension of ESnet security
Security is now inherently a global problem
As the entity with a global view of the network, ESnet has an important role in overall security
30 minutes after the Sapphire/Slammer worm was released, 75,000 hosts running Microsoft's SQL Server (port 1434) were infected. (“The Spread of the Sapphire/Slammer Worm,” David Moore (CAIDA & UCSD CSE), Vern Paxson (ICIR & LBNL), Stefan Savage (UCSD CSE), Colleen Shannon (CAIDA), Stuart Staniford (Silicon Defense), Nicholas Weaver (Silicon Defense & UC Berkeley EECS)) Jan., 2003

30 ESnet and Cyberattack Defense
Sapphire/Slammer worm infection hits creating almost a full Gb/s (1000 megabit/sec.) traffic spike on the ESnet backbone

31 Cyberattack Defense
[Diagram labels: attack traffic, peering routers, ESnet routers, border routers, gateway routers, LBNL, Lab; X markers on the filtered paths]
Lab first response: filter incoming traffic at their ESnet gateway router
ESnet first response: filters to assist a site
ESnet second response: filter traffic from outside of ESnet
ESnet third response: shut down the main peering paths and provide only limited bandwidth paths for specific “lifeline” services
Sapphire/Slammer worm infection created a Gb/s of traffic on the ESnet core until filters were put in place (both into and out of sites) to damp it out.
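The layered filtering described on this slide can be sketched as two predicates applied at different routers. This is an illustration of the idea only, not ESnet's or any lab's router configuration; the `Packet` fields and function names are hypothetical, and the port number is the one cited earlier in the deck for the Sapphire/Slammer worm.

```python
from dataclasses import dataclass

SLAMMER_PORT = 1434  # SQL Server port targeted by the worm; the worm used UDP


@dataclass(frozen=True)
class Packet:
    protocol: str            # "udp" or "tcp"
    dst_port: int
    src_inside_esnet: bool   # True if the packet originated at an ESnet site


def matches_worm(packet: Packet) -> bool:
    """True if the packet looks like worm traffic by protocol and port."""
    return packet.protocol == "udp" and packet.dst_port == SLAMMER_PORT


def site_gateway_drops(packet: Packet) -> bool:
    """Lab first response: filter matching traffic at the site's gateway router."""
    return matches_worm(packet)


def peering_router_drops(packet: Packet) -> bool:
    """ESnet second response: filter matching traffic arriving from outside ESnet."""
    return (not packet.src_inside_esnet) and matches_worm(packet)


attack = Packet("udp", 1434, src_inside_esnet=False)
web = Packet("tcp", 80, src_inside_esnet=False)
print(peering_router_drops(attack), peering_router_drops(web))
```

Filtering at the peering routers stops the traffic before it crosses the core, while site gateway filters also contain traffic between infected hosts at ESnet sites.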

32 Science Services: Support for Shared, Collaborative Science Environments
X.509 identity certificates and Public Key Infrastructure provide the basis of secure, cross-site authentication of people and systems
ESnet negotiates the cross-site, cross-organization, and international trust relationships to provide policies that are tailored to collaborative science in order to permit sharing computing and data resources, and other Grid services
Certification Authority (CA) issues certificates after validating request against policy
This service was the basis of the first routine sharing of HEP computing resources between US and Europe

33 Science Services: Public Key Infrastructure
* Report as of July 15, 2004

34 Voice, Video, and Data Tele-Collaboration Service
Another highly successful ESnet Science Service is the audio, video, and data teleconferencing service to support human collaboration
Seamless voice, video, and data teleconferencing is important for geographically dispersed scientific collaborators
ESnet currently provides to more than a thousand DOE researchers and collaborators worldwide:
H.323 (IP) videoconferences (4000 port hours per month and rising)
audio conferencing (2500 port hours per month) (constant)
data conferencing (150 port hours per month)
Web-based, automated registration and scheduling for all of these services
Huge cost savings for the Labs

35 ESnet’s Evolution over the Next 10-20 Years
Upgrading ESnet to accommodate the anticipated increase from the current 100%/yr traffic growth to 300%/yr over the next 5-10 years is priority number 7 out of 20 in DOE’s “Facilities for the Future of Science – A Twenty Year Outlook”
Based on the requirements of the OSC Network Workshops, ESnet must address
Capable, scalable, and reliable production IP networking
University and international collaborator connectivity
Scalable, reliable, and high bandwidth site connectivity
Network support of high-impact science: provisioned circuits with guaranteed quality of service (e.g. dedicated bandwidth)
Science Services to support Grids, collaboratories, etc.

36 New ESnet Architecture to Accommodate OSC
The future requirements cannot be met with the current, telecom provided, hub and spoke architecture of ESnet
[Diagram labels: ESnet Core; hubs Chicago (CHI), New York (AOA), Washington, DC (DC), Sunnyvale (SNV), Atlanta (ATL), El Paso (ELP); DOE sites]
The core ring has good capacity and resiliency against single point failures, but the point-to-point tail circuits are neither reliable nor scalable to the required bandwidth

37 Evolving Requirements for DOE Science Network Infrastructure
[Four-panel diagram: 1-3 yr Requirements, 2-4 yr Requirements, 3-5 yr Requirements, 4-7 yr Requirements]
Legend: C compute, S storage, I instrument, C&C cache & compute
Panel captions: In the near term applications need higher bandwidth; high bandwidth QoS; high bandwidth and QoS; network resident cache and compute elements; robust bandwidth (multiple paths); guaranteed bandwidth paths; 1-40 Gb/s, end-to-end; Gb/s, end-to-end

38 ESnet new architecture goals: full redundant connectivity for every site and high-speed access for every site (at least 10 Gb/s)
Two part strategy
1) MAN rings provide dual site connectivity and much higher site bandwidth
2) A second backbone will provide
multiply connected MAN rings for protection against hub failure
extra backbone capacity
a platform for provisioned, guaranteed bandwidth circuits
alternate path for production IP traffic
carrier neutral hubs
[Diagram labels: Europe, Asia-Pacific, NLR (2nd Backbone), ESnet Existing Core, DOE/OSC sites; existing hubs and new hubs among Chicago (CHI), New York (AOA), Sunnyvale (SNV), Washington, DC (DC), Atlanta (ATL), El Paso (ELP)]

39 Conclusions
ESnet is an infrastructure that is critical to DOE’s science mission
Focused on the Office of Science Labs, but serves many other parts of DOE
ESnet is working hard to meet the current and future networking needs of DOE mission science in several ways:
Evolving a new high speed, high reliability, leveraged architecture
Championing several new initiatives which will keep ESnet’s contributions relevant to the needs of our community

40 Reference -- Planning Workshops
High Performance Network Planning Workshop, August 2002
DOE Workshop on Ultra High-Speed Transport Protocols and Network Provisioning for Large-Scale Science Applications, April 2003
Science Case for Large Scale Simulation, June 2003
DOE Science Networking Roadmap Meeting, June 2003
Workshop on the Road Map for the Revitalization of High End Computing, June 2003 (public report)
ASCR Strategic Planning Workshop, July 2003
Planning Workshops-Office of Science Data-Management Strategy, March & May 2004 (report coming soon)

