
1 US LHCNet
J. Bunn, D. Nae, H. Newman, S. Ravot, R. Voicu, X. Su, Y. Xia - California Institute of Technology
US LHC Network Working Group Meeting, 23-24 October 2006, FNAL

2 US LHCNet Working Methods
Production Network
- Develop and build next generation networks
- Networks for research: D0, CDF, BaBar, CMS, ATLAS
- GRID applications: PPDG/iVDGL, OSG, WLCG, DISUN
- Interconnection of US and EU Grid domains
- VRVS/EVO
- High performance, high bandwidth, reliable network
- HEP & DoE roadmaps
Testbed for Grid Development
- Pre-production N x 10 Gbps transatlantic testbed
- New data transport protocols
- Interface and kernel settings
- HOPI / UltraScience Net / UltraLight / CHEPREO / LambdaStation
- Lightpath technologies
- Vendor partnerships

3 US LHCNet
[Map: LHCNet Data Network (10 Gb/s) linking CERN, New York and Chicago to the ESnet core (10 Gbps enterprise IP traffic), the MAN rings, USNet 10 Gbps circuit-based transport (DOE funded), GEANT2, SURFNet, and international peerings toward Japan, Australia and AsiaPac; major DOE Office of Science sites and high-speed cross connects with Internet2/Abilene shown]
- Connections to ESnet hubs in New York and Chicago
- Redundant "light-paths" to BNL and FNAL (≥ 2.5 Gb/s)
- Access to USNet for R&D

4 Challenges: Network Infrastructure
- Increased capacity
  - Accommodate the large and increasing amount of data that must reach the HEP DoE Labs
  - Also ensure data can reach the Tier2s, where much or most of the data analysis will be performed
- High network reliability
  - Essential for distributed applications
  - Availability requirement: 99.95% (see the sketch below)
- LHC OPN
  - "Virtual Private Circuits" to FNAL and BNL
"Providing the right cost-effective transatlantic network infrastructure for the HEP community's needs"
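The 99.95% availability target translates directly into a downtime budget. A minimal Python sketch of that arithmetic; the availability figure is from the slide, while the reporting windows are illustrative:

```python
# Convert an availability target into an allowed-downtime budget.
# 99.95% is the slide's requirement; the time windows are illustrative.

def downtime_budget_hours(availability: float, window_hours: float) -> float:
    """Allowed downtime (hours) for a given availability over a window."""
    return (1.0 - availability) * window_hours

TARGET = 0.9995  # 99.95% availability requirement

for label, hours in [("per month", 30 * 24), ("per year", 365 * 24)]:
    minutes = downtime_budget_hours(TARGET, hours) * 60
    print(f"{label}: ~{minutes:.0f} minutes of downtime allowed")
# per month: ~22 minutes; per year: ~263 minutes (about 4.4 hours)
```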

5 Challenges: Services and Integration
- New network services to provide bandwidth guarantees
  - Provide for data transfer deadlines for remote data analysis (sketch below)
  - Traffic isolation for unfriendly data transport protocols
  - Remote Operations Center at Fermilab
- Integration
  - Match network bandwidth management services to the storage systems and management methods at the US Tier1s
  - Also at other Tier1s and Tier2s
  - Match with the overall data production and analysis-application needs of the LHC experiments as they develop and ramp up
  - Work with the US LHC Network WG, the ATLAS and CMS Network Working Groups, and the WLCG to achieve these aims
"Providing the right cost-effective transatlantic network infrastructure for the HEP community's needs"
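One way to see what "bandwidth guarantees for data transfer deadlines" means in practice is to compute the sustained rate a deadline implies. A hedged sketch; the dataset size and deadline are made-up illustrative numbers, not figures from the slides:

```python
# Sketch: sustained rate needed to deliver a dataset before a deadline.
# The dataset size and deadline are illustrative assumptions.

def required_gbps(dataset_terabytes: float, deadline_hours: float) -> float:
    """Sustained rate in Gbit/s needed to move the dataset within the deadline."""
    bits = dataset_terabytes * 1e12 * 8          # TB -> bits (decimal units)
    seconds = deadline_hours * 3600
    return bits / seconds / 1e9

# e.g. a 30 TB event dataset needed at a remote analysis site within 12 hours:
print(f"{required_gbps(30, 12):.2f} Gbit/s sustained")   # ~5.56 Gbit/s
```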

6 LHCNet services
- Redundant peerings with ESnet and Abilene
  - At Chicago and New York
- Access to US Tier1s
  - BNL (ESnet LIMAN), FNAL (ESnet CHIMAN, lightpath)
  - Redundant L2 VPN
- LHC Optical Private Network (LHCOPN) compliant
- CERN commodity Internet
- Connectivity to Asia (Taiwan, Korea)
  - Backup path for CERN-Taiwan traffic
- Technology
  - IPv4 & IPv6; Layer 2 VPN; QoS; Scavenger; large MTU (9k, see the sketch below); monitoring (MonALISA)
- Clean separation of production and R&D traffic based on Layer 2 circuits
  - Highest priority to production
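The 9 KB MTU matters because, for loss-limited standard TCP, achievable throughput scales with segment size over a long round-trip time. A hedged sketch of the well-known Mathis et al. approximation; the RTT and loss rate are illustrative assumptions, not measured LHCNet values:

```python
# Sketch: Mathis et al. approximation  BW ~ (MSS/RTT) * C/sqrt(p), C ~ 1.22,
# showing why a 9000-byte MTU raises the loss-limited ceiling of standard TCP
# on a transatlantic path. RTT and loss probability are assumed values.
import math

C = math.sqrt(3.0 / 2.0)   # ~1.22

def mathis_gbps(mss_bytes: int, rtt_s: float, loss_prob: float) -> float:
    return (mss_bytes * 8 / rtt_s) * C / math.sqrt(loss_prob) / 1e9

RTT = 0.120    # ~120 ms round trip (assumed)
LOSS = 1e-7    # one loss per 10 million packets (assumed)

for mtu in (1500, 9000):
    mss = mtu - 40   # subtract IPv4 + TCP headers
    print(f"MTU {mtu}: loss-limited ceiling ~{mathis_gbps(mss, RTT, LOSS):.1f} Gbit/s")
# The jumbo MTU raises the ceiling by the MSS ratio, roughly a factor of six.
```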

7 LHCNet configuration (July 2006)
- Co-operated by Caltech and CERN engineering teams
- Force10 platforms, 10GE WANPHY
- New PoP in NY since Sept. 2005
- 10 Gbps path to BNL since April 2006
- Connection to US universities via the UltraLight (NSF & university funded) backbone

8 BNL and Long Island MAN Ring (Feb. 2006 to 2008)
[Diagram: ESnet Long Island MAN with a diverse dual-core connection between 32 AoA (NYC) and Brookhaven National Lab (Upton, NY); 10GE circuits to the BNL IP gateway; ESnet SDN and IP cores toward Chicago (via Syracuse and Cleveland) and Washington; USLHCnet circuits (proposed), GEANT, CERN, Abilene, NYSERNet, SINet (Japan), CANARIE (Canada), HEANet (Ireland) and Qatar peerings at MAN LAN; KeySpan Communications DWDM ring; second MAN switch planned for 2007 or 2008]

9 FNAL and Chicago MAN Ring
[Diagram: ESnet Chicago MAN connecting the FNAL site gateway router and site equipment to the Qwest hub (NBC Building) and StarLight; ESnet IP and SDN cores (T320 routers), ORNL OC192, NRL and UltraScienceNet; shared IWire fiber (single lambda); LHCNet to CERN via MANLAN; all circuits 10 Gb/s]

10 LHCNet pictures
[Photos: StarLight and MANLAN PoPs - Force10 switches, Cisco 7609 and 7606 (UltraLight, NSF), Linux farm for R&D, production servers, and management / out-of-band access / monitoring equipment]

11 Future backbone topology (Jan 2007)
[Diagram: GVA-CHI-NY triangle plus Amsterdam; legend: LHCNet 10 Gb/s circuits (existing and FY2007) and "other" circuits - IRNC#1 New York, IRNC#2 Chicago, GLORIAD, SURFnet]
- GVA-CHI-NY triangle
- New PoP in Amsterdam
  - GEANT2 circuit between GVA and AMS should reduce bandwidth costs
  - Access to other transatlantic circuits: backup paths and additional capacity
  - Connection to NetherLight, GLIF (T1-T1 traffic and R&D)
- Call for tender issued by Caltech in May (8 of 10 replied)
  - Replacement of existing circuits wherever that is more cost-effective

12 Atlantic Ocean Multiple Fiber Paths: Reliability Through Diversity
[Map: diverse transatlantic routes - VSNL North and South, AC-2, Global Crossing, Qwest, Colt and GEANT, with landings at NY 111 8th Ave, NY 60 Hudson, Pottington and Whitesands (UK), Highbridge, AMS-SARA, Frankfurt and London - linking NY-MANLAN and CHI-StarLight to GVA-CERN]
- Unprotected circuits (lower cost)
- Service availability from the providers' offers:
  - Colt target service availability is 99.5%
  - Global Crossing guarantees wave availability at 98%
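Diversity pays because independently routed, unprotected circuits fail (to first order) independently. A small sketch of the arithmetic, using the provider figures quoted above and assuming independent failures:

```python
# Sketch: availability of N diverse circuits, assuming independent failures
# and that the service survives as long as at least one circuit is up.
# Per-circuit figures are the provider numbers quoted on the slide.

def combined_availability(per_circuit: list[float]) -> float:
    """Probability that at least one of the circuits is available."""
    p_all_down = 1.0
    for a in per_circuit:
        p_all_down *= (1.0 - a)
    return 1.0 - p_all_down

colt, gblx = 0.995, 0.98
print(f"Single Colt wave:        {colt:.4%}")
print(f"Colt + Global Crossing:  {combined_availability([colt, gblx]):.4%}")
# Two diverse ~98-99.5% circuits already exceed the 99.95% availability target.
```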

13 US LHCNet Organisation

14 Network Team (1)
- H. Newman
  - Network strategic planning, design, liaison and direction
- Engineer #1 (based at CERN)
  - Network element configuration and operation
  - Bandwidth management techniques in order to optimize the use of the network
  - Pre-production infrastructure available for Grid and high-speed data transfer developments
- Engineer #2 (based at CERN)
  - Supervises the day-to-day operation of the network and coordinates pre-production activities
  - Design, engineering and installation
  - Active member of the LHCOPN working group, the Internet2 HENP working group and the ICFA/SCIC group

15 Network Team (2)
- Engineer #3 (based at Caltech or CERN)
  - Helps with daily operation, installations and upgrades
  - Studies, evaluates and helps implement reservation, provisioning and scheduling mechanisms to take advantage of circuit-switched services
- Engineer #4 (based at Caltech)
  - Daily US LHCnet operation; emphasis on routing and peering issues in the US
  - Specialist in end-system configuration and tuning; advice and recommendations on system configurations
  - Considerable experience in the installation, configuration and operation of optical multiplexers and purely photonic switches

16 USLHCNet NOC
- The CERN Network Operation Center (NOC)
  - Delivers first-level support 24 hours a day, 7 days a week
  - Watches for alarms
  - A problem not resolved immediately is escalated to the Caltech network engineering team
- USLHCnet engineers on call 24x7
  - On site (at CERN) in less than 60 minutes
  - Remote hands service at MANLAN and StarLight is available on a 24x7 basis with a four-hour response time

17 Next Generation LHCNet: Add Optical Circuit-Oriented Services
Based on CIENA "Core Director" optical multiplexers:
- Robust fallback at the optical layer
- Sophisticated standards-based software: VCAT/LCAS (sizing sketch below)
- Circuit-oriented services: guaranteed-bandwidth Ethernet Private Line (EPL)
Circuit-oriented services:
- Bandwidth guarantees at flexible rates
- Provide for data transfer deadlines for remote data analysis
- Traffic isolation for unfriendly data transport protocols
- Security
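VCAT builds a right-sized circuit by concatenating fixed-size SONET members. The sketch below shows that sizing arithmetic for an EPL, assuming STS-3c members of roughly 150 Mbit/s; the granule choice and the requested rates are illustrative, not a statement about the actual LHCNet configuration:

```python
# Sketch: right-sizing a VCAT group for an Ethernet Private Line.
# Assumes STS-3c members with ~150 Mbit/s payload each (illustrative granule).
import math

STS3C_MBPS = 150.336   # nominal STS-3c payload rate

def vcat_members(requested_gbps: float) -> tuple[int, float]:
    """Members needed and the resulting group rate (Gbit/s) for a request."""
    members = math.ceil(requested_gbps * 1000 / STS3C_MBPS)
    return members, members * STS3C_MBPS / 1000

for want in (1.0, 2.5, 9.5):
    n, got = vcat_members(want)
    print(f"request {want} Gbit/s -> STS-3c-{n}v group, ~{got:.2f} Gbit/s")
# LCAS can later add or remove members to resize the group without tearing it down.
```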

18 Pre-Production Activities
- Prototype data movement services between CERN and the US
  - Service Challenges
- High-speed disk-to-disk throughput development
  - New end-systems (PCI-e; 64-bit CPUs; new 10 GE NICs)
  - New data transport protocols (FAST and others)
  - Linux kernel patches; RPMs for deployment (buffer-sizing sketch below)
- Monitoring, Command and Control Services (MonALISA)
- "Optical control plane" development
  - MonALISA services available for photonic switches
  - GMPLS (Generalized MPLS); G.ASON
  - Collaboration with the Cheetah and Dragon projects
- Note: equipment loans and donations; exceptional discounts
Prepare each year for the production network and usage of the following year: protocols, engineering, end-to-end monitoring
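One concrete piece of the kernel-settings work is sizing TCP buffers to the bandwidth-delay product of a 10 Gb/s transatlantic path. A hedged sketch of the arithmetic; the RTT is an assumed illustrative value, and the sysctl names mentioned in the comment are the standard Linux ones rather than a record of the actual tuning used:

```python
# Sketch: bandwidth-delay product for sizing TCP buffers on a long fat pipe.
# RTT is an assumed illustrative value for a Geneva-Chicago path.

def bdp_bytes(rate_gbps: float, rtt_s: float) -> int:
    """Bandwidth-delay product in bytes: the minimum window to keep the pipe full."""
    return int(rate_gbps * 1e9 * rtt_s / 8)

rate, rtt = 10.0, 0.120
bdp = bdp_bytes(rate, rtt)
print(f"BDP for {rate} Gbit/s x {rtt * 1000:.0f} ms: ~{bdp / 2**20:.0f} MiB")
# ~143 MiB: the net.ipv4.tcp_rmem / tcp_wmem maxima (and NIC/driver settings)
# must allow a window at least this large, or a single stream never fills the link.
```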

19 Internet2 Land Speed Records & SC2003-2005 Records
- IPv4 multi-stream record with FAST TCP: 6.86 Gbps x 27 kkm (11/2004)
- PCI-X 2.0: 9.3 Gbps Caltech-StarLight (12/05)
- PCI-E, one stream in 2006: 9.9 Gbps Caltech-SNV (10 GbE LAN-PHY); 9.2 Gbps Caltech-StarLight on UltraScience Net
- Concentrate now on reliable Terabyte-scale file transfers
  - Disk-to-disk marks: 536 MBytes/sec (Windows); 500 MBytes/sec (Linux)
  - System issues: PCI-X bus, network interfaces, disk I/O controllers, Linux kernel, CPU
- SC2003-5: 23, 101, 151 Gbps
[Chart: Internet2 LSRs by year, throughput in Petabit-m/sec, blue = HEP; latest HEP mark 7.2 Gbps x 20.7 kkm]
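The LSR metric is a bandwidth-distance product quoted in petabit-meters per second. A small sketch of that calculation for the marks quoted above:

```python
# Sketch: the Internet2 Land Speed Record metric is throughput x distance,
# quoted in petabit-meters per second. The input figures are from the slide.

def lsr_metric_pbm_per_s(gbps: float, kkm: float) -> float:
    """Bandwidth-distance product in petabit-meters/second."""
    bits_per_s = gbps * 1e9
    meters = kkm * 1e6          # 1 kkm = 1000 km = 1e6 m
    return bits_per_s * meters / 1e15

print(f"6.86 Gbps x 27.0 kkm -> ~{lsr_metric_pbm_per_s(6.86, 27.0):.0f} Pb*m/s")
print(f"7.20 Gbps x 20.7 kkm -> ~{lsr_metric_pbm_per_s(7.20, 20.7):.0f} Pb*m/s")
```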

20 Milestones: 2006-2007
- November 2006: Provisioning of new transatlantic circuits
- End 2006: Evaluation of CIENA platforms
  - Try-and-buy agreement
- Spring 2007: Deployment of next-generation US LHCNet
  - Transition to the new circuit-oriented backbone, based on optical multiplexers
  - Maintain full switched and routed IP service for a controlled portion of the bandwidth
- Summer 2007: Start of LHC operations

21 Primary Milestones for 2007-2010
- Provide a robust network service without service interruptions, through
  - Physical diversity of the primary links
  - Automated fallback at the optical layer
  - Mutual backup with other networks (ESnet, IRNC, CANARIE, SURFNet, etc.)
- Ramp up the bandwidth, supporting an increasing number of 1-10 Terabyte-scale flows
- Scale up and increase the functionality of the network management services provided
- Gain experience on policy-based network resource management, together with FNAL, BNL, and the US Tier2 organizations
- Integrate with the security (AAA) infrastructures of ESnet and the LHC OPN

22 Additional Technical Milestones for 2008-2010
Targeted at large-scale, resilient operation with a relatively small network engineering team
- 2008: Circuit-oriented services
  - Bandwidth provisioning automated (through the use of MonALISA services working with the CIENAs, for example)
  - Channels assigned to authenticated, authorized sites and/or user-groups
  - Based on a policy-driven network-management services infrastructure, currently under development (see the sketch below)
- 2008-2010: The network as a Grid resource
  - Extend advanced planning and optimization into the networking and data-access layers
  - Provide interfaces and functionality allowing physics applications to interact with the networking resources
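To make the policy-driven channel assignment idea concrete, here is a minimal hedged sketch of an allocator that grants channels to a site only if its policy cap and the free pool allow it. The site names, caps and 64-channel pool are illustrative assumptions, not the infrastructure under development:

```python
# Sketch of policy-driven channel assignment: grant a requested number of
# channels to a site only if its policy cap and the free pool allow it.
# Site names, caps and the 64-channel pool are illustrative assumptions.

from dataclasses import dataclass, field

@dataclass
class ChannelPool:
    total: int = 64                                   # channels on the link (assumed)
    caps: dict = field(default_factory=lambda: {"FNAL": 32, "BNL": 32, "Tier2": 8})
    granted: dict = field(default_factory=dict)       # site -> channels held

    def request(self, site: str, n: int) -> bool:
        held = self.granted.get(site, 0)
        free = self.total - sum(self.granted.values())
        if held + n <= self.caps.get(site, 0) and n <= free:
            self.granted[site] = held + n
            return True
        return False

pool = ChannelPool()
print(pool.request("FNAL", 20))   # True  - within FNAL's cap and the free pool
print(pool.request("Tier2", 10))  # False - exceeds the Tier2 policy cap of 8
```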

23 Conclusion
- US LHCNet: an extremely reliable, cost-effective, high-capacity network
  - A 20+ year track record
- High-speed interconnections with the major R&E networks and US T1 centers
- Taking advantage of rapidly advancing network technologies to meet the needs of the LHC physics program at moderate cost
- Leading-edge R&D projects as required, to build the next generation US LHCNet

24 Additional Slides

25 Next Generation LHCNet: Add Optical Circuit-Oriented Services
[Diagram: CERN-FNAL primary and secondary EPL circuits]
Based on CIENA "Core Director" optical multiplexers:
- Highly reliable in production environments
- Robust fallback at the optical layer
- Circuit-oriented services: guaranteed-bandwidth Ethernet Private Line (EPL)
- Sophisticated standards-based software: VCAT/LCAS
  - VCAT logical channels: highly granular bandwidth management
  - LCAS: dynamically adjust channels
- Highly scalable and cost-effective, especially for many OC-192 links
- Consistent with the directions of the other major R&E networks such as Internet2/Abilene, GEANT (pan-European) and ESnet SDN

26 SC2005: Caltech and FNAL/SLAC Booths - High Speed TeraByte Transfers for Physics
- We previewed the global-scale data analysis of the LHC era, using a realistic mixture of streams:
  - Organized transfer of multi-TB event datasets; plus
  - Numerous smaller flows of physics data that absorb the remaining capacity
- We used twenty-two [*] 10 Gbps waves to carry bidirectional traffic between Fermilab, Caltech, SLAC, BNL, CERN and other partner Grid sites including: Michigan, Florida, Manchester, Rio de Janeiro (UERJ) and Sao Paulo (UNESP) in Brazil, Korea (Kyungpook), and Japan (KEK)
  [*] 15 10 Gbps wavelengths at the Caltech/CACR booth and 7 10 Gbps wavelengths at the FNAL/SLAC booth
- Monitored by Caltech's MonALISA global monitoring system
- Bandwidth challenge award:
  - Official mark of 131 Gbps, with peaks at 151 Gbps
  - 475 TB of physics data transferred in a day; sustained rate > 1 PB/day
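A quick sanity check relating the bandwidth-challenge numbers quoted above: the daily volume implies an average rate, and the official mark corresponds to the daily-equivalent volume behind the "> 1 PB/day" claim. All inputs are from the slide:

```python
# Sketch: relate the SC2005 figures (475 TB moved in a day, 131/151 Gbps marks)
# to average and daily-equivalent rates. All inputs are from the slide.

DAY_S = 24 * 3600

def avg_gbps(terabytes_per_day: float) -> float:
    return terabytes_per_day * 1e12 * 8 / DAY_S / 1e9

def pb_per_day(gbps: float) -> float:
    return gbps * 1e9 * DAY_S / 8 / 1e15

print(f"475 TB/day -> average ~{avg_gbps(475):.0f} Gbit/s")          # ~44 Gbit/s
print(f"131 Gbit/s -> ~{pb_per_day(131):.2f} PB/day if sustained")   # ~1.4 PB/day
```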

27 150 Gigabit Per Second Mark

28 Next Generation LHCNet
- Circuit-oriented services
  - Bandwidth guarantees at flexible rates
  - Provide for data transfer deadlines for remote data analysis
  - Traffic isolation for unfriendly data transport protocols
  - Security
- CIENA platforms
  - OSRP (Optical Signaling and Routing Protocol)
    - Distributed signaling and routing protocol which abstracts physical network resources
    - Based on G.ASON
    - Advertises topology information and capacity availability (see the path-selection sketch below)
    - Connection management (provisioning/restoration)
    - Resource discovery and maintenance
  - Ethernet Private Line (EPL) - point-to-point
    - Dedicated bandwidth tunnels: guaranteed end-to-end performance
    - VCAT/LCAS/GFP-F allows for resilient, right-sized tunnels
    - Automated end-to-end provisioning
- Technology and bandwidth roadmap in line with ESnet (SDN), Internet2 (HOPI/NEWNET) and GEANT plans
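The advertised topology and capacity information is what lets a node compute a path with enough free bandwidth before signaling a circuit. A minimal sketch of that constraint-based path selection over a toy GVA-CHI-NY-AMS topology; the link inventory and free-capacity numbers are illustrative, and this is not the OSRP algorithm itself:

```python
# Sketch: constrained path selection over an advertised topology - find the
# fewest-hop path whose every link still has the requested free capacity.
# The toy topology and free-capacity numbers are illustrative assumptions.

from collections import deque

# (node_a, node_b): free capacity in Gbit/s
links = {
    ("GVA", "CHI"): 10, ("GVA", "NYC"): 10, ("GVA", "AMS"): 10,
    ("AMS", "NYC"): 10, ("NYC", "CHI"): 10,
}

def neighbors(node, need_gbps):
    for (a, b), free in links.items():
        if free >= need_gbps:
            if a == node:
                yield b
            elif b == node:
                yield a

def find_path(src, dst, need_gbps):
    """Breadth-first search: fewest hops subject to the capacity constraint."""
    queue, seen = deque([[src]]), {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in neighbors(path[-1], need_gbps):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

print(find_path("GVA", "CHI", 5))    # ['GVA', 'CHI'] - direct circuit has room
links[("GVA", "CHI")] = 2            # simulate the direct wave being nearly full
print(find_path("GVA", "CHI", 5))    # ['GVA', 'NYC', 'CHI'] - reroute via New York
```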

29 Next Generation LHCNet: Add Optical Circuit-Oriented Services
[Diagram: CERN-FNAL primary and secondary EPL circuits, with Force10 switches providing Layer 3 services]

30 LHCNet connection to Proposed ESnet Lambda Infrastructure Based on National Lambda Rail: FY09/FY10
[Map: NLR wavegear and regeneration/OADM sites across the US (Seattle, Sunnyvale, LA, San Diego, Denver, Chicago, New York, Washington DC, Atlanta, Houston and others), with ESnet via NLR (10 Gbps waves) and LHCNet (10 Gbps waves) to CERN (Geneva)]
- LHCNet: to ~80 Gbps by 2009-10
- Routing + dynamic managed circuit provisioning

31 UltraLight

32 UltraLight: Developing Advanced Network Services for Data Intensive HEP Applications
- UltraLight: a next-generation hybrid packet- and circuit-switched network infrastructure
  - Packet-switched: cost-effective solution; requires ultrascale protocols to share 10G efficiently and fairly
  - Circuit-switched: scheduled or sudden "overflow" demands handled by provisioning additional wavelengths; use path diversity, e.g. across the US, Atlantic, Canada, ...
- Extend and augment existing grid computing infrastructures (currently focused on CPU/storage) to include the network as an integral component
- Using MonALISA to monitor and manage global systems
- Partners: Caltech, UF, FIU, UMich, SLAC, FNAL, MIT/Haystack; CERN, NLR, CENIC, Internet2, FLR; UERJ and USP (Brazil); Translight, UKLight, Netherlight; UvA, UCL, KEK, Taiwan
- Strong support from Cisco
- NSF funded

33 LambdaStation
- A joint Fermilab and Caltech project
- Enabling HEP applications to send high-throughput traffic between mass storage systems across advanced network paths
- Dynamic path provisioning across Ultranet and NLR, plus an Abilene "standard path"
- DOE funded

34 SC2005: Switch and Server Interconnections at the Caltech Booth (#428)
- 15 10G waves
- 64 10G switch ports: 2 fully populated Cisco 6509Es
- 43 Neterion 10 GbE NICs
- 70 nodes with 280 cores
- 200 SATA disks
- 40 Gbps (20 HBAs) to StorCloud
- Thursday - Sunday: http://monalisa-ul.caltech.edu:8080/stats?page=nodeinfo_sys

35 BNL-CERN Connectivity (USLHCNET)
- Multiple VLANs
- Working now
- Not the most robust or scalable way to provide backup and automatic fallback

36 FNAL-CERN Connectivity (USLHCNET)
- Multiple VLANs
- Working now
- Not the most robust or scalable way to provide backup and automatic fallback

37 ESnet Metropolitan Area Network Rings: Reliability, Scalability, and Performance
[Diagram: a large science site attached to an ESnet MAN ring - site gateway and edge routers and site LAN, MAN switches managing multiple lambdas, ESnet SDN and IP cores (east/west, T320 routers), SDN circuits to site systems, IP peers, the USLHCnet PoP at StarLight/MANLAN (Force10, CIENA), and primary/secondary paths to CERN; services shown: ESnet production IP, ESnet-managed virtual circuits tunneled through the IP backbone, and ESnet-managed lambda/circuit services]

38 VINCI: Real-World Working Example - Agents Create an Optical Path on Demand
- Dynamic restoration of a lightpath if a segment has problems
- http://monalisa.cacr.caltech.edu/monalisa__Service_Applications__Vinci.htm
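A minimal sketch of the restoration idea described above: when a segment of the active lightpath reports a fault, an agent switches the path to a precomputed alternate. The segment names and the callback shape are illustrative assumptions, not the actual VINCI / MonALISA agent API:

```python
# Sketch of agent-driven lightpath restoration: on a segment fault, fail over
# to a precomputed alternate path. Names and the event hook are illustrative;
# this is not the actual VINCI / MonALISA interface.

PRIMARY = ["GVA-CoreDirector", "NYC-MANLAN", "CHI-StarLight"]
ALTERNATE = ["GVA-CoreDirector", "AMS-NetherLight", "CHI-StarLight"]

class LightpathAgent:
    def __init__(self):
        self.active = list(PRIMARY)

    def on_segment_alarm(self, failed_segment: str) -> None:
        """Called by the monitoring layer when a segment raises an alarm."""
        if failed_segment in self.active and failed_segment not in ALTERNATE:
            print(f"alarm on {failed_segment}: restoring lightpath via alternate route")
            self.active = list(ALTERNATE)   # re-provision the circuit end-to-end
        else:
            print(f"alarm on {failed_segment}: segment not on active path, no action")

agent = LightpathAgent()
agent.on_segment_alarm("NYC-MANLAN")   # switches the path to the AMS alternate
```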

39 Monitoring: http://monalisa.caltech.edu
- Operations & management assisted by agent-based software (MonALISA)
- 500 TB of data sent from CERN to FNAL over the last two months

40 LHCNet Utilization during Service Challenge
[Plot: CERN-FNAL disk-to-disk traffic during SC3 (April 2006)]
- Service challenge
  - Achieving the goal of a production-quality world-wide Grid that meets the requirements of the LHC experiments
  - Prototype the data movement services
  - Acquire an understanding of how the entire system performs when exposed to the level of usage we expect during LHC running

41 Circuit failure during SC2

