
1 Network Engineering Services Group Update
Patrick Dorn, Network Engineer
ESnet Network Engineering Group
ESCC, July 15, 2013

2 Outline
Operational work completed since January
100G Testbed status
ANA-100G
BayExpress alien wave testing
Current network snapshot
100G testing
Operational work in progress

3 Since January
ANL and ORNL 100G connections in production (April)
− Production meaning in use for primary IP peering and available for OSCARS circuits
100G production connections to exchange points and R&E peers:
− MANLAN, Starlight, WIX, CENIC (PACWAVE)
− Internet2: Sunnyvale, Chicago, New York (via MANLAN), DC (via WIX)
ESnet5 consolidation across the national footprint, increasing consistency while reducing router count, power consumption, and maintenance costs. For example:
− Decommissioned the OC48 between Atlanta and ORNL
− Removed aofa-cr2, star-sdn1, star-cr1, ornl-rt2, pnwg-sdn1
− Eliminated MX-ALU interconnect bottlenecks at the hubs
− Removed all DNS entries referring to "ANI"

4 Since January
Swapped out our unsupported "third party" 10x10 MSA (aka "LR-10") CFPs in our Ciena interfaces with Ciena OEM'd CFPs
− Covered by 4-hour replacement maintenance contract
− Full PM / stats support
We were able to extract additional value from the third-party optics by using a pair of them in the TA Testbed in a test configuration at BNL.
Normalized 100G Testbed infrastructure
− Testbed ALUs reconfigured from scratch
− Moved out of ESnet backbone IP space
− Moved out of ESnet ASN (from 293 to 3432)

5 ESnet 100G Testbed Topology
[Topology diagram: testbed routers star-tb1 and nersc-tb1 (ESnet 100G Testbed, AS3432) are connected to each other over 100G, with star-tb1 attached to production router star-cr5 and nersc-tb1 attached to nersc-mr2 over 2x10G (ESnet5, AS293). The diagram labels each point-to-point interface with its /30 IPv4 and /127 IPv6 addressing, plus /32 and /128 loopbacks (192.124.57.7 and 192.124.57.8); the address pairs are sanity-checked in the sketch below.]
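Since the interface addressing survives in the transcript, a quick consistency check is possible. Here is a minimal sketch (Python, standard library only) that verifies two of the point-to-point pairs from the diagram sit in matching subnets; the link labels are ours, the addresses are taken from the slide.

```python
import ipaddress

# Two point-to-point address pairs copied from the topology diagram.
links = [
    ("star-tb1 <-> nersc-tb1", "192.124.57.137/30", "192.124.57.138/30"),
    ("star-tb1 <-> star-cr5",  "192.124.50.21/30",  "192.124.50.22/30"),
]

for name, a, b in links:
    ia, ib = ipaddress.ip_interface(a), ipaddress.ip_interface(b)
    assert ia.network == ib.network, f"{name}: endpoints in different subnets"
    # A /30 leaves exactly two usable host addresses; both ends must use them.
    assert {ia.ip, ib.ip} == set(ia.network.hosts()), f"{name}: bad host pair"
    print(f"{name}: {ia.network} OK")
```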

6 ESnet 100G Testbed Future
[Planned topology diagram: the existing star-tb1/nersc-tb1 testbed (AS3432) gains a 100G STAR-AOFA testbed path terminating on aofa-cr5, alongside the existing 100G and 2x10G attachments to star-cr5 and nersc-mr2 (ESnet5, AS293).]

7 ANA-100G
Advanced North Atlantic 100G Pilot (ANA-100G)
− Combined effort of Internet2 (USA), NORDUnet (Nordic countries), ESnet (USA DoE), SURFnet (Netherlands), CANARIE (Canada), and GÉANT (Europe)
100G wave from New York to Amsterdam
− For prototyping, experiments, etc.
− 12-month term
− Consortium purchased spectrum on the submarine link
− Lit with consortium-owned cards slotted in the provider's chassis

8 Lawrence Berkeley National LaboratoryU.S. Department of Energy | Office of Science ANA-100G

9 ANA-100G
[Diagram of the end-to-end path: North American terrestrial component, trans-Atlantic submarine component, European submarine component. Drawing courtesy of Ciena.]

10 ANA-100G at TNC
TNC demos:
− Big data transfers with multipathing, OpenFlow, and MPTCP
− Visualize 100G traffic
− How many modern servers can fill a 100Gbps transatlantic circuit?
− First European ExoGENI at work
− Up and down the North Atlantic @ 100G

11 ANA-100G Futures
European side being relocated from Maastricht to Amsterdam
Circuit is shared among participating organizations
− Likely timesliced for 100G clear-channel use
− Also opportunities for multiple 20-40G experiments in parallel
Experiments are being planned for the next 12 months
− E.g. BNL-CERN demo

12 BayExpress Alien Wave Testing
Joint project with Juniper
PTX router/MPLS platform
Single PTX chassis at LBL
PTX carved into two logical systems
Beta PTX PICs with 100G coherent, colored optics
Ixia at LBL for traffic generation / error detection

13 BayExpress Alien Wave Testing
[Diagram contrasting the two setups: conventionally, grey CFP optics on the router feed a transponder on the DWDM system at each end; in the alien-wave setup there is no transponder on the DWDM system, and the PTX carries a colored, tunable 100GE long-haul interface directly.]

14 BayExpress Alien Wave Testing
Phase 1
− Alien wave on a 2-node Ciena lab system at LBL
Phase 2
− Alien wave on the production BayExpress
− 1 segment, LBL-NERSC
− 12 km
Phase 3
− Alien wave on the production BayExpress
− 1400 km round trip (700 km each direction)

15 BayExpress Alien Wave Testing Phase 3
[Map of the Phase 3 path, the "long way around" the ring: total distance ~700 km one way, traversing 7 ROADMs to the PTX.]

16 BayExpress Alien Wave Testing Wrapup
Motivations
− Explore technology for future architectures
   » A bit of experience with the PTX platform
   » Gained experience with router-based colored optics
   » Proof of 1000+ km operation
− Gain operational experience with alien wavelengths
   » Greater understanding of provisioning steps and parameters
   » Ciena 6500s behaved perfectly: no operational impact, the alien wave was balanced with the existing native waves, and the BER of existing waves was unaffected
Possible future testing
− Attempt an alien wavelength in the backbone?

17 ESnet5 Routed Network (July 2013)
Routers
− 16 Alcatel-Lucent (ALU) 7750-SR12
   » 10-slot router with up to 200G per slot
   » 56 100G interfaces & 200+ 10G interfaces
− 30 existing Juniper MXs
   » Used in 10G hubs, commercial exchange points, sites
− 16 existing Juniper M7i & M10i
   » For terminating links slower than GE
− 4 very old Cisco 7206
   » For terminating links slower than GE
Services
− Standard routed IP (including full Internet services)
− Point-to-point dynamic virtual circuits using OSCARS
− Various overlay networks (private VPNs, LHCONE VRF)

18 ESnet5 July 2013
[Network map; geography is only representational. It shows major Office of Science sites (LBNL, SLAC, NERSC, JGI, ANL, FNAL, ORNL, BNL, AMES, PNNL, JLAB, PPPL, MIT/PSFC, GA, and others), major non-SC DOE sites (LLNL, SNLL, SNLA, LANL, INL), ESnet PoP/hub and optical transport node locations (SEAT, SUNN, SACR, DENV, KANS, CHIC, STAR, CLEV, NEWY, AOFA, BOST, WASH, ATLA, NASH, HOUS, ELPA, ALBQ, LASV, BOIS, EQCH, CHAT, and others; only some optical nodes are shown), and commercial and R&E peering points (including MANLAN). The legend distinguishes 100 Gb/s routed IP, 4x10 Gb/s routed IP, third-party 10 Gb/s, express/metro 100 Gb/s and 10 Gb/s, express multipath 10G, lab-supplied links, other links, tail circuits, and ESnet-managed (10G and 100G) vs site-managed routers.]

19 ESnet Optical Network (July 2013)
341 Ciena 6500 nodes
− 57 add/drops
− 284 amps
ESnet waves deployed
− 23 100G
− 7 40G (each muxing 4x10G client circuits)
− 18 10G metro (non-coherent)

20 Lawrence Berkeley National LaboratoryU.S. Department of Energy | Office of Science ESnet Optical Footprint: Add/Drops

21 100G Testing
Review of the ESnet process before links are placed into service:
Saturation test
− Verify no internal bottlenecks prevent running at capacity
− Meet or exceed 95% of line rate for 5 minutes
Loss test
− 50% of line-rate capacity over a 24-hour duration
− Ensure that the line is performing properly (no errors)
− Strive for zero loss on all circuits
− At 100G, this means transferring ~500 TB in <= 24 hours with 0 loss (see the arithmetic sketch below)
100G site connections are extended DMZs
In general our testing of site 100G connections has included the site equipment's interface
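The 500 TB figure follows directly from the loss-test parameters; a few lines of Python confirm it:

```python
# Loss test: 50% of 100 Gb/s line rate, sustained for 24 hours.
rate_bps = 0.5 * 100e9                        # 50 Gb/s
volume_tb = rate_bps * 24 * 3600 / 8 / 1e12   # bits -> bytes -> terabytes
print(f"{volume_tb:.0f} TB in 24 hours")      # 540 TB, so 500 TB fits in < 24h
```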

22 100G Testing (cont'd)
We have a catalog of issues from our experiences
− Internal testing
− Community troubleshooting
− Site connection acceptance testing
Example equipment issues we've seen:
− 100G interfaces with a max single-flow rate of 12G
− 100G interfaces with 50G limits, whether from static VLAN mapping or dynamic load balancing (the sketch below illustrates why per-flow hashing caps a single flow)
− 100G interfaces with a 92.5G max single-flow rate
− 100G interfaces with data corruption on jumbo frames
All equipment has its own set of issues and quirks
− Important to understand what you deploy/support
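The load-balancing ceiling has a simple mechanical explanation: many platforms stripe a 100G port across internal paths and pin each flow to one path by hashing its headers, so a single flow can never exceed one path's capacity. The sketch below illustrates the idea in Python; the two-lane 50G layout and the hash choice are illustrative assumptions, not the behavior of any specific linecard.

```python
import hashlib

LANES_G = [50, 50]  # hypothetical: two 50G internal paths behind one 100G port

def lane_for(flow):
    """Pin a flow (5-tuple) to one internal lane by hashing its headers."""
    digest = hashlib.sha1(repr(flow).encode()).digest()
    return digest[0] % len(LANES_G)

flow = ("198.51.100.1", "203.0.113.9", 5001, 5201, "tcp")
lane = lane_for(flow)
# Every packet of this flow rides the same lane, so one flow is capped at
# that lane's capacity regardless of the interface's 100G rate.
print(f"flow pinned to lane {lane}: capped at {LANES_G[lane]}G")
```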

23 100G Testing (cont'd)
We were asked to help with an LHC ATLAS issue
− "Urgent difficult problem we're seeing here bringing IU to the MWT2 group in Chicago"
− Problem: intermittent payload data corruption
   » Corruption of the payload that was not being caught by the normal checksums in the underlying protocols (see the end-to-end integrity check sketched below)
We worked with IU to isolate the source of the problem by ruling out IU's Chicago-Indianapolis optical transport
IU worked with the network equipment manufacturer to identify the root cause of the problem
− Used a similar type of test equipment to prove it
− IU is looking to buy test equipment
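Because link- and transport-layer checksums are weak and are often recomputed hop by hop, payload corruption of this kind is typically caught by comparing a strong application-level digest end to end. A minimal sketch of such a check in Python (the file name is a placeholder):

```python
import hashlib

def sha256_of(path, chunk=1 << 20):
    """Stream a file through SHA-256 so even multi-TB payloads fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

# The sender publishes sha256_of("payload.dat") out of band; the receiver
# recomputes it after transfer. Any mismatch indicates payload corruption
# that the underlying protocol checksums failed to catch.
```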

24 100G Testing (cont'd)
Having 100G test equipment we can depend on is valuable
− Isolated problems quickly
− Easy to validate repairs
− Similar test equipment was used in turning up the transatlantic link for ANA-100G
Debugging these problems can be time-consuming
− The pattern-corruption problem took weeks (days of dedicated effort)
− We don't have time to waste on debugging the test equipment itself

25 In Progress and Future Work
100G production connections to:
− BNL, FNAL, LBNL, LLNL, NERSC
40G transport into Equinix Ashburn & rearranging our Washington DC ring to provide diverse backbone connections
− Longer term: considering 10G for the DC MAN
Complete provisioning of diverse fiber laterals and diverse optical nodes at ANL & FNAL

26 In Progress and Future Work (cont'd)
40G transport into Las Vegas and relocation to a Level 3 colo
− Enables us to deliver higher BW in the region and redundant connectivity
Expand optical capability in Atlanta
− Supports the ORNL area and possible future JLAB redundancy
Ames Lab: upgrade to 10G
Continue cleanup and consolidation at the hubs, moving connections from the MXs to the ALUs

27 OSCARS Service Changes
Increasing the number of queues
− One benefit of this is to apply some of the suggestions in HNTES
   » Provision a queue for alpha flows
Support a best-effort connectivity service if a guaranteed-bandwidth circuit fails
Migrate to a single queue* (avoiding the Scavenger queue) and a PLP bit to accommodate in-profile/out-of-profile OSCARS guaranteed-bandwidth traffic on Juniper routers
Support zero-bandwidth best-effort circuit requests
* On all ESnet5 ALU routers, use of a single queue and PLP for OSCARS guaranteed-bandwidth circuits has already been implemented

28 QoS Changes
Current Class of Service queues (Service Queue: Remarks):
− Network Control: for network control and management traffic
− Expedited Forwarding: for in-profile OSCARS guaranteed-bandwidth circuit traffic
− Best Effort: for general IP routing
− Scavenger: for out-of-profile OSCARS circuit traffic and scavenger IP traffic
Proposed Class of Service queues, late 3Q2013 (Service Queue: Remarks):
− Network Control: for network control and management traffic
− Expedited Forwarding Circuits: for in-profile/out-of-profile* OSCARS guaranteed-bandwidth circuit traffic
− Best Effort Circuits: for non-guaranteed OSCARS circuit traffic (i.e. zero-bandwidth circuits)
− Assured Forwarding: for testing elephant (alpha) flow isolation and routing (prototype deployment)
− Best Effort: for general IP routing
− Scavenger: for scavenger IP traffic
* In the proposed QoS change, the Packet Loss Priority (PLP) bit is used to determine in-profile/out-of-profile OSCARS traffic (a conceptual sketch follows below)
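To make the single-queue/PLP idea concrete, here is a conceptual token-bucket policer in Python. This is not router configuration: conforming packets are marked in-profile (PLP clear) and excess packets out-of-profile (PLP set), so both share one queue and out-of-profile traffic is simply dropped first under congestion rather than being shunted to Scavenger. The rate and burst values are made up.

```python
import time

class Policer:
    """Single-rate token bucket that marks packets instead of queue-shifting."""
    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8.0    # token refill rate, bytes per second
        self.burst = burst_bytes
        self.tokens = burst_bytes
        self.last = time.monotonic()

    def mark(self, pkt_bytes):
        now = time.monotonic()
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= pkt_bytes:
            self.tokens -= pkt_bytes
            return "in-profile (PLP=0)"    # kept at high drop precedence
        return "out-of-profile (PLP=1)"    # same queue, dropped first

p = Policer(rate_bps=1e9, burst_bytes=125_000)  # hypothetical 1G circuit
print(p.mark(9000))   # first jumbo frame fits within the burst -> in-profile
```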

29 Questions?
Thanks!
dorn@es.net
http://www.es.net/
http://fasterdata.es.net/

