Network Monitoring, WAN Performance Analysis, & Data Circuit Support at Fermilab Phil DeMar US-CMS Tier-3 Meeting Fermilab October 23, 2008.

1 Network Monitoring, WAN Performance Analysis, & Data Circuit Support at Fermilab Phil DeMar US-CMS Tier-3 Meeting Fermilab October 23, 2008

2 Active Wide-Area Network Monitoring
PerfSONAR: distributed network monitoring infrastructure
  - Supported by US-LHC T1 sites and Internet2 community
PerfSONAR-PS: Active monitoring package
  - Web services collection built on trusted monitoring tools:
      - ping, BWCTL (iperf), owamp, NPAD, NDT toolkit
      - Web service interface for pulling data into other monitoring tools
  - Zero configuration; out of box deployment
      - Based on Knoppix Live CD bootable disk
      - Optional software bundle deployment
  - Modest hardware requirements for on-site deployment

3 PerfSONAR Deployment Status
US-ATLAS moving ahead with perfSONAR-PS at T1 & T2s:
  - Two dedicated systems per site; one each for latency & bandwidth testing
  - Systems are spec’ed devices, $628 each (KOI Computing)
  - Utilize Knoppix disks & standard configurations
We’ve recommended the same model for US-CMS
Current PerfSONAR-PS deployment:
  - Both US-LHC Tier-1s (FNAL & BNL)
  - UNL (CMS), U-Mich (ATLAS); U-Delaware; Internet2; ESnet
  - Complete active monitoring matrix of the above

4 Background information
PerfSONAR-PS project - http://code.google.com/p/perfsonar-ps/
Tour of perfSONAR-PS service is available - http://code.google.com/p/perfsonar-ps/wiki/CodeTour
Knoppix Live CD bootable disk info - http://code.google.com/p/perfsonar-ps/wiki/NPToolkit
Appliance PCs:
  - Vendor: KOI Computing – (630) 627-8811
  - Spec: 1U Intel Pentium Dual-Core E2200 2.2GHz System
  - Cost: $628 each

5 Performance Analysis Support
In 1999, Matt Mathis coined the term ‘Wizard’s Gap’
  - Today, it’s still an issue
Users often don’t know about:
  - Common OS tuning issues for WAN data movement
  - Wide-area network path, its characteristics, available tools
It’s still an end-to-end problem
  - And the world is still short on wizards
Our structured analysis methodology seeks to put some of the wizardry into a structured process
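One common OS tuning issue for WAN data movement is a TCP window that is too small for the path. The slides do not give a formula; the sketch below uses the standard bandwidth-delay product and the window/RTT throughput ceiling. The function names and the example path (10 Gb/s, 120 ms round-trip time, 64 KiB window) are illustrative assumptions, not figures from the talk.

```python
def bdp_bytes(bandwidth_bps: float, rtt_seconds: float) -> int:
    """Bandwidth-delay product: bytes that must be in flight to fill the path."""
    return round(bandwidth_bps * rtt_seconds / 8)


def window_limited_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on single-stream TCP throughput for a given window size."""
    return window_bytes * 8 / rtt_seconds


if __name__ == "__main__":
    # Hypothetical long-haul path; values chosen only for illustration.
    bandwidth = 10e9      # 10 Gb/s circuit
    rtt = 0.120           # 120 ms round-trip time
    small_window = 65536  # 64 KiB, an untuned window assumed for the example

    print(f"Window needed to fill the path: {bdp_bytes(bandwidth, rtt) / 2**20:.1f} MiB")
    print(f"Ceiling with a 64 KiB window:   {window_limited_bps(small_window, rtt) / 1e6:.2f} Mb/s")
```

Comparing the two numbers shows why an untuned host can leave most of a fast, long path unused even when the network itself is healthy.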

6 Find the performance problem area(s)

7 Performance Analysis Methodology
Structured approach to performance analysis
Model the process like medical diagnosis
  - Collect the physical characteristics
  - Run diagnostic tests
  - Record everything; develop a history of the analysis
Strategic approach:
  - Sub-divide problem space:
      - Application-related problems
      - Host diagnosis and tuning
      - Network path analysis
  - Then divide and conquer

8 Network Performance Analysis Architecture PTDS

9 Performance Analysis Tools…
Host diagnosis
  - Script that pulls system configuration
  - Network Diagnostic Tool (NDT)
      - Faulty network connections & NICs, duplex mismatches
Network path diagnosis
  - OWAMP to collect and diagnose one-way network path statistics
      - Packet loss, latency, jitter
  - Other tools such as ping, traceroute, as needed
Packet trace diagnosis
  - Port mirror on border router(s)
  - Tcpdump to collect packet traces
  - Tcptrace to analyze packet traces
  - Xplot for visual examination

10 Network path characteristics collected
Round-trip time
Sequence of routers along the paths
One-way delay, delay variance
One-way packet drop rate
Packet reordering
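As a rough illustration of how the one-way characteristics listed above can be derived from raw probe records, here is a minimal sketch. The `Probe` record layout and the `summarize` function are hypothetical and are not the output format of OWAMP or any other tool named in the slides. Reordering is counted with one simple definition: a packet that arrives after a packet with a higher sequence number.

```python
from statistics import mean, pvariance
from typing import NamedTuple, Optional


class Probe(NamedTuple):
    seq: int                    # sender-assigned sequence number
    sent: float                 # send timestamp, seconds
    received: Optional[float]   # receive timestamp, or None if the probe was lost


def summarize(probes: list[Probe]) -> dict:
    """Compute one-way delay, delay variance, drop rate, and reordering count."""
    if not probes:
        raise ValueError("no probes to summarize")
    arrived = [p for p in probes if p.received is not None]
    delays = [p.received - p.sent for p in arrived]

    # Walk packets in arrival order and count those that show up "late".
    reordered = 0
    highest_seq = -1
    for p in sorted(arrived, key=lambda p: p.received):
        if p.seq < highest_seq:
            reordered += 1
        else:
            highest_seq = p.seq

    return {
        "loss_rate": 1 - len(arrived) / len(probes),
        "mean_delay": mean(delays) if delays else None,
        "delay_variance": pvariance(delays) if delays else None,
        "reordered": reordered,
    }


if __name__ == "__main__":
    # Made-up sample: probe 2 is lost, probe 3 arrives after probe 4.
    sample = [
        Probe(0, 0.00, 0.050),
        Probe(1, 0.10, 0.150),
        Probe(2, 0.20, None),
        Probe(3, 0.30, 0.360),
        Probe(4, 0.31, 0.355),
    ]
    print(summarize(sample))
```

One-way delay values are only meaningful when sender and receiver clocks are synchronized, which this sketch takes for granted.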

11 Network Performance Analysis Methodology
Step 1: Definition of the problem space
Step 2: Collect host information & network path characteristics
Step 3: Host tuning & diagnosis
Step 4: Network path performance analysis
  - Route changes frequently?
  - Network congestion: delay variance large?
  - Infrastructure failures: examine the counters one by one
  - Packet reordering: load balancing? Parallel processing?
Step 5: Evaluate packet trace pattern
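The Step 4 questions can be read as a checklist applied to the collected path measurements. The sketch below is one possible encoding, not the authors' tooling: the `PathObservations` fields, the `step4_findings` function, and every threshold are hypothetical, and treating measured packet loss as the trigger for checking counters is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass
class PathObservations:
    distinct_routes: int        # different router sequences seen across traceroutes
    mean_delay_ms: float        # mean one-way delay
    delay_stddev_ms: float      # standard deviation of one-way delay
    loss_rate: float            # fraction of probes lost (0.0 to 1.0)
    reordered_fraction: float   # fraction of probes that arrived out of order


def step4_findings(
    obs: PathObservations,
    jitter_ratio: float = 0.1,        # assumed threshold: stddev / mean delay
    loss_threshold: float = 1e-4,     # assumed threshold
    reorder_threshold: float = 1e-3,  # assumed threshold
) -> list[str]:
    """Return the Step 4 questions that the observations suggest following up."""
    findings = []
    if obs.distinct_routes > 1:
        findings.append("route changes: check routing stability")
    if obs.mean_delay_ms > 0 and obs.delay_stddev_ms / obs.mean_delay_ms > jitter_ratio:
        findings.append("large delay variance: possible congestion")
    if obs.loss_rate > loss_threshold:
        findings.append("packet loss: examine device counters one by one")
    if obs.reordered_fraction > reorder_threshold:
        findings.append("packet reordering: look for load balancing or parallel processing")
    return findings


if __name__ == "__main__":
    # Made-up observations for a path with unstable routing, jitter, and reordering.
    obs = PathObservations(
        distinct_routes=3,
        mean_delay_ms=60.0,
        delay_stddev_ms=15.0,
        loss_rate=0.0,
        reordered_fraction=0.02,
    )
    for finding in step4_findings(obs):
        print(finding)
```

Keeping the thresholds as parameters makes it easy to record which values were used in each analysis, in line with the "record everything" advice in the methodology.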

12 Tier2/Tier3 Sites worked with
UERJ (Brazil), IHEP (China), RAL (UK), University of Florida, IFCA (Spain), TTU (Texas), CIEMAT (Spain), Belgium, OWEA (Austria), CSCS (Swiss)

13 Performance Analysis Status & Summary
An available service for CMS Tier-2/3 sites
  - A work-in-progress at this point
  - Focus is on process as well as results
  - Willing to work with others in this area
Future areas of effort:
  - Incorporate into work flow & content management system
  - Make use of perfSONAR monitoring infrastructure
https://plone3.fnal.gov/P0/WAN/netperf/methodology/
How to get hold of us:
  - Send email to WAN@FNAL.GOV
  - Wide Area Work Group video-conf meetings every other Friday

14 Strategic Direction Toward Circuits
DOE High Performance Network Planning Workshop established a strategic model to follow:
  - High bandwidth backbones for reliable production IP service
      - ESnet
  - Separate high-bandwidth network paths for large scale science data flows
      - Science Data Network
  - Metropolitan Area Networks (MAN) for local access
      - Fermi LightPath a cornerstone for Chicago area MAN

15 ESnet4: Core networks 50-60 Gbps by 2009-2010 (10Gb/s circuits)
[Network map. Legend: Production IP core (10Gbps); SDN core (20-30-40-50 Gbps); MANs (20-60 Gbps) or backbone loops for site access; International connections; USLHCNet. Hubs shown: Seattle, Boise, Sunnyvale, LA, San Diego, Denver, Albuquerque, Kansas City, Tulsa, Houston, Chicago, Cleveland, Atlanta, Jacksonville, Washington DC, New York, Boston. External connections: CERN (30+ Gbps), Europe (GEANT), Canada (CANARIE), Asia-Pacific, Australia, South America (AMPATH), GLORIAD (Russia and China).]

16 Topology of circuit connections
Circuits utilize MAN infrastructure:
  - 10GE channel(s) reserved for routed IP service (purple)
  - LHCOPN circuit (orange) to CERN
  - SDN channels for E2E circuits to CMS Tier-2/3 (shades of green)
Circuits based on end-to-end vLANs
  - Direct BGP peering with remote site
Multiple provider domains are the norm
  - Deployed technology varies by domains involved
  - Complexity is higher than IP service

17 FNAL Alternate Path Circuits
Supported since 2004
Serve a wide spectrum of experiments
  - CMS Tier-2s are heavy users
Implemented on multiple technologies
  - But based on end-to-end layer-2 paths
Usefulness has varied

18 E2E Circuit Summary
FNAL currently supporting E2E circuits to Tier0 & Tier2s
  - A few Tier3s
Today, circuits are largely static configurations
Dynamic circuit services are becoming available
  - Driven largely by Internet2 DCN services
Alternate path support services also emerging
  - Lambda Station (FNAL)
  - TeraPaths (BNL)
  - Contact WAN@FNAL.GOV for help or information

