Presentation is loading. Please wait.

Presentation is loading. Please wait.

PIPE Dreams Trouble Shooting Network Performance for Production Science Data Grids Presented by Warren Matthews at CHEP’03, San Diego March 24-28, 2003.

Similar presentations


Presentation on theme: "PIPE Dreams Trouble Shooting Network Performance for Production Science Data Grids Presented by Warren Matthews at CHEP’03, San Diego March 24-28, 2003."— Presentation transcript:

1 PIPE Dreams Trouble Shooting Network Performance for Production Science Data Grids Presented by Warren Matthews at CHEP’03, San Diego March 24-28, 2003

2 AbstractAbstract The vision of science grids allocating resources to analyze huge quantities of HENP data clearly depends on reliable network performance. Tools developed at SLAC in conjunction with the Internet2 PIPES project will help to ensure this. In this talk, these tools will be discussed and the procedure for publishing performance data, in particular using the Globus toolkit's MDS and web services will be reviewed. The subsequent analysis and trouble-shooting methodology will be discussed with real world examples from the particle physics data grid (PPDG) and the European data grid (EDG).

3 OverviewOverview What is the problem ? What is PIPES ? Network performance monitoring Problem identification

4 Resource Broker Farm Farm Farm Data Data Data requestor The Network Network Monitoring for the Grid The Data Grid consists of many components that must interoperate requestor

5 Resource Broker Farm Data Data Data requestor The Network Allocate Resources The resource broker must be fully informed Measurement is required ! requestor 12% pkt loss OC4880% Utilization

6 What is PIPES ? Internet2 End-to-end performance initiative PI Performance Evaluation System (PIPES) PIPES Monitoring Platform (PMP) Overlap with goals of HENP Tremendous resources

7 IEPM-BWIEPM-BW Package developed at SLAC –Measurement Engine Iperf, bbftp, bbcp, ping, traceroute Abwe, owamp, udpmon, gridftp –Job Manager –Data Storage and data server –Analysis Engine

8 SNV SLAC CHI ESnet NY Stanford CalREN NERSC LANL JLAB TRIUMF KEK Abilene SLAC SNV FNAL ANL NIKHEF CERN IN2P3 CERN CALTECH SDSC BNL JAnet HSTN SEA ATL CLV IPLS RAL UCL UManc DL NNW NY Rice UTDallas NCSA UMich I2 SOX UFL APAN RIKEN INFN-Roma INFN-Milan CESnet APAN Geant EDG PPDG/GriPhyN Monitoring Site ORNL Stanford UTAH DNVR ORNL NASA WASH Imperial INFN-Padua

9 SLAC Manchester Bristol Dresden IN2P3 RAL Stanford Calren Abilene Renater DFN Janet NNW TVN SWERN ESnet BaBar Grid Geant 622Mbps 2.5 Gbps 1 Gbps 10 Gbps

10

11 Problem Identification Typical Scenario –User complains file transfer is slow –Net admin runs ping, traceroute, iperf test –Complain to upstream provider Proactive –What do we mean by throughput? –How do we know there was a performance hit? –Our approach is diurnal changes

12

13 Alarms Too much to keep track of Rather not wait for complaints Automated Alarms Rolling average à la RIPE-TT –May not be the best approach AMP Automated Detection System

14

15

16 LimitationsLimitations Could be over an hour before alarm is generated More frequent measurements impact the network and measurements overlap Low impact tools allow finer grained measurement –Use NWS multi-variate method –Use SCIDAC ABwE tool –Use PingER, OWAMP

17

18 PublishingPublishing Many monitoring projects, publish data to allow them to inter-operate MDS –EDG NM Schema Web Services –GLUE NE Schema GGF NMWG –Hierarchy Doc –Tools Doc./get_data 2003 3 18 6 1 41 1.61 1.601 1.62 0

19 Net Rat Alarm System –Multiple tools –Multiple measurement points –Trigger further measurements –Cross reference off site stats Informant database No measurement is ‘authoritative’ –Cannot even believe a measurement

20 LogLog 03/20/2003 20:13:46 ALARM pcgiga throughput=305.224 ctresh=512.95 athresh=312.91 03/20/2003 20:13:48 TRACE no change in route detected 03/20/2003 20:16:07 CALM Throughput within acceptable limits. ALARM CANCELLED

21 Toward a Monitoring Infrastructure MAGGIE –Measurement and Analysis package built on NIMI/Akenti EDEE –production-quality Data Grid for Europe

22 More Information IEPM Home Page IEPM-BW I2 E2E and PIPES RIPE-TT AMP Automated Event Detection NWS ABWE

23 EndEnd This talk made possible by the IEPM team at SLAC (Les Cottrell, Connie Logg, Jiri Navratil, Jerrod Williams, Fabrizio Coccetti), and the many developers and maintainers around the world.


Download ppt "PIPE Dreams Trouble Shooting Network Performance for Production Science Data Grids Presented by Warren Matthews at CHEP’03, San Diego March 24-28, 2003."

Similar presentations


Ads by Google