FAX PERFORMANCE
Ilija Vukotic
TIM, Tokyo, May 2013

PERFORMANCE

Metrics:
- Data coverage: better than 97%, more than 2 replicas
- Number of users: mostly UofC and Prague users
- Percentage of successful jobs: latest HC (HammerCloud) tests >99%
- Total amount of data delivered: ~2 PB/week
- Bandwidth usage

Sources:
- Ganglia plots
- MonALISA
- FAX dashboard
- HC tests
- CostMatrix tests
- Special tests using dedicated resources
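For scale (a back-of-the-envelope conversion added here, not on the slide), 2 PB/week corresponds to a sustained federation-wide rate of

    \[
    \frac{2 \times 10^{15}\ \mathrm{B}}{7 \times 86400\ \mathrm{s}} \approx 3.3\ \mathrm{GB/s}.
    \]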

COST MATRIX

[Table: measured rate in MB/s per source/destination pair. Sources: AGLT2, BNL-ATLAS, CERN-PROD, DESY-HH, IllinoisHEP, INFN-FRASCATI, INFN-NAPOLI-ATLAS, INFN-ROMA1, LRZ-LMU, MPPMU, MWT2, OU_OCHEP_SWT2, praguelcg2, RAL-LCG2, RU-Protvino-IHEP, SWT2_CPB, UKI-LT2-QMUL, UKI-SCOTGRID-ECDF, UKI-SCOTGRID-GLASGOW, UKI-SOUTHGRID-OX-HEP, WT2. Destinations: BNL-ATLAS, CERN-PROD, DESY-HH, INFN-ROMA1, LRZ-LMU, MWT2, RAL-LCG2, SWT2_CPB, UKI-LT2-QMUL, UKI-SCOTGRID-GLASGOW. The individual rate values did not survive extraction.]

The cost matrix is a place to get an idea of the rate a single job can expect to see. Are our pipes really this full? Let's look at other sources of information.
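A minimal sketch of how one entry of such a matrix can be produced: time an xrdcp of a known-size file between two federation endpoints. The redirector URL and file path below are hypothetical placeholders; this illustrates the idea, not the actual FAX cost-matrix probe.

    import os
    import subprocess
    import tempfile
    import time

    # Hypothetical source endpoint and test file; a real probe would use
    # site-specific FAX endpoints and a file of known size.
    SOURCE = "root://fax.example.org//atlas/test/costmatrix_1GB.root"

    def measure_rate_mb_s(source_url):
        """Time one xrdcp transfer and return the observed rate in MB/s."""
        with tempfile.NamedTemporaryFile(delete=False) as tmp:
            dest = tmp.name
        try:
            start = time.time()
            subprocess.check_call(["xrdcp", "-f", source_url, dest])
            elapsed = time.time() - start
            return os.path.getsize(dest) / 1e6 / elapsed
        finally:
            os.remove(dest)

    print("%.1f MB/s" % measure_rate_mb_s(SOURCE))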

COST MATRIX VS. PERFSONAR

Comparison of just one link in one direction: source AGLT2, destination MWT2. perfSONAR data at 4-hour intervals. Could it be that the worker-node links are saturating?

CLOGGING THE PIPES

[Diagram: test mesh between MWT2, SLAC, AGLT2, BNL and CERN.]

- Using HammerCloud (HC), jobs were submitted to 4 ANALY queues: AGLT2, BNL, MWT2, SLAC.
- Each site ran 300 jobs of two types, 50 in parallel:
  - Copy: xrdcp of 3 files randomly chosen from the SMWZ datasets prepared for the FDR, pulled from the other sites (sketched below).
  - Read: reads 10% of events from 3 files randomly chosen from the FDR SMWZ datasets at the other sites.
- Each job uploads time to finish, events/s, and MB/s, together with its PanDA ID so individual jobs can be investigated.
- All jobs were submitted through the FDR web interface.
- All ran in parallel with the other HC stress tests.
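A minimal sketch of the copy-type job under these assumptions (the file list is a hypothetical placeholder; the real jobs were generated and submitted through HammerCloud):

    import random
    import subprocess
    import time

    # Hypothetical remote replicas of the FDR SMWZ files at other sites.
    REMOTE_FILES = [
        "root://site-a.example.org//atlas/smwz/file1.root",
        "root://site-b.example.org//atlas/smwz/file2.root",
        "root://site-c.example.org//atlas/smwz/file3.root",
        "root://site-d.example.org//atlas/smwz/file4.root",
    ]

    def copy_test(n_files=3):
        """xrdcp n randomly chosen remote files and report the elapsed time."""
        chosen = random.sample(REMOTE_FILES, n_files)
        start = time.time()
        for i, url in enumerate(chosen):
            subprocess.check_call(["xrdcp", "-f", url, "input_%d.root" % i])
        elapsed = time.time() - start
        # The real jobs also uploaded time to finish, events/s, MB/s and
        # their PanDA ID so that individual jobs could be investigated.
        print("copied %d files in %.1f s" % (n_files, elapsed))

    copy_test()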

TESTS

0.17% failure rate!

COPY

- Clearly not limited by worker-node links.
- Assuming just 30 simultaneous jobs, worst-case delivery rates are (per-job figure worked below):
  - BNL to CERN: 75 MB/s
  - CERN to AGLT2: 170 MB/s
  - MWT2 to AGLT2: 100 MB/s
  - AGLT2 to CERN: 90 MB/s
  - SLAC to BNL: 300 MB/s
- Average WAN access ~300 MB/s.

[Table: xrdcp rates in MB/s between BNL-ATLAS, CERN-PROD, MWT2, AGLT2 and SLAC as sources and destinations; the individual values did not survive extraction.]
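As a back-of-the-envelope check (added here, not on the slide), if 30 simultaneous jobs share the weakest measured link,

    \[
    \frac{75\ \mathrm{MB/s}}{30\ \mathrm{jobs}} = 2.5\ \mathrm{MB/s\ per\ job},
    \]

so even in the worst case each concurrent job still sees a usable share of the link.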

READ

- Jobs were reading 10% of events using a 30 MB TTreeCache (TTC); a sketch follows this list.
- 100% of the data is transferred and decompressed.
- ROOT can decompress our D3PDs at ~20 MB/s.
- Rates are the same as for xrdcp, except for local access.
- Over the WAN one should expect at least 50% of the CPU efficiency of local access.
- Fewer than 100 simultaneous standard analysis jobs will saturate a 10 Gb WAN link (10 Gb/s is roughly 1250 MB/s, so ~12.5 MB/s per job is enough).
- FAX needs to be used judiciously; it can easily overwhelm weaker links.

[Table: read rates in events/s between BNL-ATLAS, CERN-PROD, MWT2, AGLT2 and SLAC as sources and destinations; the individual values did not survive extraction.]
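A minimal sketch of the read pattern described above, using PyROOT to read roughly 10% of the events from a remote file over xrootd with a 30 MB TTreeCache. The URL and tree name are hypothetical placeholders; the actual tests used the FDR SMWZ D3PDs.

    import ROOT

    # Hypothetical remote D3PD reachable through the federation.
    url = "root://site-a.example.org//atlas/smwz/file1.root"
    f = ROOT.TFile.Open(url)
    tree = f.Get("physics")              # hypothetical tree name

    tree.SetCacheSize(30 * 1024 * 1024)  # 30 MB TTreeCache, as in the tests
    tree.AddBranchToCache("*", True)     # cache all branches

    for i in range(0, tree.GetEntries(), 10):  # every 10th event, ~10%
        tree.GetEntry(i)

    f.Close()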

MONALISA

[MonALISA monitoring plots.]

WAYS AHEAD

- Increase coverage, add redundancy, increase total bandwidth:
  - Enlargement of the federation
- Increase performance, reduce bandwidth needs:
  - Caching
  - Cost matrix (smart FAX)
  - Smart network: bandwidth requests, QoS assurance
- Improve adoption rate:
  - Presenting, teaching, preaching
  - New services
- Improve satisfaction:
  - FAX tuning
  - Application tuning
  - New services