ATLAS federated xrootd monitoring requirements
Rob Gardner
July 26, 2012

- We will need to iterate as we better understand how the federation is used
- Extend previous discussions from July 2011
- Attempt here to capture requirements from previous discussions
  – And input from Ilija Vukotic and Torre Wenaus (thanks!)
  – Will need to formalize within ATLAS, so this is highly preliminary

Recall July 2011 discussion

- Site-level metrics identified
  – WAN direct read access related: MB/s read, # remote connections
  – File caching related (e.g. FRM): MB/s into a site, # successful and # failed transfers, # active movers
- Aggregate locally and publish to a central collector for federation-level display (a sketch of such a summary record follows below)
- We will need to extend this for the production infrastructure
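To make the aggregation step concrete, here is a minimal sketch of a per-site summary record and its publication, assuming a hypothetical JSON schema and HTTP collector endpoint; the production collector may well use a different transport and field names.

```python
# Minimal sketch of the site-level summary record from the July 2011 list.
# Field names and the collector endpoint are illustrative assumptions,
# not the actual FAX schema.
import json
import time
import urllib.request

def site_summary(site, window):
    """Aggregate one reporting window of local xrootd counters."""
    return {
        "site": site,
        "timestamp": int(time.time()),
        # WAN direct-read related
        "wan_read_MBps": window["bytes_read_wan"] / 1e6 / window["seconds"],
        "remote_connections": window["remote_connections"],
        # File-caching (FRM) related
        "cache_in_MBps": window["bytes_staged_in"] / 1e6 / window["seconds"],
        "transfers_ok": window["transfers_ok"],
        "transfers_failed": window["transfers_failed"],
        "active_movers": window["active_movers"],
    }

def publish(summary, collector_url):
    """POST the JSON summary to a central collector (hypothetical endpoint)."""
    req = urllib.request.Request(
        collector_url,
        data=json.dumps(summary).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)
```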

General federation monitoring wishes

- Site-level metrics as well as aggregate, federation-level metrics, useful for assessing both functional status and performance
- Redirection statistics: fraction of the time accesses are local, redirected within a region, within a cloud, or globally (a classification sketch follows below)
- From the job and data management systems' perspective, we will need deeper information to build a profile of federated access patterns for a specific site and for collections of sites
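A minimal sketch of how the redirection fractions could be computed from individual access records, assuming hypothetical record fields and externally supplied site-to-region and site-to-cloud maps:

```python
# Sketch: classify each access by redirection scope and compute fractions.
# The topology maps (site -> region, site -> cloud) are assumed inputs;
# they are not part of the monitoring stream itself.
from collections import Counter

def redirect_scope(client_site, served_site, region_of, cloud_of):
    """Return 'local', 'region', 'cloud', or 'global' for one access."""
    if served_site == client_site:
        return "local"
    if region_of[served_site] == region_of[client_site]:
        return "region"
    if cloud_of[served_site] == cloud_of[client_site]:
        return "cloud"
    return "global"

def redirect_fractions(accesses, region_of, cloud_of):
    counts = Counter(
        redirect_scope(a["client_site"], a["served_site"], region_of, cloud_of)
        for a in accesses
    )
    total = sum(counts.values()) or 1
    return {scope: n / total for scope, n in counts.items()}
```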

Capturing a list

- Site availability and redirection functionality
  – SSB-like (Jarka) and SLS (central services)
- Aggregate federation IO accesses, overall and by site
  – With time histories
  – Redirection rates
  – Authentication successes/failures
- Global, cloud, and regional aggregate summaries for federated IO
- Number of files opened
  – Distinguish direct access versus copy
  – Distinguish local versus WAN
- IO rates
  – Distinguish direct access versus copy (hard!)
  – Distinguish local versus WAN (also hard!)
  – (heuristics for the two hard distinctions are sketched below)
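The two distinctions flagged as hard can only be approximated from the monitoring stream. A sketch of plausible heuristics follows; the field names and threshold are illustrative assumptions, not the actual xrootd summary schema.

```python
# Sketch of the two "hard" classifications, using heuristics only:
# a (nearly) full sequential read looks like a copy, partial reads look
# like direct access; local vs WAN is guessed from DNS domains.

def access_mode(open_record, copy_read_fraction=0.98):
    """'copy' if (nearly) the whole file was read, else 'direct'."""
    frac = open_record["bytes_read"] / max(open_record["file_size"], 1)
    return "copy" if frac >= copy_read_fraction else "direct"

def access_locality(open_record):
    """'local' if client and server share a DNS domain, else 'wan'."""
    client = open_record["client_host"].split(".", 1)[-1]
    server = open_record["server_host"].split(".", 1)[-1]
    return "local" if client == server else "wan"
```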

List, cont.

- Statistics for files actually used and their mode of access
- User statistics for direct access versus copy
- Viewable as current, real-time snapshots and as archives for time histories
- For brokerage, link with Ilija's cost matrix
  – "Click down" to get the story behind the cost (a hypothetical shape is sketched below)
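A hypothetical shape for such a drill-down: each cost-matrix cell keeps the measurements it was derived from, so the cost can be explained on demand. All names and numbers here are illustrative, not the actual cost-matrix format.

```python
# Hypothetical cost matrix with per-cell provenance for "click down".
cost_matrix = {
    ("MWT2", "AGLT2"): {
        "cost": 1.8,                     # relative read-cost estimate
        "samples": [                     # the story behind the number
            {"time": "2012-07-20T10:00Z", "read_MBps": 55.0},
            {"time": "2012-07-20T16:00Z", "read_MBps": 48.5},
        ],
    },
}

def explain_cost(src, dst):
    """Print the cost for a source/destination pair and its raw samples."""
    cell = cost_matrix[(src, dst)]
    print(f"{src} -> {dst}: cost {cell['cost']}")
    for s in cell["samples"]:
        print(f"  {s['time']}: {s['read_MBps']} MB/s")
```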

List, cont.

- Plots ranking sites by data served/consumed (file counts, byte counts, user counts; copy and direct)
- Plots ranking sites by availability and reliability
- File lifetime distributions by site
- "Active" data volume at a site, absolute and as a fraction of capacity, where an "active" file is one used in the last X weeks/months (see the sketch below)
- Fraction of file opens that find a copy local to the site versus having to open/retrieve a remote copy (redirection statistics)
- Plot of file age at deletion (cleanup), and plot of average file age at deletion by site
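Two of these plots reduce to simple per-site aggregations. A sketch, assuming hypothetical file-record fields (size, last access, creation and deletion timestamps) taken from storage dumps:

```python
# Sketch of "active" data volume and average file age at deletion.
from datetime import datetime, timedelta

def active_volume(files, weeks=4, now=None):
    """Bytes in files accessed within the last `weeks` weeks, plus the
    fraction of the total resident volume that represents."""
    now = now or datetime.utcnow()
    cutoff = now - timedelta(weeks=weeks)
    total = sum(f["size"] for f in files) or 1
    active = sum(f["size"] for f in files if f["last_access"] >= cutoff)
    return active, active / total

def avg_age_at_deletion(deletions):
    """Mean file age in days at cleanup time."""
    ages = [(d["deleted"] - d["created"]).days for d in deletions]
    return sum(ages) / len(ages) if ages else 0.0
```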

Summary

- In the coming weeks we will formalize something official to work from, after consulting more people within ATLAS ADC
- In the meantime we can focus on the obvious basic metrics