
ATLAS Distributed Data Management
Fernando H. Barreiro Megino, on behalf of the ATLAS DDM team
CERN IT Department, EGEE09, Barcelona

Overview
1. Introduction
2. ATLAS DDM Architecture
3. ATLAS DDM and the users
4. Summary

Necessity of Data Management in ATLAS
Data volumes are already large and will grow further during data taking:
– Generation of RAW data
– Processing and reprocessing
– Simulation production
– …
STEP09: stress test involving all key elements from data-taking to analysis
[Plot: data volumes, present vs. expected during data taking]

ATLAS DDM responsibilities
Central link between WLCG and the ATLAS analysis components
Manage the experiment's data, following the ATLAS Computing Model:
– Data movement between associated sites
– Bookkeeping & accounting
– Data access for production systems, physics metadata systems, analysis systems and end users

ATLAS Computing Model: Hierarchical Grid Organization
Tier-0 facility (CERN):
– Archival and distribution of the primary RAW detector data
– First-pass processing of the primary event stream
– Distribution of the derived data to the Tier-1s
10 Tier-1 data centers:
– Long-term access to all data
– Reprocessing capacity
~100 Tier-2 institutes:
– Analysis capacity for users & physics groups
– Monte Carlo simulation
[Diagram: event filter and reconstruction feed RAW, ESD and AOD into CASTOR at the Tier-0; RAW, ESD, AOD and MC data flow to the Tier-1s for re-reconstruction and archiving, and ESD, AOD and MC data to the Tier-2s for Monte Carlo production and analysis]
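To make the hierarchy concrete, here is a minimal sketch in Python; the role and data-placement assignments below are simplified assumptions for illustration, not the official replication policy:

```python
# Minimal sketch of the tiered organization described above.
# The role/data assignments are simplified assumptions; the
# authoritative policy is the ATLAS Computing Model.
PLACEMENT = {
    "Tier-0": {"roles": ["archival", "first-pass processing"],
               "data":  ["RAW", "ESD", "AOD"]},
    "Tier-1": {"roles": ["long-term custody", "reprocessing"],
               "data":  ["RAW", "ESD", "AOD", "MC"]},
    "Tier-2": {"roles": ["user analysis", "MC simulation"],
               "data":  ["ESD", "AOD", "MC"]},
}

def tiers_holding(data_type):
    """Return the tiers expected to hold a given type of data."""
    return [tier for tier, info in PLACEMENT.items()
            if data_type in info["data"]]

print(tiers_holding("RAW"))   # ['Tier-0', 'Tier-1']
```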

ATLAS Computing Model – WLCG Structure
Cloud model: three different Grid flavors with different middleware:
– Open Science Grid (OSG): US
– NorduGrid (NDGF): Nordic European countries
– EGEE: Europe and the rest of the world

DDM Overview
[Diagram: DDM components spanning the OSG, EGEE and NorduGrid infrastructures]

DDM Bookkeeping – The Central Catalogues
Datasets: collections of files acting as transfer and organization units
Subscriptions: replication requests for ATLAS datasets
Dataset catalogues (Oracle), exposed through the DQ2 catalog API (Apache + mod_python):

  Catalogue      Question it answers
  Repository     What datasets exist in the system?
  Content        What files are in every dataset?
  Location       Where is the dataset located?
  Subscription   What subscription requests exist in the system?
  Container      What is the hierarchical organization of the datasets?
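A minimal sketch of how a client could query such catalogues over HTTP; the base URL, endpoint names and parameters below are invented for illustration and are not the real DQ2 catalog API:

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint layout for illustration only; the real DQ2
# catalog API (Apache + mod_python) exposes different paths.
BASE = "https://ddm-catalog.example.cern.ch"

def query(catalogue, **params):
    """Query one of the central catalogues (repository, content,
    location, subscription, container) and decode a JSON reply."""
    url = f"{BASE}/{catalogue}?{urllib.parse.urlencode(params)}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

# Typical lookups mirroring the table above (all names are assumptions):
# query("repository", pattern="data09*")            # which datasets exist
# query("content", dataset="data09.00140541.RAW")   # files in a dataset
# query("location", dataset="data09.00140541.RAW")  # where it is replicated
```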

DDM Data Movement – Site Services (simplified schema)
A chain of agents drives the subscriptions held in the Site Services (SS) database:
– Fetch agent: pulls subscriptions from the Subscription catalogue and dataset content from the Content catalogue, resolves file sources in the LFCs, and updates subscriptions, sources & dataset content in the SS database
– Submit agent: takes the files to transfer and submits transfer jobs to FTS
– Poll agent: polls FTS for the transfer status and updates the transfer jobs & states
– Register agent: registers the transferred files in the LFC
– Callback agent: reports the status of transfers, registrations and subscriptions via HTTP callbacks
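The agent chain can be sketched as a small producer/consumer pipeline; the sketch below uses in-memory queues in place of the SS database and pretends every transfer succeeds, so all names and the FTS/LFC interactions are stand-ins:

```python
import queue
import threading

# In-memory stand-ins for the work queues that the real Site
# Services keep in their database.
to_transfer = queue.Queue()
to_register = queue.Queue()

def fetch_agent(subscriptions):
    """Expand each subscribed dataset into files needing transfer
    (the source lookup against the LFCs is elided here)."""
    for dataset, files in subscriptions.items():
        for name in files:
            to_transfer.put((dataset, name))

def submit_and_poll_agent():
    """Stand-in for the submit and poll agents: 'submit' each file
    and, once 'transferred', hand it on for registration."""
    while True:
        dataset, name = to_transfer.get()
        if dataset is None:               # sentinel: no more work
            break
        to_register.put((dataset, name))  # pretend FTS succeeded

def register_agent(catalogue):
    """Stand-in for the register agent: record completed replicas
    (the real agent writes them into the local LFC)."""
    while not to_register.empty():
        dataset, name = to_register.get()
        catalogue.setdefault(dataset, []).append(name)

subs = {"data09.example.RAW": ["file1", "file2"]}
fetch_agent(subs)
to_transfer.put((None, None))             # stop the worker
worker = threading.Thread(target=submit_and_poll_agent)
worker.start()
worker.join()
replicas = {}
register_agent(replicas)
print(replicas)   # {'data09.example.RAW': ['file1', 'file2']}
```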

DDM Monitoring – The ATLAS DDM Dashboard
Monitoring for:
– Shifters: 24/7 follow-up of DDM activity
– Site/cloud operators: overview of site/cloud activity
– VO managers: overview of the whole activity
– End users: state of subscriptions

DDM and the users
User access to data is chaotic by nature; behavior cannot be foreseen
Computing model: users should send their jobs to the data
Users want their data immediately:
– dq2-get & direct storage access vs. Site Services subscriptions
– dq2-get abuse can lead to Storage Element overload, network congestion and DDM service degradation
[Plot: MB/s and errors/hour on 10th September 2008, the first beam day: a single user brings down a site]
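For illustration of the failure mode only (this is not an ATLAS tool), a client that caps its concurrent copies avoids hammering a Storage Element the way an unthrottled bulk dq2-get does; the copy function here is a hypothetical placeholder:

```python
import shutil
import threading

# At most MAX_CONCURRENT simultaneous copies; an unthrottled bulk
# retrieval behaves as if this bound did not exist.
MAX_CONCURRENT = 4
slots = threading.BoundedSemaphore(MAX_CONCURRENT)

def copy_file(source, destination):
    """Hypothetical stand-in for retrieving one file from a
    Storage Element."""
    shutil.copy(source, destination)

def throttled_get(source, destination):
    with slots:   # blocks while MAX_CONCURRENT copies are running
        copy_file(source, destination)

# Usage (replica_list is an assumed [(source, destination), ...] input):
# threads = [threading.Thread(target=throttled_get, args=pair)
#            for pair in replica_list]
# for t in threads: t.start()
# for t in threads: t.join()
```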

DDM and the users – Current work
Not all data will be accessed with equal regularity:
– Understand user behavior and anticipate future workload by tracing the users' operations
– Predict workload using ARMA models (see the sketch below)
– Distribute important data before it is needed: the HOTDISK space token; sites are responsible for replicating files in this space token to different pools
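A minimal sketch of the ARMA idea on a synthetic hourly request series, using statsmodels; the order (2, 0, 1), the synthetic data and all variable names are assumptions for illustration, not the actual DDM prediction pipeline:

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# Synthetic hourly counts of dataset requests; in DDM these would
# come from the traces of user operations mentioned above.
rng = np.random.default_rng(0)
hours = np.arange(24 * 14)
requests = (50 + 20 * np.sin(2 * np.pi * hours / 24)  # daily cycle
            + rng.normal(0, 5, hours.size))           # noise

# ARMA(p, q) is ARIMA with d=0: the AR terms capture persistence,
# the MA terms capture short-lived bursts.
model = ARIMA(requests, order=(2, 0, 1)).fit()
forecast = model.forecast(steps=24)   # next day's expected workload
print(forecast.round(1))
```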

DDM and the users – Accounting

Summary and conclusions
– ATLAS DDM has been successfully handling all experiment data since 2005
– Major challenges have been passed successfully: ATLAS DDM is capable of handling far more subscriptions than defined in the Computing Model
– User workload will become the critical factor once we have real data

References
– "Distributed Data Management in the ATLAS Experiment", Mario Lassnig
– "Distributed Data Management in ATLAS", Ricardo Rocha
– "ATLAS, the Grid and the UK", Roger Jones
– "The ATLAS Computing Model", D. Adams et al.
– "ATLAS STEP09", Graeme Stewart

Backup slides

Main types of ATLAS data

  Name                        Description                                          Size
  Monte Carlo                 Event generator output                               –
  RAW                         Detector output, byte-stream format                  1.6 MB/ev
  Event Summary Data (ESD)    Full output of reconstruction, object format         1.0 MB/ev
  Analysis Object Data (AOD)  Summary of reconstruction; primary analysis data     0.2 MB/ev
  TAG                         Thumbnail of each event, used for identifying
                              interesting events at the analysis stage             0.01 MB/ev
  Derived Physics Data (DPD)  Skimmed, slimmed, thinned events derived from AODs   0.01 MB/ev
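As a rough worked example of what these per-event sizes imply (the 200 Hz recording rate is an assumption for illustration, not a figure from this talk):

```python
# Per-event sizes taken from the table above (MB/event).
SIZES_MB = {"RAW": 1.6, "ESD": 1.0, "AOD": 0.2, "TAG": 0.01, "DPD": 0.01}

# Assumed running conditions, for illustration only.
TRIGGER_RATE_HZ = 200            # events recorded per second
SECONDS_PER_DAY = 86_400

events_per_day = TRIGGER_RATE_HZ * SECONDS_PER_DAY    # 17.28M events
for name, mb_per_event in SIZES_MB.items():
    tb_per_day = events_per_day * mb_per_event / 1e6  # MB -> TB
    print(f"{name:4s}: {tb_per_day:6.2f} TB/day")
# RAW alone comes to ~27.6 TB/day, which is why it is archived at the
# Tier-0/Tier-1s while only derived formats are replicated widely.
```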

DDM Deployment Model
– Central Catalogues and Site Services hosted at CERN
– LFC and FTS at the Tier-1s
– SRM at every site
[Diagram: the Central Catalogues at CERN serving multiple Site Services instances]

CCRC 08: T0 -> T1s throughput
– Throughput tests: burst subscriptions injected every 4 hours and immediately honored
– Failover tests: a 12-hour backlog fully recovered in 30 minutes
– All experiments in the game
[Plot: throughput in MB/s]

STEP 09
– Moved 4 PB in 2 weeks
– Traffic rates up to 5.5 GB/s