Data Management Highlights in TSA3.3 Services for HEP

1 Data Management Highlights in TSA3.3 Services for HEP
  Fernando Barreiro Megino, Domenico Giordano, Maria Girone, Elisa Lanciotti, Daniele Spiga
  on behalf of CERN-IT-ES-VOS and SA3
  EGI Technical Forum, 22.9.2011
  www.egi.eu, EGI-InSPIRE RI-261323

2 Outline
  Introduction: WLCG today
  LHCb Accounting
  Storage Element and File Catalogue consistency
  ATLAS Distributed Data Management: Breaking cloud boundaries
  CMS Popularity and Automatic Site Cleaning
  Conclusions

4 WLCG today
  4 experiments: ALICE, ATLAS, CMS, LHCb
  Over 140 sites
  ~150k CPU cores
  >50 PB disk
  A few thousand users
  O(1M) file transfers/day
  O(1M) jobs/day

6 LHCb Accounting
  An agent generates an accounting report daily, based on the information available in the bookkeeping system
  Metadata breakdown:
    Location
    Data type
    Event type
    File type
  The information is displayed in a dynamic web page
  The reports are currently the main input for clean-up campaigns
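
The daily report described on this slide is essentially an aggregation of file counts and sizes per metadata value. The following is a minimal sketch of that idea, not the LHCb agent itself: the record layout, the field names and the sample values are all hypothetical.

```python
from collections import defaultdict

# Hypothetical bookkeeping records; field names and values are assumptions,
# not the real LHCb bookkeeping schema.
records = [
    {"location": "SITE_A", "data_type": "type-1", "event_type": "evt-1",
     "file_type": "ft-1", "size_bytes": 400},
    {"location": "SITE_A", "data_type": "type-2", "event_type": "evt-1",
     "file_type": "ft-2", "size_bytes": 100},
    {"location": "SITE_B", "data_type": "type-1", "event_type": "evt-2",
     "file_type": "ft-1", "size_bytes": 250},
]

BREAKDOWN_KEYS = ("location", "data_type", "event_type", "file_type")


def accounting_report(records, keys=BREAKDOWN_KEYS):
    """Return, for each metadata key, the file count and total size per value."""
    report = {key: defaultdict(lambda: {"files": 0, "bytes": 0}) for key in keys}
    for rec in records:
        for key in keys:
            entry = report[key][rec[key]]
            entry["files"] += 1
            entry["bytes"] += rec["size_bytes"]
    return report


report = accounting_report(records)
```

Each breakdown (for example `report["location"]`) could then feed one table or plot of the web page.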

8 Storage Element and File Catalogue consistency
  Grid Storage Elements (SEs) are decoupled from the File Catalogue (FC), so inconsistencies can arise:
    1. Dark data: data in the SEs, but not in the FC. Waste of disk space
    2. Lost/corrupted files: data in the FC, but not in the SEs. Operational problems, e.g. failing jobs
  Dark data is identified through consistency checks using full storage dumps
  Need one common format and procedure that covers
    various SEs: DPM, dCache, StoRM and CASTOR
    three experiments: ATLAS, CMS and LHCb
  Decision:
    Text format and XML format
    Required information: space token, LFN (or PFN), file size, creation time and checksum
    The storage dump should be provided on a weekly/monthly basis or on demand
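
The check described on this slide amounts to comparing two sets of file names. The sketch below illustrates it with plain Python dictionaries; it does not parse the agreed text or XML dump formats. The entry layout, the `grace_seconds` window (which skips files too recent to have been registered yet) and the size/checksum comparison used to flag possibly corrupted files are assumptions added for illustration.

```python
def check_consistency(storage_dump, catalogue, now, grace_seconds=86400):
    """Compare a storage dump with the file catalogue.

    storage_dump: list of dicts with keys lfn, size, ctime, checksum (assumed layout)
    catalogue:    list of dicts with keys lfn, size, checksum (assumed layout)
    now:          current time, in the same unit as ctime (seconds)
    """
    dump = {entry["lfn"]: entry for entry in storage_dump}
    cat = {entry["lfn"]: entry for entry in catalogue}

    # Dark data: on the SE but not in the FC. Very recent files are skipped,
    # since they may simply not be registered yet.
    dark = sorted(
        lfn for lfn in dump.keys() - cat.keys()
        if now - dump[lfn]["ctime"] > grace_seconds
    )
    # Lost files: in the FC but not on the SE.
    lost = sorted(cat.keys() - dump.keys())
    # Possibly corrupted: present in both, but size or checksum disagree.
    mismatched = sorted(
        lfn for lfn in dump.keys() & cat.keys()
        if (dump[lfn]["size"], dump[lfn]["checksum"])
        != (cat[lfn]["size"], cat[lfn]["checksum"])
    )
    return {"dark": dark, "lost": lost, "mismatched": mismatched}


# Hypothetical example data.
storage_dump = [
    {"lfn": "/a", "size": 10, "ctime": 0, "checksum": "aa"},
    {"lfn": "/b", "size": 20, "ctime": 0, "checksum": "bb"},
    {"lfn": "/c", "size": 5, "ctime": 999_000, "checksum": "cc"},
    {"lfn": "/e", "size": 30, "ctime": 0, "checksum": "e1"},
]
catalogue = [
    {"lfn": "/a", "size": 10, "checksum": "aa"},
    {"lfn": "/d", "size": 7, "checksum": "dd"},
    {"lfn": "/e", "size": 30, "checksum": "e2"},
]
result = check_consistency(storage_dump, catalogue, now=1_000_000)
```

Here `/b` is reported as dark data, `/d` as lost and `/e` as mismatched, while `/c` is ignored because it was created within the grace window.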

9 Example of good synchronization: LHCb storage usage at CNAF
  CNAF provides storage dumps daily
  Checks are done centrally with LHCb Data Management tools
  Preliminary results: good SE-LFC agreement
  Small discrepancies (O(1 TB)) are not a real problem. They can be due to the delay between uploading a file to the SE and registering it in the LFC, and to the delay in refreshing the information in the LHCb database

11 Original data distribution model
  Hierarchical tier organization based on the MONARC network topology, developed over a decade ago
  Sites are grouped into clouds for organizational reasons
  Possible communications:
    Optical Private Network: T0-T1, T1-T1
    National networks: intra-cloud T1-T2
  Restricted communications (general public network):
    Inter-cloud T1-T2
    Inter-cloud T2-T2
  But the network capabilities are not the same anymore!
  Many use cases require breaking these boundaries!

12 Machinery in place
  Purpose: generate full-mesh transfer statistics for monitoring and site commissioning, and to feed back into the system
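
Full-mesh statistics can be thought of as one set of counters per (source, destination) link. The sketch below shows one possible aggregation; the transfer record layout and the site names are hypothetical, and the actual ATLAS machinery is not described in the slides.

```python
from collections import defaultdict


def full_mesh(transfers):
    """Aggregate transfer records into per-link counters keyed by (src, dst)."""
    stats = defaultdict(lambda: {"ok": 0, "failed": 0, "bytes": 0, "seconds": 0.0})
    for t in transfers:
        link = stats[(t["src"], t["dst"])]
        if t["ok"]:
            link["ok"] += 1
            link["bytes"] += t["bytes"]
            link["seconds"] += t["seconds"]
        else:
            link["failed"] += 1
    return stats


def throughput_mb_s(link):
    """Average throughput of successful transfers on one link, in MB/s."""
    return link["bytes"] / 1e6 / link["seconds"] if link["seconds"] else 0.0


# Hypothetical transfer records.
transfers = [
    {"src": "SITE_A", "dst": "SITE_B", "ok": True, "bytes": 1e9, "seconds": 10.0},
    {"src": "SITE_A", "dst": "SITE_B", "ok": True, "bytes": 1e9, "seconds": 10.0},
    {"src": "SITE_A", "dst": "SITE_B", "ok": False, "bytes": 0, "seconds": 0.0},
    {"src": "SITE_B", "dst": "SITE_A", "ok": True, "bytes": 1e9, "seconds": 100.0},
]
mesh = full_mesh(transfers)
forward = throughput_mb_s(mesh[("SITE_A", "SITE_B")])
backward = throughput_mb_s(mesh[("SITE_B", "SITE_A")])
```

Comparing `forward` with `backward` for each pair of sites is one simple way to spot asymmetric throughput on a link.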

13 Consequences
  Link commissioning
    Sites optimizing network connections, e.g. UK experience: http://tinyurl.com/3p23m2p
    Revealed different network issues, e.g. asymmetric network throughput for various sites (also affecting other experiments)
  Definition of T2Ds: “Directly connected T2s”
    Commissioned sites with good network connectivity
    These sites benefit from closer transfer policies
  Gradual flattening of the ATLAS Computing Model in order to reduce limitations on
    Dynamic data placement
    Output collection of multi-cloud analysis
  Current development of a generic, detailed FTS monitor
    FTS servers publishing file-level information (CERN-IT-GT)
    Expose info through a generic web interface and API (CERN-IT-ES)

15 CMS Popularity
  To manage storage more efficiently, it is important to know which data (i.e. which files) is accessed most and what the access patterns are
  The CMS Popularity service now tracks the utilization of 30 PB of files over more than 50 sites
  Data flow (from the slide diagram):
    CRAB (CMS distributed analysis framework): input files, input blocks, LumiRanges
    Dashboard DB
    Pull and translate jobs to file-level entities
    Popularity DB
    Popularity information: Popularity web frontend and external systems (e.g. cleaning agent)
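
The step that translates jobs into file-level entities can be pictured as counting, for every input file, how many jobs read it and how many distinct users submitted those jobs. The sketch below assumes a hypothetical job record with `user` and `input_files` fields; it is not the schema of the Dashboard DB or the Popularity DB.

```python
from collections import Counter


def file_popularity(jobs):
    """Translate job records into per-file access counts and distinct users."""
    accesses = Counter()
    users = {}
    for job in jobs:
        for lfn in job["input_files"]:
            accesses[lfn] += 1
            users.setdefault(lfn, set()).add(job["user"])
    return {
        lfn: {"accesses": count, "users": len(users[lfn])}
        for lfn, count in accesses.items()
    }


# Hypothetical job records.
jobs = [
    {"user": "u1", "input_files": ["/f1", "/f2"]},
    {"user": "u2", "input_files": ["/f1"]},
    {"user": "u1", "input_files": ["/f1"]},
]
popularity = file_popularity(jobs)
```

Files that never appear in any job are absent from the result, which is exactly the information a cleaning agent would look for.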

16 CMS Popularity Monitoring

17 Automatic site cleaning
  Equally important to know what data is not accessed!
  Automatic procedures for site clean-up
  Victor: agent running daily on a dedicated machine
    1. Selection of groups filling their pledge on T2s (group pledges & PhEDEx)
    2. Selection of unpopular replicas (Popularity service & PhEDEx)
    3. Publication of decisions (PhEDEx, Popularity Web)
  Information exchanged (diagram labels): used & pledged space; replica popularity; space information; replicas to delete; deleted replicas and group-site association information
  Project initially developed for ATLAS, now extended for CMS
  Plug-in architecture
    Common core
    Experiment-specific plug-ins wrapping their Data Management API calls
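
The first two steps of the cleaning agent can be sketched as follows. This is an illustration under stated assumptions, not Victor's implementation: the `threshold` and `target` fractions, the data layout and the "least accessed first" ordering are all hypothetical choices, and the publication step is reduced to returning the decisions.

```python
def select_replicas_to_delete(replicas, used, pledged, target=0.8):
    """Pick the least-accessed replicas until usage would drop to target * pledged."""
    to_free = used - target * pledged
    chosen = []
    for rep in sorted(replicas, key=lambda r: r["accesses"]):
        if to_free <= 0:
            break
        chosen.append(rep["name"])
        to_free -= rep["size"]
    return chosen


def run_cleaning(space, replicas_by_assoc, threshold=0.9, target=0.8):
    """Step 1: find group-site associations filling their pledge.
    Step 2: select unpopular replicas for each of them.
    The returned dict stands in for step 3 (publication of decisions)."""
    decisions = {}
    for assoc, s in space.items():
        if s["used"] >= threshold * s["pledged"]:
            decisions[assoc] = select_replicas_to_delete(
                replicas_by_assoc.get(assoc, []), s["used"], s["pledged"], target
            )
    return decisions


# Hypothetical input: space in TB per (group, site) association.
space = {
    ("groupA", "T2_X"): {"used": 95, "pledged": 100},
    ("groupB", "T2_X"): {"used": 50, "pledged": 100},
}
replicas_by_assoc = {
    ("groupA", "T2_X"): [
        {"name": "r3", "size": 10, "accesses": 100},
        {"name": "r1", "size": 10, "accesses": 0},
        {"name": "r2", "size": 10, "accesses": 5},
    ],
}
decisions = run_cleaning(space, replicas_by_assoc)
```

In a plug-in design like the one on the slide, the functions that fetch `space` and `replicas_by_assoc` would be the experiment-specific part, while the selection logic would stay in the common core.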

19 Conclusions
  The first 2 years of data-taking experience on the LHC were successful
  Data volumes and user activity keep increasing
  We are learning how to operate the infrastructure efficiently
  Common challenges for all experiments:
    Automate daily operations
    Optimize the usage of storage and network resources
    Evolving computing models
    Improving data placement strategies

