
1 “Replica Management in LCG”
Workshop on Spatiotemporal Databases for Geosciences, Biomedical sciences and Physical sciences
James Casey, Grid Deployment Group, CERN
E-Science Institute, Edinburgh, 2nd November 2005

2 Talk Overview
LHC and the Worldwide LCG Project
LHC Data Management Architecture
Replica Management Components: Storage, Catalog, Data Movement, User APIs & tools

3 The LHC Experiments: CMS, ATLAS, LHCb

4 The ATLAS Detector
The ATLAS collaboration is ~2000 physicists from ~150 universities and labs in ~34 countries, with distributed resources and remote development.
The ATLAS detector is 26 m long, stands 20 m high (about a 5-storey building), weighs 7000 tons and has 200 million read-out channels.

5 ATLAS Detector Data Acquisition
Multi-level trigger: filters out background and reduces the data volume.
40 MHz interaction rate, equivalent to 2 PetaBytes/sec.
Level 1: special hardware. Level 2: embedded processors. Level 3: giant PC cluster.
160 Hz (320 MB/sec) to data recording & offline analysis.
Record data 24 hours a day, 7 days a week: equivalent to writing a CD every 2 seconds.
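As a cross-check of the "CD every 2 seconds" figure: 320 MB/s × 2 s ≈ 640 MB, roughly the capacity of one CD.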

6 Worldwide LCG Project - Rationale
Satisfies the common computing needs of the LHC experiments:
Need to support 5000 scientists at 500 institutes
Estimated project lifetime: 15 years
Processing requirements: 100,000 CPUs (2004 units)
A traditional, centralised approach was ruled out in favour of a globally distributed grid for data storage and analysis:
Costs of maintaining and upgrading a distributed system are more easily handled: individual institutes and organisations can fund local computing resources and retain responsibility for them, while still contributing to the global goal.
No single points of failure: multiple copies of data and automatic reassignment of tasks to available resources ensure optimal use of resources.
Spanning all time zones also facilitates round-the-clock monitoring and support.

7 LCG Service Deployment Schedule

8 Data Handling and Computation for Physics Analysis
[Diagram: data flow from the detector through the event filter (selection & reconstruction), reconstruction, reprocessing and event simulation, producing raw data, event summary data, processed data and analysis objects (extracted by physics topic), which feed batch and interactive physics analysis]

9 WLCG Service Hierarchy
Tier-0 (the accelerator centre): data acquisition & initial processing; long-term data curation; distribution of data to the Tier-1 centres.
Tier-1 centres: Canada – TRIUMF (Vancouver); France – IN2P3 (Lyon); Germany – Forschungszentrum Karlsruhe; Italy – CNAF (Bologna); Netherlands – NIKHEF (Amsterdam); Nordic countries – distributed Tier-1; Spain – PIC (Barcelona); Taiwan – Academia Sinica (Taipei); UK – CLRC (Didcot); US – FermiLab (Illinois) and Brookhaven (NY).
Tier-1: "online" to the data acquisition process, hence high availability; managed mass storage as a grid-enabled data service; data-intensive analysis; national and regional support.
Tier-2 (~100 centres in ~40 countries): simulation; end-user analysis, batch and interactive.
Les Robertson

10 How much data in one year?
Scale: a CD stack holding 1 year of LHC data would be ~20 km tall (for comparison: a balloon flies at 30 km, Concorde at 15 km, Mt. Blanc is 4.8 km).
Storage space: data produced is ~15 PB/year; space provided at all tiers is ~80 PB.
Network bandwidth: 70 Gb/s to the big centres, over direct dedicated lightpaths to all centres, used only for Tier-0 -> Tier-1 data distribution.
Number of files: ~40 million files, assuming 2 GB files and a 15-year run.
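One way to arrive at the ~40 million figure, assuming it refers to the ~80 PB of provisioned space quoted above: 80 PB / 2 GB per file = 40 × 10^6 files.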

11 Data Rates to Tier-1s for p-p running
Centre (experiment shares for ALICE / ATLAS / CMS / LHCb; rate into T1 for pp running, MB/s):
ASGC, Taipei: -, 8%, 10%; 100 MB/s
CNAF, Italy: 7%, 13%, 11%; 200 MB/s
PIC, Spain: 5%, 6.5%
IN2P3, Lyon: 9%, 27%
GridKA, Germany: 20%
RAL, UK: 3%, 15%; 150 MB/s
BNL, USA: 22%
FNAL, USA: 28%
TRIUMF, Canada: 4%; 50 MB/s
NIKHEF/SARA, NL: 23%
Nordic Data Grid Facility: 6%
Total: 1,600 MB/s
This is 135 TB/day of actual data to be distributed. These rates must be sustained to tape 24 hours a day, 100 days a year. Extra capacity is required to cater for backlogs and peaks. This is currently our biggest data management challenge.
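As a cross-check of the daily volume: 1,600 MB/s × 86,400 s/day ≈ 1.4 × 10^8 MB, i.e. roughly 135–140 TB per day, consistent with the figure above.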

12 Problem definition in one line…
"…to distribute, store and manage the high volume of data produced as the result of running the LHC experiments and allow subsequent 'chaotic' analysis of the data"
The data comprises: raw data ~90%; processed data ~10%; "relational" metadata ~1%; "middleware-specific" metadata ~0.001%.
The main problem is movement of the raw data: to the Tier-1 sites as an "online" process (volume), and to the analysis sites (chaotic access pattern).
We're really dealing with the non-analysis use cases right now.

13 Replica Management Model
Write-once/read-many files: avoids the issue of replica consistency; no mastering.
Users access data via a logical name; the actual filename on the storage system is irrelevant.
No strong authorization on the storage itself: all users in a VO are considered the same; user identity is not used on the MSS; storage uses Unix permissions; different users represent different "roles" (e.g. experiment production managers); group == VO.
Simple user-initiated replication model: the upload/replicate/download cycle (sketched below).
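To make the cycle concrete, here is a minimal, self-contained Python sketch of the model: write-once files, access by logical name, and a GUID-keyed replica catalog. All function names, LFNs and srm:// paths are invented for illustration; this is not the lcg_util API.

```python
import uuid

# Toy in-memory model of the replica management model described above.
lfn_to_guid = {}       # logical file name -> GUID
replica_catalog = {}   # GUID -> list of SURLs (physical copies)
storage = {}           # SURL -> file content (stands in for a Storage Element)

def upload(content, lfn, se):
    """Copy a file to a Storage Element and register the LFN and first replica."""
    guid = str(uuid.uuid4())
    surl = f"srm://{se}/data/{guid}"
    storage[surl] = content              # write once...
    lfn_to_guid[lfn] = guid
    replica_catalog[guid] = [surl]
    return guid

def replicate(lfn, se):
    """Create an additional physical copy at another site; no consistency protocol needed."""
    guid = lfn_to_guid[lfn]
    source = replica_catalog[guid][0]
    surl = f"srm://{se}/data/{guid}"
    storage[surl] = storage[source]      # ...read many
    replica_catalog[guid].append(surl)

def download(lfn, close_domain):
    """Fetch a copy by logical name, preferring a 'close' replica (same network domain)."""
    guid = lfn_to_guid[lfn]
    replicas = replica_catalog[guid]
    close = [s for s in replicas if close_domain in s]
    return storage[(close or replicas)[0]]

upload(b"raw event data", "lfn:/grid/atlas/run1/file001", "se.cern.ch")
replicate("lfn:/grid/atlas/run1/file001", "se.gridka.de")
data = download("lfn:/grid/atlas/run1/file001", "gridka.de")
```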

14 Replica Management Model
All replicas are considered the same.
A replica is "close" if it is in the same network domain, or if it is explicitly made close to a particular cluster, either by the information system or by local environment variables.
This is basically the model inherited from the European DataGrid (EDG) Data Management software, although all the software has been replaced!

15 Replica Management components
Each file has a unique Grid ID (GUID). The locations corresponding to the GUID are kept in the Replica Catalog.
Users select data via metadata, which lives in the Experiment Metadata Catalog.
The File Transfer Service provides reliable asynchronous third-party file transfer.
The client interacts with the grid via the experiment framework and the LCG APIs.
Files have replicas stored at many Grid sites, on Storage Elements.

16 Software Architecture
Layered architecture: experiments hook in at whatever layer they require.
Focus on core services: experiments integrate them into their own replication frameworks, since it is not possible to provide a generic data management model for all four experiments.
Provide C/Python/Perl APIs and some simple CLI tools.
The data management model is still based on the EDG model. The suggested change is the introduction of a better security model, but our users don't really care about it, only about the performance penalty it gives them!

17 Software Architecture
The LCG software model was heavily influenced by EDG: the first LCG middleware releases came directly out of the EDG project.
Globus 2.4 is used as a basic lower layer: GridFTP for data movement; the Globus GSI security model and httpg for web service security.
We are heavily involved in EGEE: we take components out of the EGEE gLite release and integrate them into the LCG release.
We also write the components we need ourselves, but that should be a very last resort! (The LCG Data Management team is ~2 FTE.)

18 Layered Data Management APIs
[Layer diagram, top to bottom: Experiment Framework and User Tools; Data Management (Replication, Indexing, Querying) via lcg_utils; Cataloging, Storage and Data transfer via GFAL; component-specific APIs underneath: EDG catalogs / LFC, SRM, Classic SE, File Transfer Service, Globus GridFTP]

19 Summary: What can we do?
Store the data: managed Grid-accessible storage, including an interface to MSS.
Find the data: experiment metadata catalog; Grid replica catalogs.
Access the data: LAN "posix-like" protocols; gridftp on the WAN.
Move the data: asynchronous high-bandwidth data movement; throughput more important than latency.

20 Storage Model
We must manage storage resources in an unreliable, distributed, large, heterogeneous system, and make the MSS at Tier-0/Tier-1 and the disk-based storage appear the same to the users.
Long-lasting, data-intensive transactions: we can't afford to restart jobs, and we can't afford to lose data, especially from the experiments.
Heterogeneity: operating systems; MSS (HPSS, Enstore, CASTOR, TSM); disk systems (system-attached, network-attached, parallel).
Management issues: we need to manage more storage with fewer people.

21 Storage Resource Manager (SRM)
A collaboration between LBNL, CERN, FNAL, RAL and Jefferson Lab, which became the GGF Grid Storage Management Working Group.
Provides a common interface to Grid storage, exposed as a web service, with negotiable transfer protocols (GridFTP, gsidcap, RFIO, …), as sketched below.
We use three different implementations: the CERN CASTOR SRM (for the CASTOR MSS); the DESY/FNAL dCache SRM; and the LCG DPM, a disk-only lightweight SRM for Tier-2s.
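The essence of the SRM interface is protocol negotiation: the client presents a SURL plus the transfer protocols it can speak, and the storage returns a transfer URL (TURL) in a protocol both sides support. A toy Python sketch of that idea follows; the hostnames, protocol lists and function name are invented for illustration and are not the SRM v1 operations.

```python
# Which transfer protocols each (assumed, illustrative) SRM endpoint supports.
SITE_PROTOCOLS = {
    "castorsrm.cern.ch": ["rfio", "gsiftp"],     # CASTOR-like site (assumption)
    "dcache.fnal.gov":   ["gsidcap", "gsiftp"],  # dCache-like site (assumption)
    "dpm.tier2.example": ["rfio", "gsiftp"],     # DPM-like Tier-2 (assumption)
}

def srm_get_turl(surl, client_protocols):
    """Negotiate a transfer protocol and return a TURL for the given SURL."""
    rest = surl[len("srm://"):]
    host, _, path = rest.partition("/")
    for proto in client_protocols:               # client's preference order
        if proto in SITE_PROTOCOLS.get(host, []):
            return f"{proto}://{host}/{path}"
    raise RuntimeError("no mutually supported transfer protocol")

# A WAN copy would ask for gsiftp; a local job might prefer gsidcap or rfio.
print(srm_get_turl("srm://dcache.fnal.gov/pnfs/cms/file001", ["gsidcap", "gsiftp"]))
```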

22 SRM / MSS by Tier-1
(Columns: Centre, SRM, MSS, Tape H/W)
Canada, TRIUMF: dCache, TSM
France, CC-IN2P3: HPSS, STK
Germany, GridKA: LTO3
Italy, CNAF: CASTOR, STK 9940B
Netherlands, NIKHEF/SARA: DMF
Nordic Data Grid Facility: DPM, N/A
Spain, PIC Barcelona
Taipei, ASGC
UK, RAL: ADS, CASTOR(?)
USA, BNL
USA, FNAL: ENSTOR

23 Catalog Model
Experiments own and control the metadata catalog; all interaction with grid files is via a GUID (or LFN) obtained from their metadata catalog.
Two models for tracking replicas (compared in the sketch below):
A single global replica catalog (LHCb).
A central metadata catalog that stores pointers to site-local catalogs, which hold the replica information (ALICE/ATLAS/CMS).
Different implementations are used: the LHC File Catalog (LFC), Globus RLS, and experiment-developed catalogs.
This is a "simple" problem, but we keep revisiting it.
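A toy comparison of the two replica-tracking models, with invented in-memory catalogs rather than any real catalog API:

```python
# Model 1 (LHCb-style): one global catalog maps GUID -> all replicas.
global_catalog = {
    "guid-001": ["srm://se.cern.ch/f1", "srm://se.ral.ac.uk/f1"],
}

def locate_global(guid):
    return global_catalog[guid]

# Model 2 (ALICE/ATLAS/CMS-style): the central catalog only records which
# sites hold the file; each site-local catalog resolves GUID -> local SURL.
central_index = {"guid-001": ["cern.ch", "ral.ac.uk"]}
site_catalogs = {
    "cern.ch":   {"guid-001": "srm://se.cern.ch/f1"},
    "ral.ac.uk": {"guid-001": "srm://se.ral.ac.uk/f1"},
}

def locate_local(guid):
    return [site_catalogs[site][guid] for site in central_index[guid]]

# Both models answer the same question: where are the replicas of this GUID?
assert sorted(locate_global("guid-001")) == sorted(locate_local("guid-001"))
```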

24 Accessing the Data
Grid File Access Layer (GFAL): originally a low-level I/O interface to Grid storage, providing a "posix-like" I/O abstraction (illustrated below). It now also provides a file catalog abstraction, an information system abstraction, and a Storage Element abstraction (EDG SE, EDG 'Classic' SE, SRM v1).
lcg_util: provides a replacement for the EDG Replica Manager, with both direct C library calls and CLI tools. It is a thin wrapper on top of GFAL, with extra experiment-requested features compared to the EDG Replica Manager.
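A minimal sketch of what the "posix-like" abstraction buys the application: code reads a grid file as if it were local, while the layer underneath would resolve a replica and negotiate a LAN protocol. The Python class below is a hypothetical stand-in (GFAL itself is a C library with open/read/close-style calls); here it only reads an in-memory buffer.

```python
import io

class GridFile:
    """Hypothetical posix-like handle; not the real GFAL binding."""

    def __init__(self, name):
        # Real GFAL would: look up a replica for the logical name, pick a
        # close one, obtain a TURL via SRM, then open it with rfio/gsidcap.
        self.name = name
        self._fh = io.BytesIO(b"pretend these bytes came from a Storage Element")

    def read(self, size=-1):
        return self._fh.read(size)

    def close(self):
        self._fh.close()

f = GridFile("lfn:/grid/atlas/run1/file001")
header = f.read(16)   # application code looks like ordinary file I/O
f.close()
```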

25 Managed Transfers
The gLite File Transfer Service (FTS) is a fabric service: it provides point-to-point movement of SURLs. It aims to provide reliable file transfer between sites, and that's it!
It allows sites to control their resource usage; it does not do 'routing', and it does not deal with GUIDs, LFNs, datasets or collections.
It provides: sites with a reliable and manageable way of serving file-movement requests from their VOs; users with an asynchronous, reliable data movement interface (usage sketched below); and VO developers with a pluggable agent framework to monitor and control the data movement for their VO.
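A sketch of the usage pattern, assuming a hypothetical FTSClient class (not the gLite client API): a job is a list of (source SURL, destination SURL) pairs, submitted and then polled asynchronously.

```python
import itertools

class FTSClient:
    """Hypothetical stand-in for an FTS client; names are illustrative only."""
    _ids = itertools.count(1)

    def __init__(self):
        self.jobs = {}

    def submit(self, file_pairs):
        # FTS deals only in (source SURL, destination SURL) pairs:
        # no GUIDs, LFNs, datasets or collections, and no routing.
        job_id = f"job-{next(self._ids)}"
        self.jobs[job_id] = {"files": file_pairs, "state": "Submitted"}
        return job_id

    def status(self, job_id):
        # A real service would progress Submitted -> Active -> Done/Failed.
        return self.jobs[job_id]["state"]

fts = FTSClient()
job = fts.submit([
    ("srm://castorsrm.cern.ch/castor/cms/file001",
     "srm://dcache.fnal.gov/pnfs/cms/file001"),
])
print(fts.status(job))   # a VO agent would poll until Done/Failed and retry
```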

26 Summary
LCG will require a large amount of data movement.
Production use cases demand high-bandwidth distribution of data to many sites in a well-known pattern; analysis use cases will produce chaotic, unknown replica access patterns.
We have a solution for the first problem, and it is our main focus: the Tier-1s are "online" to the experiment.
The second is under way. The accelerator is nearly upon us, and then it's full service until 2020!

27 Thank you

28 Backup Slides

29 Computing Models

30 CMS Computing Model Overview

31 LHCb Computing Model Overview

32 Data Replication

33 Types of Storage in LCG-2
Three "classes" of storage at sites:
Integration of a large (tape) MSS (at Tier-1 etc.): it is the site's responsibility to do the integration.
Large Tier-2s, i.e. sites with large disk pools (100s of terabytes, many fileservers), need a flexible system: dCache provides a good solution, but needs effort to integrate and manage.
Sites with smaller disk pools (1–10 terabytes) and less available management effort need a solution that is lightweight to install and manage: the LCG Disk Pool Manager is a solution for this problem.

34 Catalogs

35 Catalog Model Experiment responsibility to keep metadata catalog and replica catalog (either local or global) in sync LCG tools only deal with global case, since each local case is different LFC is able to be used as either a local or global catalog component Workload Management picks sites with replica by querying a global Data Location Interface (DLI) Can be provided by either Experiment metadata catalog Global grid replica catalog (e.g LFC) CERN Grid Deployment

36 LCG File Catalog
Provides a filesystem-like view of grid files: a hierarchical namespace and namespace operations (a toy illustration follows).
Integrated GSI authentication and authorization; access control lists (Unix permissions and POSIX ACLs); fine-grained (file-level) authorization.
Checksums; a user-exposed transaction API.
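A toy illustration of the filesystem-like view: each logical path carries a GUID, permissions, a checksum and replica pointers. The dictionary model and function names below are invented; the real LFC is accessed through a C library and its language bindings.

```python
# Toy hierarchical namespace, keyed by logical path (not the LFC client API).
namespace = {}

def lfc_like_register(path, guid, mode=0o644, checksum=None):
    """Create a file entry under its parent directory, with permissions and checksum."""
    parent = path.rsplit("/", 1)[0]
    namespace.setdefault(parent, {"type": "dir", "mode": 0o755})
    namespace[path] = {
        "type": "file", "guid": guid, "mode": mode,
        "checksum": checksum, "replicas": [],
    }

def lfc_like_addreplica(path, surl):
    """Attach a physical replica location to an existing namespace entry."""
    namespace[path]["replicas"].append(surl)

lfc_like_register("/grid/atlas/run1/file001", "guid-001", checksum="ad:0a1b2c3d")
lfc_like_addreplica("/grid/atlas/run1/file001", "srm://se.cern.ch/data/guid-001")
```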

