Distributed Tier1 scenarios G. Donvito INFN-BARI.

Open Questions
- Will the T0 site be outside of these 3-4 sites?
  – R: some of the proposed scenarios fit with a T0 within these sites, others do not
  – Could we build a lightweight T0 out of those 3-4 sites with only a reliable disk buffer and a good 40 Gbps network connection?
- Should the T1 also provide a "CAF" (Express Analysis Facility)?
  – R: all the proposed scenarios could address this, but it is important to know the answer from the beginning
- Is data custodiality a duty only for the T0, or for the T1s too?
  – Is it strictly required to implement "custodiality" with tapes at the T1s?
  – Is there room to host a tape library at each site?
- Is it foreseen to have other Tier1s in other countries?
  – Will there be a "full replication" of the data, or only a fraction of it?

Starting points
- We will have 3 (+1 smaller) centers in Southern Italy funded for SuperB
- We are building roughly "from scratch", so we can easily drive the technical solution
- The network among those sites will be at least 10 Gbps
- None of those centers currently has a tape library
- We will surely have other sites involved in SuperB computing
  – distributed worldwide
  – with greater network latency

Possible layouts
1. Split Data
   a. Different datasets at different sites
2. Replicated Data
   a. Automatically replicated (with available software)
   b. Experiment-tool driven (HEP-community-developed software)
3. Split Features
   a. T0, T1, CAF, etc.
4. Federated T1
   a. Split data and split services

Split Data - Description
- LHC model: already in production
- The experiment splits the data and associates a share with each of the sites (see the assignment sketch after this slide)
  – The association could be driven by physics requirements (community interest) or by computing requirements (size of datasets, processing time, etc.)
- All the sites are identical in terms of services
  – They could differ in terms of size
- Each site should run all the steps of the experiment workflow
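As an illustration of the "split data" association, the following is a minimal sketch assuming a purely size-driven policy; the dataset names, sizes and site names are hypothetical and would in practice come from the experiment bookkeeping, possibly constrained by physics interests.

```python
# Minimal sketch (hypothetical names/sizes): assign whole datasets to sites
# so that each site ends up with a comparable share of the total volume.
DATASETS = {          # dataset name -> size in TB (illustrative numbers)
    "hadronic-2012a": 120,
    "hadronic-2012b": 95,
    "leptonic-2012a": 60,
    "mc-generic": 140,
    "mc-signal": 40,
}
SITES = ["T1-Bari", "T1-Napoli", "T1-Catania"]   # the 3 southern SuperB sites

def assign_datasets(datasets, sites):
    """Greedy split: give the next-largest dataset to the least-loaded site."""
    load = {s: 0 for s in sites}
    placement = {}
    for name, size in sorted(datasets.items(), key=lambda kv: -kv[1]):
        target = min(load, key=load.get)
        placement[name] = target
        load[target] += size
    return placement, load

if __name__ == "__main__":
    placement, load = assign_datasets(DATASETS, SITES)
    for ds, site in placement.items():
        print(f"{ds:15s} -> {site}")
    print("per-site volume (TB):", load)
```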

Split Data - Characteristics
- Each site could choose its own hw/sw solution
  – provided the solution is "blessed by the experiment" (i.e. the tools needed to exploit it exist)
- If a site is down, a non-negligible fraction of the data is unavailable
- This model is easy to develop: no need to work out whether and how it works
  – LHC is testing it deeply
- Each site should have (if required by the computing model) the duty of custodiality for its own data
  – This needs a tape system (or something similar) at each site
  – This is costly!
- This will require a good data movement tool to move data from the T0 to the T1s (see the sketch below)
  – I guess we always need this
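As a hedge on what a "data movement tool" could mean in the simplest case, here is a minimal sketch assuming the sites expose xrootd endpoints and that plain xrdcp transfers are acceptable; the endpoints, paths and retry policy are hypothetical.

```python
# Minimal sketch of a T0 -> T1 data movement step; all endpoints and paths
# are hypothetical, and a production system would use a managed transfer
# service rather than bare copies.
import subprocess

T0 = "root://t0.superb.example.org//superb/raw"
T1 = {
    "hadronic-2012a": "root://t1-bari.example.org//superb/raw",
    "leptonic-2012a": "root://t1-napoli.example.org//superb/raw",
}

def transfer(dataset, filename, retries=3):
    """Copy one file of a dataset from the T0 to the T1 hosting that dataset."""
    src = f"{T0}/{dataset}/{filename}"
    dst = f"{T1[dataset]}/{dataset}/{filename}"
    for attempt in range(1, retries + 1):
        # -f overwrites a partial copy left behind by a previous failed attempt
        result = subprocess.run(["xrdcp", "-f", src, dst])
        if result.returncode == 0:
            return True
        print(f"attempt {attempt} failed for {filename}, retrying")
    return False
```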

Split Data – Characteristics (2)
- Network latency (and bandwidth) among the sites is not a problem
- The data at each site should be accessed by jobs sent to the same site
  – Remote access is surely less efficient
- If we need a "CAF" facility at each Tier1 we need to arrange it somehow
  – e.g. using a dedicated facility
- This model fits perfectly with cooperation with other Tier1s in different countries

Split Data (diagram): SuperB experimental data flow from the Tier0 to three Tier1s, each holding 1/3 of the data, with a "CAF" and Tier2s attached to each Tier1.

Split Data – Technological options
- CNAF model: GPFS + TSM (or Lustre + HPSS)
- CERN model: EOS + MSS (staging has to be done somehow "manually")
- dCache infrastructure
- Standard "Scalla" installation

Replicated Data – Description
- All the critical data are replicated at each site
  – Could we assure "custodiality" without using tapes?
- Each site should have enough disk space to store all the critical data of the experiment
  – Less critical data could have fewer than 3 copies
- All the sites are identical in terms of services
  – There could be small differences in terms of size
- Each site could run all the steps of the experiment workflow
  – Jobs can be submitted wherever CPUs are available
- Job scheduling is not data driven, as every site has the data

Replicated Data - Characteristics
- If a site is down (or overloaded), this does not affect the experiment community at all
- We can avoid a costly and difficult-to-manage tape infrastructure
- This scenario works far better if we can provide good network bandwidth among the sites
- It is important to understand how to keep the 3 sites in sync (see the sketch below)
  – There are basically 2 options: an "experiment made" solution or a publicly available solution
- We need to understand how this fits with the TCO of the storage solution (power and disks)
- Each site could be "disk only"
  – In principle there is no need for a separate solution to implement a "CAF"
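A minimal sketch of what "keeping the sites in sync" means at the file level: compare the per-site catalogues and list the replicas each site still has to pull. Site names and catalogue contents are hypothetical.

```python
# Minimal sketch (hypothetical catalogues): find which critical files are
# missing at which site, so the replication layer knows what to transfer.
catalogue = {  # site -> set of logical file names currently on disk
    "T1-Bari":    {"run001.root", "run002.root", "run003.root"},
    "T1-Napoli":  {"run001.root", "run003.root"},
    "T1-Catania": {"run001.root", "run002.root"},
}

def missing_replicas(catalogue):
    """Return {site: files_to_pull} so every file ends up at every site."""
    all_files = set().union(*catalogue.values())
    return {site: all_files - files for site, files in catalogue.items()}

for site, todo in missing_replicas(catalogue).items():
    for lfn in sorted(todo):
        print(f"schedule transfer of {lfn} to {site}")
```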

Replicated Data (exp made solution) - Characteristics
- Each site could choose its own hw/sw implementation
- The replication tool should be:
  – resilient and well tested: a failure in this system could cause data loss
  – lightweight for the storage system: the storage system must not be overloaded by "routine activities"
  – able to automatically deal with disk failures at each site: basically you need some checksum features (see the sketch below)
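A minimal sketch of the checksum feature mentioned above, assuming adler32 checksums (as commonly used in HEP storage) recorded in the experiment catalogue; the file list and the reference values are hypothetical.

```python
# Minimal sketch: verify local replicas against the checksum recorded in the
# experiment catalogue, flagging the ones that must be re-transferred.
import zlib

def adler32_of(path, blocksize=1024 * 1024):
    """Compute the adler32 checksum of a file, reading it in blocks."""
    value = 1
    with open(path, "rb") as f:
        while True:
            block = f.read(blocksize)
            if not block:
                break
            value = zlib.adler32(block, value)
    return value & 0xFFFFFFFF

# logical file name -> (local path, reference checksum from the catalogue)
reference = {
    "run001.root": ("/storage/superb/run001.root", 0x1A2B3C4D),
    "run002.root": ("/storage/superb/run002.root", 0x5E6F7081),
}

corrupted = [lfn for lfn, (path, ref) in reference.items()
             if adler32_of(path) != ref]
print("replicas to re-transfer:", corrupted)
```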

Replicated Data (exp made solution) (diagram): SuperB experimental data flow from the Tier0 to the Tier1s; a "SuperB Replica Service" copies the critical data to every Tier1, with Tier2s attached.

Replicated Data (exp made solution) – Technological options
- CNAF model: GPFS + TSM (or Lustre + HPSS)
- CERN model: EOS + MSS (staging has to be done somehow "manually")
- dCache infrastructure
- Standard "Scalla" installation

Replicated Data (public available solution) - Characteristics
- No need to maintain it ourselves
  – Widely tested and maybe less buggy!
  – If it is an open source solution we only need to adapt it to our specific needs
- The solution means having a single instance of the storage system distributed among the sites
  – A job scheduled at site "A" could read data from "A" or from "B" in a transparent way (see the sketch below)
  – Metadata also need to be replicated, for failover purposes
- These solutions automatically re-replicate missing files
  – A job will never fail because of a disk failure
- It could be realized with a "low-cost" distributed disk-only hw solution
  – DAS or WN disks
- This solution could fit very well with a CDN, where other smaller sites could act as simple volatile disk caches
  – If we have another Tier1, this will easily be "yet another stream" going out from the Tier0
  – This will put far less load on the experiment DMS
- It is required that all the sites choose the same hw/sw solution
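A minimal sketch of the "transparent read from A or B" behaviour as seen from a job, assuming per-site endpoints of a federated namespace; the endpoints and the open helper are hypothetical, and a real deployment would rely on the storage system's own redirection rather than client-side fallback.

```python
# Minimal sketch: try the local replica first, then fall back to the other
# sites holding a copy of the same logical file.
LOCAL_SITE = "T1-Bari"
ENDPOINTS = {  # hypothetical per-site endpoints of the distributed namespace
    "T1-Bari":    "root://t1-bari.example.org//superb",
    "T1-Napoli":  "root://t1-napoli.example.org//superb",
    "T1-Catania": "root://t1-catania.example.org//superb",
}

def open_replica(lfn, open_func):
    """Open a logical file, preferring the local copy over remote replicas."""
    ordered = [LOCAL_SITE] + [s for s in ENDPOINTS if s != LOCAL_SITE]
    for site in ordered:
        url = f"{ENDPOINTS[site]}/{lfn}"
        try:
            return open_func(url)   # e.g. a ROOT/xrootd open call
        except OSError:
            print(f"replica at {site} unavailable, trying next site")
    raise OSError(f"no replica of {lfn} reachable")
```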

Replicated Data (public available solution) (diagram): SuperB experimental data flow from the Tier0 to the Tier1s; the critical data are replicated at every Tier1 by the storage system itself, with Tier2s attached.

Replicated Data (public available solution) – Technological options
- POSIX filesystem: GPFS (3 data & metadata replicas)
- Xrootd based: EOS
- General solution: Hadoop FS (see the sketch below)
- dCache infrastructure??
  – Depends on the availability and reliability of the "ReplicaManager" feature
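As one concrete example of the "publicly available" approach, here is a minimal sketch of how the Hadoop FS option could enforce and check 3 copies of the critical area; the path is hypothetical, and while "hdfs dfs -setrep" and "hdfs fsck" are standard Hadoop commands, the exact flags should be verified against the deployed version.

```python
# Minimal sketch: set the replication factor on the critical area and dump a
# report that would expose under-replicated blocks after a disk/site loss.
import subprocess

CRITICAL_AREA = "/superb/critical"   # hypothetical HDFS path

# Ask HDFS to keep (and wait for) 3 replicas of everything under the area.
subprocess.run(["hdfs", "dfs", "-setrep", "-R", "-w", "3", CRITICAL_AREA],
               check=True)

# fsck reports the replication status of files and blocks under the area.
report = subprocess.run(["hdfs", "fsck", CRITICAL_AREA, "-files", "-blocks"],
                        capture_output=True, text=True, check=True)
print(report.stdout)
```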

Split Features - Description
- Each site is specialized for a specific task
  – T0, T1, CAF, etc.
- Each site could have all the resources needed to do one "task" for the collaboration
- Both the T0 and the T1 should have a tape archive
  – The CAF could be disk only
- An "experiment made" tool takes care of moving (staging) data among the sites
- Each site should run only well-defined steps of the experiment workflow
  – Job submission/matchmaking is "definitely simple"

Split Features - Characteristics
- Each site is free to choose its hw/sw solution
- If a site goes down, at least one step of the chain is blocked for the whole experiment
- Each site has a dedicated infrastructure focused on and built for a specific task
  – This makes it easy to reach a good efficiency
- Network bandwidth and latency between sites are not a big problem

Split Features – Technological options
- T0 / T1:
  – CNAF model: GPFS + TSM (or Lustre + HPSS)
  – CERN model: EOS + CASTOR (staging has to be done somehow "manually")
  – dCache infrastructure
- CAF:
  – Lustre
  – EOS

Split Features (diagram): SuperB experimental data go to the Tier0; the experiment data flow from the Tier0 to the Tier1 and Tier2s, and the CAF data to the CAF.

Federated Tier1 - Description
- Split data plus split services is not very interesting in our case
  – We already described the "split data" and "split services" solutions

List of services
- Tape
- Skimming
- CAF
- MC?
- Chaotic analysis?
- ??

Summary
- Split Data:
  – Standard LHC model: each T1 is responsible for the custodiality of a fraction of the data (a tape library at each site)
  – A site down -> a fraction of the data may not be available
- Replicated Data:
  – No need for tape: we have 3 copies of the data on cheaper disks at each site
  – If a site goes down the others can be used => no service disruption
  – At least two different solutions here, but we should prefer "publicly available solutions"
- Split Features:
  – Each site has specific "tasks" to execute (archive, skimming, CAF, etc.)
  – Only one site needs a tape archive
  – If a site goes down, a "function" of the experiment is stopped
- Split Features + Split Data:
  – We think this is not so interesting for us!
- The user experience should be transparent with respect to the layout implemented:
  – The gateway should take care of distributing jobs, taking into account the computing/storage infrastructure and the job requirements (see the sketch below)
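To make the last point concrete, here is a minimal sketch of a layout-agnostic "gateway" decision, with hypothetical site and catalogue information; a real matchmaker would query the grid information system rather than a hard-coded table.

```python
# Minimal sketch: the job router hides the chosen layout from the user by
# picking a site based on where the data are and how busy the sites are.
SITES = {  # hypothetical snapshot of site state
    "T1-Bari":    {"free_cpus": 120, "datasets": {"hadronic-2012a"}},
    "T1-Napoli":  {"free_cpus": 300, "datasets": {"leptonic-2012a"}},
    "T1-Catania": {"free_cpus": 80,  "datasets": {"mc-generic"}},
}

def route_job(dataset, layout):
    """Pick a site: data-driven for 'split', purely load-driven for 'replicated'."""
    if layout == "split":
        candidates = [s for s, info in SITES.items() if dataset in info["datasets"]]
    else:  # 'replicated': every site holds the data, so any site is a candidate
        candidates = list(SITES)
    if not candidates:
        raise RuntimeError(f"no site hosts {dataset}")
    return max(candidates, key=lambda s: SITES[s]["free_cpus"])

print(route_job("hadronic-2012a", layout="split"))       # -> T1-Bari
print(route_job("hadronic-2012a", layout="replicated"))  # -> T1-Napoli
```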

Timeline proposal
- End 2011 – Jan 2012: find an agreement within the collaboration on the "Open Questions"
- Jan 2012: find an agreement within the collaboration on a prioritized list of the scenarios we are interested in
- Feb 2012: start the technology evaluation
  – At least two different solutions for each scenario
  – At least the first two scenarios
  – This could be done in parallel at different sites
- April 2012: start a distributed testbed with at least 2 sites
  – The "most interesting" scenario and solution
- May 2012: third site joins
- June 2012: testing of the "back-up" solution on the distributed testbed
- July 2012: report to the collaboration
- Milestones along the same timeline: PON approved??, agreement on scenarios, Computing TDR, testing of the technical solutions, report on results