ALICE Computing Upgrade Predrag Buncic


Computing Model in Run 1&2
Computing model built on top of WLCG services.
Any job can run anywhere and access data from any place; scheduling takes care of optimizations and brings "computing to data".
Every file and its replicas are accounted for in the file catalogue.
Worked (surprisingly) well during Run 1 and Run 2.

Run 3 data-taking objectives
For Pb-Pb collisions: reach the target of 13 nb-1 integrated luminosity for rare triggers.
The resulting data throughput from the detector has been estimated to be greater than 1 TB/s for Pb-Pb events, roughly two orders of magnitude more than in Run 1.
(Diagram: readout rate 500 Hz -> 50 kHz; data rates 10 GB/s, 90 GB/s, 3 TB/s.)
A factor 90 in terms of Pb-Pb events (x30 for pp).
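A back-of-the-envelope check of the quoted throughput, as a minimal sketch: the 50 kHz interaction rate is taken from the slide, while the average raw Pb-Pb event size of roughly 20 MB is an assumed, illustrative number, not a figure from this presentation.

```python
# Rough throughput estimate for Run 3 Pb-Pb data taking.
# Assumption (illustrative only): average raw event size ~20 MB.
interaction_rate_hz = 50_000        # 50 kHz continuous readout (from the slide)
raw_event_size_gb = 0.020           # ~20 MB per Pb-Pb event (assumed)

throughput_gb_s = interaction_rate_hz * raw_event_size_gb
print(f"Estimated detector throughput: {throughput_gb_s / 1000:.1f} TB/s")
# -> ~1.0 TB/s, consistent with the ">1 TB/s" quoted above.
```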

http://goo.gl/T4NeUp

Run 3 Computing Model

New in Run 3: O2 facility
463 FPGAs: detector readout and fast cluster finding.
100,000 CPU cores: to compress the 1.1 TB/s data stream by an overall factor of 14.
3,000 GPUs: to speed up the reconstruction (3 CPUs [1] + 1 GPU [2] = 28 CPUs).
60 PB of disk: to buy us extra time and allow more precise calibration.
Considerable (but heterogeneous) computing capacity that will be used for both online and offline tasks; identical software should work in online and offline environments.
[1] Intel Sandy Bridge, 2 GHz, 8 cores, E5-2650
[2] AMD S9000
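A minimal sketch of what these numbers imply, assuming the compressed stream is written to the 60 PB disk buffer at the full compressed rate with no draining to T0/T1; the fill-time figure is an illustrative derivation, not a number from the slide.

```python
# Implications of the O2 facility numbers quoted above (illustrative sketch).
input_rate_tb_s = 1.1      # detector data stream into O2 (from the slide)
compression_factor = 14    # overall compression factor (from the slide)
disk_buffer_pb = 60        # O2 disk buffer (from the slide)

output_rate_gb_s = input_rate_tb_s * 1000 / compression_factor
print(f"Compressed (CTF) rate: ~{output_rate_gb_s:.0f} GB/s")        # ~79 GB/s

# Assumed: buffer fills at the compressed rate without being drained.
fill_time_days = disk_buffer_pb * 1e6 / output_rate_gb_s / 86400
print(f"Time to fill the 60 PB buffer: ~{fill_time_days:.0f} days")  # ~9 days
```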

Changes to data management policies
Only one instance of each raw data file (CTF) is stored on disk, with a backup on tape; in case of data loss, we will restore the lost files from tape.
The O2 disk buffer should be sufficient to accommodate the CTF data from the entire period. As soon as it is available, the CTF data will be archived to the Tier 0 tape buffer or moved to the Tier 1s.
All other intermediate data created at the various processing stages is transient (removed after a given processing step) or temporary (with a limited lifetime).
Only CTF and AOD data are kept on disk and archived to tape.
Given the limited size of the disk buffers in O2 and the Tier 1s, all CTF data collected in the previous year will have to be removed before the new data-taking period starts.
(Diagram: Reconstruct, Archive, Calibrate, Re-Reconstruct.)
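A minimal sketch of the data classes described above, using hypothetical names (DataClass, cleanup) invented here for illustration; this is not ALICE O2 code.

```python
# Hypothetical sketch of the data classes described above (not ALICE code).
from enum import Enum, auto
import time

class DataClass(Enum):
    PERSISTENT = auto()   # CTF, AOD: one disk replica + tape archive
    TRANSIENT = auto()    # removed after the processing step that consumes it
    TEMPORARY = auto()    # removed after a fixed lifetime

LIFETIME_S = {DataClass.TEMPORARY: 7 * 86400}   # assumed one-week lifetime

def cleanup(files, now=None):
    """Return the files that survive a cleanup pass."""
    now = now or time.time()
    kept = []
    for f in files:   # f = dict(name, data_class, created, step_done)
        if f["data_class"] is DataClass.PERSISTENT:
            kept.append(f)
        elif f["data_class"] is DataClass.TRANSIENT and not f["step_done"]:
            kept.append(f)
        elif (f["data_class"] is DataClass.TEMPORARY
              and now - f["created"] < LIFETIME_S[DataClass.TEMPORARY]):
            kept.append(f)
    return kept
```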

Rebalancing the disk/CPU ratio on T2s
(Plot: CPU (cores) and disk (PB) projections.)
We expect Grid resources (CPU, storage) to grow at a rate of 20% per year.
A large fraction of the disk will be used by Run 1 and Run 2 data.
Since the T2s will be used almost exclusively for simulation jobs (no input) and the resulting AODs will be exported to T1s/AFs, we expect significantly lower storage needs on the T2s.
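A minimal sketch of the 20% per-year growth assumption quoted above; the five-year horizon and the baseline of 100 units are illustrative choices, not figures from the slide.

```python
# Compound growth of Grid resources at the assumed 20% per year.
baseline = 100.0          # arbitrary units in the starting year (illustrative)
growth_rate = 0.20        # 20% per year (from the slide)

for year in range(6):     # illustrative five-year horizon
    print(f"year {year}: {baseline * (1 + growth_rate) ** year:.0f} units")
# After 5 years the capacity is ~2.5x the baseline (1.2**5 ≈ 2.49).
```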

Optimizing efficiency
(Plot: wall time / CPU time.)
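CPU efficiency is commonly computed as CPU time divided by wall-clock time (the inverse of the ratio labelled on this slide); a minimal sketch with a hypothetical job-record format:

```python
# CPU efficiency from job accounting records (hypothetical record format).
jobs = [
    {"cpu_time_s": 3400, "wall_time_s": 4000},
    {"cpu_time_s": 2800, "wall_time_s": 4200},
]

cpu_eff = sum(j["cpu_time_s"] for j in jobs) / sum(j["wall_time_s"] for j in jobs)
print(f"Aggregate CPU efficiency: {cpu_eff:.0%}")   # ~76% for these sample jobs
```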

Analysis Facilities
(Diagram: analysis train with input file(s), tasks T1-T6, and an output file.)
Motivation: analysis remains I/O bound in spite of attempts to make it more efficient by using the train approach.
Solution: collect AODs at dedicated sites that are optimized for fast processing of large local datasets, and run organized analysis on the local data as we do today on the Grid.
This requires 20,000-30,000 cores and 5-10 PB of disk on a very performant file system.
Such sites could be selected from among the existing T1s (or even T2s), but ideally this would be a purpose-built facility optimized for such a workflow.
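A minimal sketch of why the train approach helps with I/O: many analysis tasks share a single read pass over each AOD file instead of each task reading the data independently. The task and event-loop interfaces below are hypothetical, not the ALICE analysis framework API.

```python
# Hypothetical analysis-train sketch: one read pass feeds many tasks.
class AnalysisTask:
    def __init__(self, name):
        self.name = name
        self.count = 0
    def process(self, event):
        self.count += 1          # a real task would fill histograms/trees here

def read_events(aod_file):
    """Stand-in for reading an AOD file; yields dummy events."""
    for i in range(1000):
        yield {"file": aod_file, "index": i}

def run_train(aod_files, tasks):
    for aod in aod_files:                # each file is read exactly once...
        for event in read_events(aod):
            for task in tasks:           # ...and served to every wagon (task)
                task.process(event)

tasks = [AnalysisTask(f"T{i}") for i in range(1, 7)]
run_train(["AOD_001.root", "AOD_002.root"], tasks)
print({t.name: t.count for t in tasks})  # each task sees all 2000 events
```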

Roles of Tiers in Run 3
O2 (1): reconstruction and calibration; RAW -> CTF -> ESD -> AOD.
T0/T1: reconstruction, calibration and archiving; CTF -> ESD -> AOD; keeps CTF and AOD.
AF (1..3): analysis; AOD -> HISTO, TREE.
T2/HPC (1..n): simulation; MC -> CTF -> ESD.
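The same tier/workflow mapping expressed as a small data structure, purely for illustration; the tier names and flows are taken from the slide, the structure and helper are hypothetical.

```python
# Tier roles in the Run 3 model, as summarized on the slide (illustrative).
TIER_ROLES = {
    "O2":     {"tasks": ["reconstruction", "calibration"],
               "flow": "RAW -> CTF -> ESD -> AOD"},
    "T0/T1":  {"tasks": ["reconstruction", "calibration", "archiving"],
               "flow": "CTF -> ESD -> AOD"},
    "AF":     {"tasks": ["analysis"],
               "flow": "AOD -> HISTO, TREE"},
    "T2/HPC": {"tasks": ["simulation"],
               "flow": "MC -> CTF -> ESD"},
}

def tiers_for(task):
    """Which tiers run a given task?"""
    return [t for t, r in TIER_ROLES.items() if task in r["tasks"]]

print(tiers_for("calibration"))   # ['O2', 'T0/T1']
```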

Run 2 vs Run 3: data volume and number of events
(Chart: raw data volume x4.2; events x88 in pp, x30 in Pb-Pb.)
While the event statistics will increase by a factor of 30 in Pb-Pb (x88 in pp), the data volume will increase only by a factor of 4.2, thanks to data reduction in the O2 facility: online tracking allows rejection of clusters not associated with tracks, with a large effect in the case of pileup (pp).
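A quick derivation of what these factors imply for the average Pb-Pb event size, as an illustrative sketch; the conclusion (roughly a factor 7 smaller per event) follows directly from the numbers on the slide.

```python
# Average Pb-Pb event size in Run 3 relative to Run 2, from the slide's factors.
event_factor = 30     # x30 more Pb-Pb events (from the slide)
volume_factor = 4.2   # x4.2 more raw data volume (from the slide)

size_ratio = volume_factor / event_factor
print(f"Average event size in Run 3: ~{size_ratio:.2f}x the Run 2 size "
      f"(i.e. ~{1/size_ratio:.0f}x smaller), thanks to online data reduction.")
```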

Flat budget scenario
Overall, we seem to fit under the projected resource growth curves; this is still doable under the flat budget scenario.

Analysis Facilities are new in our Computing Model
So far we have secured one, at GSI.
Assumed growth in disk and CPU: 10%.
We need to look for more AFs, or re-purpose some of the T1s for that role.

Work Packages
Does anyone want to help ALICE build Analysis Facilities? http://goo.gl/QZUE1d