ALICE Grid operations + some specifics for T2s. US-ALICE Grid operations review, 7 March 2014, Latchezar Betev.


The ALICE Grid: 53 sites in Europe; 10 in Asia (8 operational, 2 future); 2 in Africa (1 operational, 1 future); 2 in South America (1 operational, 1 future); 8 in North America (4 operational, 4 future, plus 1 past).

Grid job profile in 2013: average of 36K jobs.

Grid job profile in 2013 – T2s: average of 17K jobs.

Resource delivery distribution: the remarkable 50/50 T1/T2 share is still alive and well.

Resource delivery, T2s: LBL (6%) + LLNL (4%) + OSC (0.5%) = 10.5% of the total T2 delivery.

Job mixture: 69% MC, 8% RAW, 11% LEGO, 12% individual analysis; 447 individual users.

Access to data (disk SEs): 69 SEs, 29 PB in, 240 PB out, ~10/1 read/write ratio.

Access to data (disk SEs, T2s): 62 SEs, 17 PB in, 83 PB out, ~5/1 read/write ratio.

Access to data (disk SEs, US): 2 SEs, 2 PB in (12% of T2s), 13 PB out (16% of T2s), ~7/1 read/write ratio.
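As a quick cross-check of the quoted ratios, the three slides above reduce to a few lines of arithmetic. The sketch below (Python, purely illustrative) treats "in" as data written to the SEs and "out" as data read from them; the computed ratios are broadly consistent with the ~10/1, ~5/1 and ~7/1 figures quoted above.

```python
# Read/write ratios from the quoted SE traffic volumes (PB written vs. PB read).
# Figures are the ones quoted on the three slides above.
volumes_pb = {
    "all disk SEs": {"written": 29, "read": 240},
    "T2 disk SEs":  {"written": 17, "read": 83},
    "US disk SEs":  {"written": 2,  "read": 13},
}

for name, v in volumes_pb.items():
    ratio = v["read"] / v["written"]
    print(f"{name}: {v['read']} PB read / {v['written']} PB written = ~{ratio:.1f}/1")
```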

Data access (2)
99% of the data read is input (ESDs/AODs) to analysis jobs; the remaining 1% is configurations and macros
From LEGO train statistics, ~93% of the data is read locally
– The job is sent to the data
The other 7% is files that cannot be accessed locally (either the server does not return them or the file is missing)
– In all such cases the file is read remotely
– Or the job has waited too long and is allowed to run anywhere to complete the train (last train jobs)
Eliminating some of the remote access (not all of it is avoidable) would increase the global efficiency by a few percent
– This is not a showstopper at all, especially with better networking
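The local-first policy and the remote fallback described above can be summarised in a small sketch. This is not the actual AliEn catalogue or job-broker code; the site names, replica records and helper function below are hypothetical, and only the "prefer a local replica, read remotely otherwise" logic from the slide is represented.

```python
# Hypothetical sketch of the local-first read policy: prefer a replica hosted at the
# site where the job runs, fall back to a remote replica only if no local copy is
# accessible. Site/SE names and URLs are made up for illustration.

def pick_replica(replicas, job_site):
    """Return (replica, mode): a local replica if available, otherwise a remote one."""
    local = [r for r in replicas if r["site"] == job_site and r["accessible"]]
    if local:
        return local[0], "local"
    remote = [r for r in replicas if r["accessible"]]
    if remote:
        return remote[0], "remote"   # the ~7% case described on the slide
    raise RuntimeError("no accessible replica")

replicas = [
    {"site": "LBL",  "url": "root://se.lbl.example//alice/data/AOD/file.root",  "accessible": False},
    {"site": "CERN", "url": "root://se.cern.example//alice/data/AOD/file.root", "accessible": True},
]
chosen, mode = pick_replica(replicas, job_site="LBL")
print(mode, chosen["url"])
```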

Storage availability
A more important question – the availability of storage
ALICE computing model – 2 replicas => if an SE is down, we lose efficiency and may overload the remaining SE
– The CPU resources must access the data remotely, otherwise there will not be enough of them to satisfy the demand
In the future, we may be forced to go to one replica
– This cannot be done for popular data
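The reasoning behind the two-replica choice can be made concrete with a simple availability estimate. The sketch below assumes SE outages are independent, which is a modelling assumption and not something stated on the slide; the availability figures it loops over are the averages quoted on the following slides.

```python
# Simplified availability model, assuming SE outages are independent:
# a file is unavailable only if every SE holding a replica is down.
def data_availability(se_availability, n_replicas):
    return 1.0 - (1.0 - se_availability) ** n_replicas

for p in (0.80, 0.86, 0.97):   # T2, overall and US averages from the next slides
    print(f"SE availability {p:.0%}: "
          f"1 replica -> {data_availability(p, 1):.1%}, "
          f"2 replicas -> {data_availability(p, 2):.1%}")
```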

Storage availability (2): average SE availability in the last year: 86%.

Storage availability – T2s: average SE availability in the last year: 80%.

Storage availability – US: average SE availability in the last year: 97% (!).

Other services
Nothing special to report
– Services are mature and stable
– Operators are well aware of what is to be done and where
– Ample monitoring is available for every service (more on this will be reported throughout the workshop)
– Personal reminders are needed from time to time
– Several service updates were done in 2013…

The efficiency – average of all sites: 75% (unweighted).

Closer look – T0/T1s: average 85% (unweighted).

Closer look – US: average 75% (unweighted).

Summary on efficiency
Stable throughout the year
T2 efficiencies are not much below those of the T0/T1s
– It is possible to equalize them all; the difference lies in storage and networking
Biggest gains come through
– Inter-site network improvements (LHCONE)
– Storage – keep it simple: xrootd works best directly on a Linux FS and on generic storage boxes
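Since the averages above are explicitly labelled as unweighted, a short sketch may help to show how an unweighted site average differs from one weighted by delivered CPU. The per-site efficiencies and CPU shares below are invented placeholders, not measured values; only the two ways of averaging are the point.

```python
# Illustration of unweighted vs. CPU-weighted average efficiency.
# Site names, efficiencies and CPU shares are invented placeholders.
sites = [
    ("T1-A", 0.88, 2000),   # (name, CPU efficiency, delivered CPU in arbitrary units)
    ("T2-B", 0.72,  400),
    ("T2-C", 0.76,  300),
]

unweighted = sum(eff for _, eff, _ in sites) / len(sites)
weighted = sum(eff * cpu for _, eff, cpu in sites) / sum(cpu for _, _, cpu in sites)
print(f"unweighted: {unweighted:.1%}, CPU-weighted: {weighted:.1%}")
```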

What’s in store for 2014
Production and analysis will not stop – we know how to handle these, nothing to worry about
– Some of the RAW data production is left over from 2013
Another ‘flat’ resources year – no increase in requirements
Year 2015
– Start of LHC RUN2 – higher luminosity, higher energy
– Upgraded ALICE detector/DAQ – higher data-taking rate, basically 2x the RUN1 rate

What’s in store for sites
We should finish the largest upgrades before March 2015
– Storage – new xrootd/EOS
– Service updates
– Network – IPv6, LHCONE
– New site installations – Indonesia, US, Mexico, South Africa
– Build and validate new T1s – UNAM, RRC-KI (already on the way)
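Validating a new or upgraded xrootd/EOS storage element typically includes a simple write/read round trip. The sketch below shows one way to script such a probe with the standard xrdcp client; the endpoint, port and namespace path are placeholders, authentication is assumed to be already configured, and the real ALICE SE functional tests are more involved than this.

```python
# Minimal SE write/read check using the xrdcp command-line client.
# The endpoint and path are hypothetical placeholders; valid credentials for the SE
# are assumed to be set up in the environment.
import os
import subprocess
import tempfile

SE_URL = "root://se.example.org:1094//alice/cern.ch/test/se_probe.dat"

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(os.urandom(1024 * 1024))        # 1 MB probe file
    probe = f.name

run(["xrdcp", "--force", probe, SE_URL])            # write test
run(["xrdcp", "--force", SE_URL, probe + ".back"])  # read test
print("SE write/read round trip OK")
```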

Summary
Stable and productive Grid operations in 2013
Resources fully used
Software updates successfully completed
MC productions completed according to requests and planning
– Next year – continue with RAW data reprocessing and associated MC
Analysis – OK
Focus on SE consolidation, resource ramp-up for 2015 (where applicable), networking, new site installation and validation