
Status of LHC Computing from a Data Center Perspective
Frédéric Hemmer, IT Department Head, CERN
23rd August 2010
CERN School of Computing 2010, Uxbridge, U.K.
CERN IT Department, CH-1211 Genève 23, Switzerland (www.cern.ch/it)

[Figure slide: LHC data rates; the only label recoverable from the transcript is "GB/sec (ions)".]

Tier-0 Computing Tasks
[Figure: Tier-0 data-flow diagram with average rates of 1430 MB/s, 700 MB/s, 1120 MB/s, 700 MB/s and 420 MB/s; peak figures of 1600 MB/s and 2000 MB/s shown in parentheses.]
These are averages, and for scheduled work only; the system needs to be able to support 2x these rates for recovery.
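To make the "2x for recovery" note concrete, here is a minimal back-of-the-envelope sketch that sums the quoted average rates and doubles them to estimate the throughput the Tier-0 fabric would have to sustain. Only the five numeric averages come from the slide; the flow labels in the code are placeholder assumptions, since the transcript does not preserve which arrow each rate belongs to.

```python
# Back-of-the-envelope check of the Tier-0 bandwidth budget.
# The five average rates (MB/s) are taken from the slide; the labels
# attached to them here are illustrative assumptions only.
average_flows_mb_s = {
    "flow A (e.g. DAQ to disk buffer)": 1430,
    "flow B": 700,
    "flow C": 1120,
    "flow D": 700,
    "flow E (e.g. export to Tier-1s)": 420,
}

RECOVERY_FACTOR = 2  # "need to be able to support 2x for recovery"

total_average = sum(average_flows_mb_s.values())      # 4370 MB/s
required_capacity = RECOVERY_FACTOR * total_average   # 8740 MB/s

print(f"Sum of quoted average flows: {total_average} MB/s")
print(f"Capacity with 2x headroom  : {required_capacity} MB/s "
      f"(~{required_capacity / 1000:.1f} GB/s)")
```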

The CERN Tier-0 in Numbers
Data Centre Operations (Tier 0):
– 24x7 operator support and System Administration services to support 24x7 operation of all IT services.
– Hardware installation & retirement: ~7,000 hardware movements/year; ~1,000 disk failures/year.
– Management and Automation framework for large-scale Linux clusters.
High Speed Routers (640 Mbps → 2.4 Tbps): 24
Ethernet Switches, Gbps ports: 2,000
Switching Capacity: 4.8 Tbps
Servers: 8,076
Processors: 13,802
Cores: 50,855
HEPSpec06: 359,431
Disks: 53,728
Raw disk capacity (TB): 45,331
Memory modules: 48,794
RAID controllers: 3,518
Tape Drives: 160
Tape Cartridges: 45,000
Tape slots: 56,000
Tape Capacity (TB): 34,000

The WLCG Collaboration
Tier-0 (CERN): data recording, initial data reconstruction, data distribution.
Tier-1 (11 centres): permanent storage, re-processing, analysis.
Tier-2 (~130 centres): simulation, end-user analysis.
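For readers who like to see the division of labour spelled out operationally, here is a small, purely illustrative Python sketch that restates the tier roles above as a data structure; it is not part of any real WLCG software, and the helper function is invented for the example.

```python
# Illustrative encoding of the WLCG tier roles listed on the slide.
WLCG_TIERS = {
    "Tier-0": {"centres": 1,    # CERN
               "roles": ("data recording",
                         "initial data reconstruction",
                         "data distribution")},
    "Tier-1": {"centres": 11,
               "roles": ("permanent storage", "re-processing", "analysis")},
    "Tier-2": {"centres": 130,  # approximate ("~130 centres")
               "roles": ("simulation", "end-user analysis")},
}

def tiers_with_role(role):
    """Return the tiers whose role list contains the given activity."""
    return [tier for tier, info in WLCG_TIERS.items()
            if role in info["roles"]]

print(tiers_with_role("re-processing"))   # ['Tier-1']
print(tiers_with_role("simulation"))      # ['Tier-2']
```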

WLCG Status Today
WLCG is running increasingly high workloads:
– ~1 million jobs/day: real data processing and re-processing, physics analysis, simulations.
– ~100k CPU-days/day.
Unprecedented data rates:
– Castor traffic over the last 2 months: > 4 GB/s input, > 13 GB/s served.
– Data export during data taking: according to expectations on average.
– Traffic on the OPN up to 70 Gb/s, driven by ATLAS reprocessing campaigns.
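A quick sanity check on how those two headline numbers relate: the sketch below, using only the figures quoted on the slide, converts them into an average number of continuously busy cores and a mean CPU time per job. This is back-of-the-envelope arithmetic, not WLCG accounting.

```python
# Back-of-the-envelope arithmetic on the slide's two headline figures.
jobs_per_day = 1_000_000      # "~1 million jobs/day"
cpu_days_per_day = 100_000    # "~100k CPU-days/day"

# Delivering 100k CPU-days every day corresponds to roughly 100k cores
# being busy around the clock.
avg_busy_cores = cpu_days_per_day

# Mean CPU time per job, if the two figures describe the same workload
# (an assumption made for this sketch).
mean_hours_per_job = cpu_days_per_day * 24 / jobs_per_day

print(f"Average continuously busy cores: ~{avg_busy_cores:,}")
print(f"Mean CPU time per job          : ~{mean_hours_per_job:.1f} hours")
```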

Readiness of the Computing
Data reaches Tier-2s within hours.
Increasing numbers of analysis users, e.g. ~500 grid users in each of ATLAS and CMS, ~200 in ALICE.
ALICE: >200 users, ~500 jobs on average over 3 months.
[Figures: CMS raw data to Tier-1s; ATLAS data distribution; CMS users.]

So, everything fine? Not quite...

Worldwide data distribution and analysis (from F. Gianotti, ICHEP 2010)
GRID-based analysis in June-July 2010: > 1000 different users, ~11 million analysis jobs processed.
Total throughput of ATLAS data through the Grid, from 1st January until yesterday: peaks of 10 GB/s achieved, against ~2 GB/s by design.
[Figure: throughput in MB/s per day, January to July 2010; annotations include MC reprocessing, 2009 data reprocessing, data and MC reprocessing, 6 GB/s, the start of 7 TeV data-taking and the start of p/bunch operation.]

ALICE data loss
A “transparent” configuration change in the Castor software resulted in data being directed across all available tape pools instead of to the dedicated raw-data pools.
– For ALICE, ATLAS and CMS this included a pool where the tapes were recycled after a certain time.
The result was that a number of files were lost on tapes that were recycled.
– For ATLAS and CMS the tapes had not yet been overwritten and could be fully recovered (the fallback would have been to re-copy the files back from the Tier-1s).
– For ALICE, 10k files were on tapes that were recycled, including 1973 files of 900 GeV raw data.
Actions taken:
– Underlying problem addressed; all recycle pools removed. Software change procedures are being reviewed now.
– Tapes sent to IBM and SUN for recovery; ~97% of the critical (900 GeV sample) files and ~50% of all ALICE files have been recovered. We have been very lucky: 1717 tapes were recovered by IBM and STK/Sun/Oracle.
– Work with ALICE to ensure that 2 copies of the data are always available. In heavy-ion (HI) running there is a risk for several weeks until all data is copied to the Tier-1s; several options to mitigate this risk are under discussion.
– As this was essentially a procedural problem, we will organise a review of Castor operations procedures (software development, deployment, operation, etc.) together with the experiments and outside experts.

A Word on Computer Security (August 2010)
It is “your” responsibility to write secure code and to fix any flaws promptly:
– not only to ensure the proper operation of your application,
– but also because failure to do so might put your organization, and others, in a very embarrassing position.
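The slide itself contains no code, so as a purely illustrative example of the kind of flaw it warns about, the sketch below contrasts an SQL query built by string concatenation with a parameterized one, using Python's standard sqlite3 module; the table, column and function names are invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'alice@example.org')")

def find_user_unsafe(name):
    # Vulnerable: the input is pasted into the SQL text, so a value like
    # "x' OR '1'='1" changes the meaning of the query (SQL injection).
    return conn.execute(
        f"SELECT email FROM users WHERE name = '{name}'").fetchall()

def find_user_safe(name):
    # Parameterized query: the driver passes the value separately from
    # the SQL text, so it can never be interpreted as SQL.
    return conn.execute(
        "SELECT email FROM users WHERE name = ?", (name,)).fetchall()

print(find_user_unsafe("x' OR '1'='1"))  # returns every row: the flaw
print(find_user_safe("x' OR '1'='1"))    # returns nothing: input stays data
```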

Summary
The readiness of LHC Computing has now clearly been demonstrated.
– There are, however, large load fluctuations.
– The experiments' Computing Models will certainly evolve, and the WLCG Tiers will have to adapt to those changes.
– Change Management is critical for ensuring stable data taking.
Some challenges ahead:
– A large number of objects with a permanent flux of change; automation is critical.
– (Distributed) software complexity.
– Computer security remains a permanent concern.