
1 The LHC Computing Grid – February 2008. The Worldwide LHC Computing Grid. Emmanuel Tsesmelis, 2nd CERN School Thailand 2012, Suranaree University of Technology, 1 May 2012

2 Breaking the Wall of Communication: 23 years ago the Web was born @ CERN... and today?

3 Enter a New Era in Fundamental Science. The start-up of the Large Hadron Collider (LHC), one of the largest and truly global scientific projects ever undertaken, is the most exciting turning point in particle physics: the exploration of a new energy frontier. LHC ring: 27 km circumference, hosting the ALICE, ATLAS, CMS and LHCb experiments.

4

5 The LHC Computing Challenge
– Signal/Noise: 10^-13 (10^-9 offline)
– Data volume: high rate × large number of channels × 4 experiments → ~15 PetaBytes of new data each year
– Compute power: event complexity × number of events × thousands of users → ~200k CPUs and ~45 PB of disk storage
– Worldwide analysis & funding: computing funded locally in major regions & countries; efficient analysis everywhere → GRID technology

6 The LHC Data
– 40 million events (pictures) per second
– Select (on the fly) the ~200 interesting events per second to write to tape
– "Reconstruct" the data and convert them for analysis: "physics data" [→ the grid...]
– Data volumes per experiment (×4 experiments, ×15 years):
                         Per event   Per year
    Raw data             1.6 MB      3200 TB
    Reconstructed data   1.0 MB      2000 TB
    Physics data         0.1 MB       200 TB
– Illustration: a stack of CDs holding one year of LHC data would be ~20 km tall (cf. Mt. Blanc at 4.8 km, Concorde at 15 km, a balloon at 30 km)
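These per-year volumes follow from the per-event sizes and the ~200 Hz recording rate. A minimal back-of-the-envelope sketch in Python, assuming roughly 1e7 seconds of effective LHC running per year (the running time is a common rule of thumb, not a figure given on the slide):

```python
# Back-of-the-envelope check of the per-year volumes quoted on the slide.
# Assumption: roughly 1e7 seconds of effective LHC running per year
# (a common rule of thumb, not a figure given on the slide).
events_per_second = 200            # events kept after the online selection
seconds_per_year = 1e7             # assumed effective running time
sizes_mb = {"raw": 1.6, "reconstructed": 1.0, "physics": 0.1}   # per-event sizes (slide)

events_per_year = events_per_second * seconds_per_year          # ~2e9 events
for name, size_mb in sizes_mb.items():
    tb_per_year = events_per_year * size_mb / 1e6               # MB -> TB
    print(f"{name}: ~{tb_per_year:.0f} TB/year per experiment")
# raw: ~3200, reconstructed: ~2000, physics: ~200 TB/year per experiment.
# Multiplied by 4 experiments this lands in the 10-20 PB/year range, consistent
# with the ~15 PB of new data per year quoted on the previous slide.
```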

7 150 million sensors deliver data... 40 million times per second.

8 A collision at the LHC.

9 The Data Acquisition.

10 Tier 0 at CERN: acquisition, first-pass reconstruction, storage & distribution (up to 1.25 GB/sec for ions).

11 WLCG – what and why?
– A distributed computing infrastructure providing the production and analysis environments for the LHC experiments
– Managed and operated by a worldwide collaboration between the experiments and the participating computer centres
– The resources are distributed – for funding and sociological reasons
– Our task was to make use of the resources available to us – no matter where they are located
– Tier-0 (CERN): data recording, initial data reconstruction, data distribution
– Tier-1 (11 centres): permanent storage, re-processing, analysis
– Tier-2 (~130 centres): simulation, end-user analysis
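The tier split above can be summarised in a small lookup structure. A purely illustrative sketch (this is not WLCG software; the names and layout are chosen only for this example and simply restate the roles listed on the slide):

```python
# Illustrative only: the WLCG tier roles from the slide as a plain data structure.
WLCG_TIERS = {
    "Tier-0": {"centres": 1,   "roles": ["data recording",
                                         "initial data reconstruction",
                                         "data distribution"]},
    "Tier-1": {"centres": 11,  "roles": ["permanent storage", "re-processing", "analysis"]},
    "Tier-2": {"centres": 130, "roles": ["simulation", "end-user analysis"]},  # ~130 centres
}

def roles_of(tier: str) -> list[str]:
    """Return the activities a given tier is responsible for (per the slide)."""
    return WLCG_TIERS[tier]["roles"]

print(roles_of("Tier-2"))  # ['simulation', 'end-user analysis']
```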

12 e-Infrastructure

13 WLCG Grid Sites (Tier 0, Tier 1, Tier 2). Today: >130 sites, >250k CPU cores, >150 PB of disk.

14 WLCG Collaboration Status: Tier 0; 11 Tier 1s; 68 Tier 2 federations.
– Today we have 49 MoU signatories, representing 34 countries: Australia, Austria, Belgium, Brazil, Canada, China, Czech Rep., Denmark, Estonia, Finland, France, Germany, Hungary, Italy, India, Israel, Japan, Rep. Korea, Netherlands, Norway, Pakistan, Poland, Portugal, Romania, Russia, Slovenia, Spain, Sweden, Switzerland, Taipei, Turkey, UK, Ukraine, USA.
– Sites labelled on the map: CERN, Lyon/CCIN2P3, Barcelona/PIC, DE-FZK, US-FNAL, CA-TRIUMF, NDGF, US-BNL, UK-RAL, Taipei/ASGC, Amsterdam/NIKHEF-SARA, Bologna/CNAF.

15 CPU – around the Tiers. The grid really works: all sites, large and small, can contribute – and their contributions are needed! Significant use of Tier 2s for analysis.

16 LHC Networking. Relies on the OPN, GEANT, US-LHCNet, NRENs & other national & international providers.

17 WLCG 2010-11
– CPU corresponds to >>150k cores permanently running; peak job loads of around 200k concurrent jobs (~1M jobs/day)
– In 2010 WLCG delivered ~100 CPU-millennia!
– Data traffic at Tier 0 and to the grid larger than the 2010 values: up to 4 GB/s from the DAQs to tape, ~2 PB of data to tape per month
– [Plots: data to tape/month and CPU delivered (HS06-hours/month) per experiment – ALICE, AMS, ATLAS, CMS, COMPASS, LHCb – annotated with: CMS HI data zero suppression → FNAL; 2011 data → Tier 1s; re-processing of 2010 data; ALICE HI data → Tier 1s; 2010 pp data → Tier 1s & re-processing]
– This enables the rapid delivery of physics results
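A rough consistency check between the two headline numbers above (a hedged sketch; the utilisation factor is an assumption, not a slide figure):

```python
# Rough consistency check: ">>150k cores permanently running" vs
# "~100 CPU-millennia delivered in 2010".
cores = 150_000       # cores permanently running (figure from the slide)
utilisation = 0.7     # assumed average utilisation over the year; not a slide figure

core_years_per_year = cores * utilisation      # work delivered in one calendar year
print(f"~{core_years_per_year / 1000:.0f} CPU-millennia per year")   # ~105
# Same order as the ~100 CPU-millennia WLCG delivered in 2010.
```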

18 WLCG: Data in 2011
– Castor service at Tier 0 well adapted to the load:
  – Heavy ions: more than 6 GB/s to tape (tests show that Castor can easily support >12 GB/s); the actual limit now is the network from the experiment to the CC
  – Major improvements in tape efficiencies – tape writing at ~native drive speeds, so fewer drives are needed
  – ALICE had x3 compression for raw data in the HI runs
– [Plot: HI run – ALICE data into Castor >4 GB/s; overall rates to tape >6 GB/s]
– 23 PB of data written in 2011
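To see why writing at near-native drive speed matters, a hedged back-of-the-envelope calculation (the per-drive speed is an assumption about typical enterprise tape drives of the period, not a figure from the slide):

```python
# Why writing at native drive speed matters: drives needed to sustain 6 GB/s to tape.
# Assumption: ~160 MB/s native speed per drive (typical enterprise tape drive of the
# period; the slide does not give a per-drive figure).
target_rate_mb_s = 6_000       # 6 GB/s sustained, as achieved in the heavy-ion run
native_drive_mb_s = 160

drives_at_native_speed = target_rate_mb_s / native_drive_mb_s
print(f"~{drives_at_native_speed:.0f} drives needed")   # ~38 drives
# If inefficient writing only achieved, say, half the native speed, roughly twice as
# many drives would be needed for the same aggregate rate.
```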

19 Data Transfers for 2012… 2012 data already back to "normal" levels for accelerator running [transfer-rate plot covering the period since October].

20 Overall Use of WLCG
– ~10^9 HEPSPEC-hours/month (~150k CPUs in continuous use); 1.5M jobs/day
– Usage continues to grow even over the end-of-year technical stop, in both jobs/day and CPU usage
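The HEPSPEC-hours figure and the core count are two views of the same number. A minimal conversion sketch, assuming ~10 HS06 per core (a typical per-core benchmark score of the period, not stated on the slide):

```python
# Converting ~1e9 HEPSPEC06-hours/month into an equivalent core count.
# Assumption: ~10 HS06 per core (typical of the period; not stated on the slide).
hs06_hours_per_month = 1e9
hs06_per_core = 10
hours_per_month = 24 * 30

cores_in_continuous_use = hs06_hours_per_month / (hs06_per_core * hours_per_month)
print(f"~{cores_in_continuous_use:,.0f} cores in continuous use")   # ~138,889
# Consistent with the "~150k CPU continuous use" figure on the slide.
```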

21 WLCG – no stop for computing: activity on 3rd Jan.

22 Main Features of Recent Use
– Continued growth in overall usage levels
– High levels of analysis use, particularly in preparation for the winter conferences; resources fully occupied
– Full reprocessing runs of the total 2011 data samples – achieved by the end of the year
– HI: complete processing of the 2011 samples
– Large simulation campaigns for 2011 data and in preparation for the 8 TeV run
– Disk clean-up campaigns in preparation for 2012 data

23 CERN & Tier 1 Accounting.

24 Use of T0 + T1 Resources
– Comparison between use per experiment and pledges, for Tier 0 alone and for the sum of Tier 1s
– Early in the year, pledges start to be installed and can be used
– Tier 1 use close to full; can make use of capacity share not used, esp. ALICE & LHCb

25 Tier 2 Accounting.

26 Tier 2 Usage: Tier 2 CPU delivered over the last 14 months, by country; comparison of use & pledges.

27 Reliabilities
– No real issues now; the plots show the "ops" reports
– Experiment-specific measured reliabilities are also published monthly: since February a new report allows the use of "arbitrary" experiment tests

28 Service Incidents – Time to Resolution
– Fewer incidents in general, but longer lasting (or more difficult to resolve)
– In Q4 2011, all except one took >24 hours to resolve

29 Impact of the LHC Computing Grid
– WLCG has been leveraged on both sides of the Atlantic to benefit the wider scientific community:
  – Europe: Enabling Grids for E-sciencE (EGEE) 2004-2010; European Grid Infrastructure (EGI) 2010–
  – USA: Open Science Grid (OSG) 2006-2012 (+ extension?)
– Many scientific applications: archeology, astronomy, astrophysics, civil protection, computational chemistry, earth sciences, finance, fusion, geophysics, high energy physics, life sciences, multimedia, material sciences, ...

30 Spectrum of grids, clouds, supercomputers, etc.
– Grids: collaborative environment; distributed resources (political/sociological); commodity hardware (also supercomputers); (HEP) data management; complex interfaces (bug, not feature)
– Supercomputers: expensive; low-latency interconnects; applications peer reviewed; parallel/coupled applications; traditional interfaces (login); also SC grids (DEISA, TeraGrid)
– Clouds: proprietary (implementation); economies of scale in management; commodity hardware; virtualisation for service provision and for encapsulating the application environment; details of physical resources hidden; simple interfaces (too simple?)
– Volunteer computing: simple mechanism to access millions of CPUs; difficult if (much) data is involved; control of environment → check; community building – people involved in science; potential for huge amounts of real work
– Many different problems, amenable to different solutions – there is no single right answer
– Consider ALL of these as a combined e-Infrastructure ecosystem: aim for interoperability, combine the resources into a consistent whole, and keep applications agile so they can operate in many environments

31 Grid vs. Cloud?
– Grid: a distributed computing service – integrates distributed resources; global single sign-on (use the same credential everywhere); enables (virtual) collaboration
– Cloud: a large (remote) data centre – economy of scale (centralise resources in large centres); virtualisation enables dynamic provisioning of resources
– The technologies are not exclusive: in the future our collaborative grid sites will use cloud technologies (virtualisation etc.), and we will also use cloud resources to supplement our own

32 The CERN Data Centre in Numbers
– Data Centre Operations (Tier 0):
  – 24x7 operator support and system administration services to support 24x7 operation of all IT services
  – Hardware installation & retirement: ~7,000 hardware movements/year; ~1,800 disk failures/year
  – Management and automation framework for large-scale Linux clusters
– Key figures:
    High-speed routers (640 Mbps → 2.4 Tbps)   24
    Ethernet switches                          350
    10 Gbps ports                              2,000
    Switching capacity                         4.8 Tbps
    1 Gbps ports                               16,939
    10 Gbps ports                              558
    Racks                                      828
    Servers                                    11,728
    Processors                                 15,694
    Cores                                      64,238
    HEPSpec06                                  482,507
    Disks                                      64,109
    Raw disk capacity (TiB)                    63,289
    Memory modules                             56,014
    Memory capacity (TiB)                      158
    RAID controllers                           3,749
    Tape drives                                160
    Tape cartridges                            45,000
    Tape slots                                 56,000
    Tape capacity (TiB)                        34,000
    IT power consumption                       2,456 kW
    Total power consumption                    3,890 kW
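A few derived figures follow directly from the table (simple arithmetic on the numbers above; treating cores, HEPSpec06, disks and raw disk capacity as site-wide totals is an assumption about how the table is meant to be read):

```python
# A few ratios derived from the table above (treating the figures as site-wide totals).
cores = 64_238
hepspec06_total = 482_507
disks = 64_109
raw_disk_tib = 63_289
servers = 11_728

print(f"{hepspec06_total / cores:.1f} HS06 per core")          # ~7.5 HS06/core
print(f"{raw_disk_tib / disks:.2f} TiB per disk on average")   # ~0.99 TiB/disk
print(f"{cores / servers:.1f} cores per server on average")    # ~5.5 cores/server
```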

33 Tier 0 Evolution
– Consolidation of the existing centre at CERN:
  – Project ongoing to add additional critical power in the "barn" & consolidate UPS capacity
  – Scheduled to complete in October 2012
– Remote Tier 0:
  – Tendering completed; adjudication done at the March Finance Committee – Wigner Institute, Budapest, Hungary selected
  – Anticipate testing and first equipment installed in 2013; production in 2014, in time for the end of LS1
  – Will be a true extension of Tier 0: LHCOne, IP connectivity direct from Budapest (not in the first years); capacity to ramp up
  – Use model as dynamic as feasible (avoid pre-allocation of experiments or types of work)

34 Collaboration – Education
– CERN openlab (Intel, Oracle, Siemens, HP Networking): http://cern.ch/openlab
– CERN School of Computing: http://cern.ch/csc
– UNOSAT: http://cern.ch/unosat
– EC Projects: EMI, EGI-Inspire, PARTNER, EULICE, OpenAire
– Citizen Cyber Science Collaboration – involving the general public

