Grid Computing in High Energy Physics: Challenges and Opportunities
Dr. Ian Bird, LHC Computing Grid Project Leader
Göttingen Tier 2 Inauguration, 13th May 2008
The scales Ian.Bird@cern.ch
High Energy Physics machines and detectors
pp @ √s = 14 TeV; L: 10^34 cm^-2 s^-1 (L: 2·10^32 cm^-2 s^-1)
Muon chambers, calorimeter
2.5 million collisions per second; LVL1: 10 kHz, LVL3: 50–100 Hz; 25 MB/s digitized recording
40 million collisions per second; LVL1: 1 kHz, LVL3: 100 Hz; 0.1 to 1 GB/s digitized recording
LHC: 4 experiments … ready! First physics expected in autumn 2008. Is the computing ready?
The LHC Computing Challenge
Signal/noise: 10^-9
Data volume: high rate × large number of channels × 4 experiments → 15 PetaBytes of new data each year
Compute power: event complexity × number of events × thousands of users → 100k of (today's) fastest CPUs
Worldwide analysis & funding: computing funded locally in major regions & countries; efficient analysis everywhere → GRID technology
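The 15 PB/year figure can be sanity-checked with back-of-envelope arithmetic. The event rate and event size below are illustrative assumptions (typical orders of magnitude, not numbers from the slide):

```python
# Order-of-magnitude check of the "15 PB of new data each year" figure.
# Event rate, event size and running time are illustrative assumptions.
seconds_per_year = 1e7    # assumed effective LHC running time per year
event_rate_hz = 200       # assumed recorded events/s per experiment
event_size_mb = 1.5       # assumed raw event size, MB
experiments = 4

raw_pb = experiments * event_rate_hz * event_size_mb * seconds_per_year / 1e9
# Raw data alone is ~12 PB/yr under these assumptions; with derived and
# simulated data the total comfortably reaches the quoted 15 PB/yr.
print(round(raw_pb, 1))  # → 12.0
```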
A collision at LHC
Luminosity: 10^34 cm^-2 s^-1
40 MHz – a bunch crossing every 25 ns
~20 overlapping events per crossing
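The ~20 overlapping events per crossing follow directly from the luminosity and the bunch-crossing rate; a minimal check, assuming an inelastic pp cross-section of ~70 mb (a value not stated on the slide, used here only for illustration):

```python
# Pile-up estimate: mean interactions per bunch crossing = L * sigma / f.
# sigma_inel ~ 70 mb is an assumed approximate pp inelastic cross-section.
L = 1e34                       # luminosity, cm^-2 s^-1
sigma_cm2 = 70.0 * 1e-27       # 70 mb in cm^2 (1 mb = 1e-27 cm^2)
f_crossing = 40e6              # bunch-crossing rate, Hz (every 25 ns)

interactions_per_second = L * sigma_cm2        # ~7e8 interactions/s
pileup = interactions_per_second / f_crossing  # ~17.5, i.e. ~20 events
print(round(pileup, 1))
```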
The Data Acquisition
Tier 0 at CERN: Acquisition, First-pass reconstruction, Storage & Distribution
Operation is continuous over several months, tightly linking collisions, experiments and computing: if the reliability of the computing is not good enough, it can impact the DAQ and ultimately the quality of the science.
1.25 GB/sec (ions)
Tier 0 – Tier 1 – Tier 2
Tier-0 (CERN): data recording; first-pass reconstruction; data distribution
Tier-1 (11 centres): permanent storage; re-processing; analysis
Tier-2 (>200 centres): simulation; end-user analysis
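The tier roles above can be sketched as a simple data structure. The dictionary layout and helper function are illustrative only; the roles are taken from the slide:

```python
# Toy sketch of the WLCG tier model (roles as listed on the slide).
WLCG_TIERS = {
    "Tier-0": {"sites": 1,
               "roles": ["data recording", "first-pass reconstruction",
                         "data distribution"]},
    "Tier-1": {"sites": 11,
               "roles": ["permanent storage", "re-processing", "analysis"]},
    "Tier-2": {"sites": 200,   # ">200 centres" on the slide
               "roles": ["simulation", "end-user analysis"]},
}

def tiers_for(role):
    """Return which tiers carry a given responsibility."""
    return [t for t, info in WLCG_TIERS.items() if role in info["roles"]]

print(tiers_for("simulation"))  # → ['Tier-2']
```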
Evolution of requirements – ATLAS (or CMS), for the first year at design luminosity:
ATLAS & CMS CTP: 10^7 MIPS, 100 TB disk
“Hoffmann” Review: 7×10^7 MIPS, 1,900 TB disk
Computing TDRs: 55×10^7 MIPS (140 MSi2K), 70,000 TB disk
[Timeline 1994–2008: LHC approved; ATLAS & CMS approved; ALICE approved; LHCb approved; LHC start]
Evolution of CPU Capacity at CERN
Tape & disk requirements: >10 times what CERN alone could provide
[Chart: CPU capacity across accelerator eras – SC (0.6 GeV), PS (28 GeV), ISR (300 GeV), SPS (400 GeV), ppbar (540 GeV), LEP (100 GeV), LEP II (200 GeV), LHC (14 TeV); costs in 2007 Swiss Francs, including infrastructure (computing centre, power, cooling, …) and physics tapes]
Evolution of Grids (1994–2008)
GriPhyN, iVDGL, PPDG → Grid3 → OSG
EU DataGrid → EGEE-1 → EGEE-2 → EGEE-3
LCG-1 → LCG-2 → WLCG
Data Challenges; Service Challenges; Cosmics; First physics
The Worldwide LHC Computing Grid
Purpose: develop, build and maintain a distributed computing environment for the storage and analysis of data from the four LHC experiments; ensure the computing service … and common application libraries and tools
Phase I (2002–05): development & planning
Phase II (2006–2008): deployment & commissioning of the initial services
WLCG Collaboration – MoU Signing Status
The Collaboration: 4 LHC experiments; ~250 computing centres; 12 large centres (Tier-0, Tier-1); 56 federations of smaller “Tier-2” centres; growing to ~40 countries
Grids: EGEE, OSG, NorduGrid
Technical Design Reports (WLCG, 4 experiments): June 2005
Memorandum of Understanding: agreed in October 2005; resources: 5-year forward look
Tier 1: all have now signed
Tier 2 signatories: Australia, Belgium, Canada*, China, Czech Rep.*, Denmark, Estonia, Finland, France, Germany(*), Hungary*, Italy, India, Israel, Japan, JINR, Korea, Netherlands, Norway*, Pakistan, Poland, Portugal, Romania, Russia, Slovenia, Spain, Sweden*, Switzerland, Taipei, Turkey*, UK, Ukraine, USA
Still to sign: Austria, Brazil (under discussion)
* Recent additions
WLCG Service Hierarchy
Tier-0 (the accelerator centre): data acquisition & initial processing; long-term data curation; distribution of data
Tier-1 centres: Canada – TRIUMF (Vancouver); France – IN2P3 (Lyon); Germany – Forschungszentrum Karlsruhe; Italy – CNAF (Bologna); Netherlands – NIKHEF/SARA (Amsterdam); Nordic countries – distributed Tier-1; Spain – PIC (Barcelona); Taiwan – Academia Sinica (Taipei); UK – CLRC (Oxford); US – FermiLab (Illinois), Brookhaven (NY)
Tier-1: “online” to the data acquisition process – high availability; managed mass storage – grid-enabled data service; data-heavy analysis; national, regional support
Tier-2: ~130 centres in ~35 countries; end-user (physicist, research group) analysis – where the discoveries are made; simulation
Recent grid use
Across all grid infrastructures: EGEE, OSG, NorduGrid
The grid concept really works – all contributions, large & small, are essential!
CERN: 11%; Tier 1: 35%; Tier 2: 54%
Recent grid activity
WLCG ran ~44 M jobs in 2007, and the workload has continued to increase: 29 M jobs already in 2008, now running at >300k jobs/day (up from 230k/day)
The distribution of work across Tier-0/Tier-1/Tier-2 illustrates the importance of the grid system: the Tier-2 contribution is around 50%, and >85% of the work is external to CERN
These workloads (reported across all WLCG centres) are at the level anticipated for 2008 data taking
LHCOPN Architecture
Data Transfer out of Tier-0
Target for 2008/2009: 1.3 GB/s
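The 1.3 GB/s target is of the same order as the rate implied by the 15 PB/year figure quoted earlier. A minimal sketch, assuming an effective data-taking time of ~10^7 s per year (an assumption, not a slide figure):

```python
# Relating the annual data volume to a sustained Tier-0 export rate.
# The running time of 1e7 s is an assumed effective data-taking period.
new_data_pb_per_year = 15      # from the "15 PB of new data each year" slide
running_seconds = 1e7          # assumed effective data-taking time per year

avg_rate_gbs = new_data_pb_per_year * 1e6 / running_seconds  # PB -> GB
print(round(avg_rate_gbs, 2))  # → 1.5
```

Averaged over running periods alone, the raw rate under these assumptions is already ~1.5 GB/s, so the 1.3 GB/s export target must be sustained essentially continuously while the machine runs.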
Production Grids
WLCG relies on a production-quality infrastructure. This requires standards of availability/reliability, performance and manageability, and it will be used 365 days a year … (it has been, for several years!)
Tier 1s must store the data for at least the lifetime of the LHC – ~20 years. This is not passive storage: it requires active migration to newer media.
It is vital that we build a fault-tolerant, reliable system that can deal with individual sites being down, and recover.
The EGEE Production Infrastructure
Support structures & processes: Operations Coordination Centre; Regional Operations Centres; Global Grid User Support; EGEE Network Operations Centre (SA2); Operational Security Coordination Team
Test-beds & services: production service; pre-production service; certification test-beds (SA3); training infrastructure (NA4); training activities (NA3)
Security & policy groups: Operations Advisory Group (+NA4); Joint Security Policy Group; EuGridPMA (& IGTF); Grid Security Vulnerability Group
Site Reliability

              Sep 07   Oct 07   Nov 07   Dec 07   Jan 08   Feb 08
All sites      89%      86%      92%      87%      84%
8 best         93%      95%      96%
Above target
(+ >90%)      7 + 2    5 + 4    9 + 2    6 + 4    7 + 3
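The reported percentages are, in essence, the fraction of successful availability probes over a month. A minimal sketch of that computation (the real WLCG monitoring also accounts for scheduled downtime, which is omitted here):

```python
# Site reliability as the fraction of successful monitoring probes.
# Simplified: scheduled downtime handling is omitted.
def reliability(results):
    """results: list of booleans, one per monitoring probe in the period."""
    return sum(results) / len(results) if results else 0.0

# e.g. 89 successful probes out of 100 reproduces the 89% "All sites" figure
probes = [True] * 89 + [False] * 11
print(f"{reliability(probes):.0%}")  # → 89%
```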
Improving Reliability Monitoring Metrics Workshops Data challenges Experience Systematic problem analysis Priority from software developers
Gridmap
Middleware: Baseline Services
The basic baseline services – from the TDR (2005):
Storage Element: Castor, dCache, DPM (StoRM added in 2007); SRM 2.2 deployed in production, Dec 2007
Basic transfer tools: GridFTP, …
File Transfer Service (FTS)
LCG File Catalog (LFC)
LCG data management tools: lcg-utils
POSIX I/O: Grid File Access Library (GFAL)
Synchronised databases T0→T1s: 3D project
Information System: scalability improvements
Compute Elements: Globus/Condor-C; improvements to LCG-CE for scale/reliability; web services (CREAM); support for multi-user pilot jobs (glexec, SCAS)
gLite Workload Management in production
VO Management System (VOMS); VO Boxes; application software installation; job monitoring tools
Focus now is on continuing evolution of reliability, performance, functionality and requirements. For a production grid the middleware must allow us to build fault-tolerant and scalable services: this is more important than sophisticated functionality.
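Of the services above, the LCG File Catalog (LFC) maps a logical file name to its physical replicas at different sites. A toy in-memory sketch of that mapping – the real LFC is a database-backed grid service, and the file and site names below are hypothetical:

```python
# Toy sketch of a replica catalogue (LFN -> physical replicas), in the
# spirit of the LFC. All names below are hypothetical examples.
catalog = {}  # logical file name -> list of storage URLs

def register_replica(lfn, surl):
    """Record that a physical copy of `lfn` exists at `surl`."""
    catalog.setdefault(lfn, []).append(surl)

def list_replicas(lfn):
    """Return all known physical replicas of a logical file."""
    return catalog.get(lfn, [])

register_replica("/grid/atlas/raw/run001.root",
                 "srm://tier1-example.org/atlas/run001.root")
register_replica("/grid/atlas/raw/run001.root",
                 "srm://tier2-example.org/atlas/run001.root")
print(len(list_replicas("/grid/atlas/raw/run001.root")))  # → 2
```

The design point this illustrates: analysis jobs refer only to logical names, so data can be replicated, moved or re-sited without touching user code.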
Database replication
In full production: several GB/day of user data can be sustained to all Tier 1s
~100 DB nodes at CERN and several tens of nodes at Tier 1 sites – a very large distributed database deployment
Used for several applications: experiment calibration data; replicating (central, read-only) file catalogues
LCG depends on two major science grid infrastructures: EGEE (Enabling Grids for E-Science) and OSG (US Open Science Grid). Interoperability & interoperation are vital – significant effort has gone into building the procedures to support them.
Grid infrastructure project co-funded by the European Commission – now in its 2nd phase, with 91 partners in 32 countries
240 sites in 45 countries; 45,000 CPUs; 12 PetaBytes; >5,000 users; >100 VOs; >100,000 jobs/day
Disciplines: archeology, astronomy, astrophysics, civil protection, computational chemistry, earth sciences, finance, fusion, geophysics, high energy physics, life sciences, multimedia, material sciences, …
EGEE: Increasing workloads ⅓ non-LHC
Grid Applications: medical, seismology, chemistry, astronomy, particle physics, fusion
Share of EGEE resources: HEP, 5/07–4/08: 45 million jobs
HEP use of EGEE: May 07 – Apr 08
The next step
Sustainability: Beyond EGEE-II
Need to prepare a permanent, common grid infrastructure
Ensure the long-term sustainability of the European e-infrastructure, independent of short project funding cycles
Coordinate the integration and interaction between National Grid Infrastructures (NGIs)
Operate the European level of the production grid infrastructure for a wide range of scientific disciplines, to link the NGIs
Expand the idea, and address the problems, of the JRU (Joint Research Unit)
EGI – European Grid Initiative (www.eu-egi.org)
EGI Design Study: proposal to the European Commission (started Sept 07), supported by 37 National Grid Initiatives (NGIs)
A 2-year project to prepare the setup and operation of a new organisational model for a sustainable pan-European grid infrastructure after the end of EGEE-3
Summary
We have an operating, production-quality grid infrastructure that:
is in continuous use by all 4 experiments (and many other applications);
is still growing in size – sites, resources (and still to finish the ramp-up for LHC start-up);
demonstrates interoperability (and interoperation!) between 3 different grid infrastructures (EGEE, OSG, NorduGrid);
is becoming more and more reliable;
is ready for LHC start-up.
For the future we must:
learn how to reduce the effort required for operation;
tackle upcoming infrastructure issues (e.g. power, cooling);
manage the migration of underlying infrastructures to longer-term models;
be ready to adapt the WLCG service to new ways of doing distributed computing.