12 July 2004 – Experience Deploying the Large Hadron Collider Computing Grid
Markus Schulz, CERN IT GD-GIS

12 July 2004 – Overview
- LHC Computing Grid
- CERN
- Challenges for LCG
- The project
- Deployment and status
- LCG-EGEE
- Summary

12 July 2004 – CERN
CERN, the European Organization for Nuclear Research:
- located on the Swiss–French border close to Geneva
- funded by 20 European member states; in addition, several observer and non-member states participate in the experiment program
- the world's largest center for particle physics research
- provides infrastructure and tools (accelerators etc.)
- ~3000 employees and several thousand visiting scientists
- 50 years of history (several Nobel Prizes)
- the place where the WWW was born (1990)
- next milestone: the LHC (Large Hadron Collider)

12 July 2004 – Challenges for the LHC Computing Grid
The LHC (Large Hadron Collider):
- with 27 km of magnets, the largest superconducting installation
- proton beams collide at an energy of 14 TeV
- 40 million events per second from each of the 4 experiments
- after triggers and filters, a data stream of MBytes/second remains
- every year ~15 PetaBytes of data will be recorded
- this data has to be reconstructed and analyzed by the users
- in addition, a large computational effort is needed to produce Monte Carlo data

12 July 2004 – The Four Experiments at the LHC
(Figure: the four LHC experiments – ALICE, ATLAS, CMS and LHCb. Credit: Federico Carminati, EU review presentation.)

12 July 2004 – 15 PetaBytes/year
15 PetaBytes/year have to be:
- recorded
- cataloged
- managed
- distributed
- processed
(Figure: at 50 CD-ROMs = 35 GB per 6 cm, a CD stack holding one year of LHC data would be ~20 km high – compared with Concorde at 15 km, a balloon at 30 km, and Mt. Blanc at 4.8 km.)
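A quick back-of-the-envelope check of what that volume implies as a sustained rate. This is only a sketch: the 15 PB/year figure comes from the slide, while the calendar-year averaging is an assumption (real running periods are shorter, so instantaneous rates are higher).

# Average data rate implied by ~15 PB recorded per year.
PETABYTE = 1e15                      # bytes, decimal convention
SECONDS_PER_YEAR = 365 * 24 * 3600   # a full calendar year

data_per_year = 15 * PETABYTE
average_rate = data_per_year / SECONDS_PER_YEAR   # bytes per second

print(f"average rate ~ {average_rate / 1e6:.0f} MB/s")   # roughly 475 MB/s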

12 July 2004 – Core Tasks
Reconstruction: transform signals from the detector into physical properties
– energy, charge, tracks, momentum, particle id.
– this task is computationally intensive and has modest I/O requirements
– structured activity (production manager)
Simulation: start from the theory and compute the response of the detector
– very computationally intensive
– structured activity, but a larger number of parallel activities
Analysis: complex algorithms, searching for similar structures to extract physics
– very I/O intensive, large number of files involved
– access to data cannot be effectively coordinated
– iterative, parallel activities of hundreds of physicists

12 July 2004 – Computing Needs
- Some 100 Million SPECInt2000 are needed
- A 3 GHz Pentium IV ~ 1K SPECInt2000
- O(100k) CPUs are needed
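The arithmetic behind the O(100k) estimate, as a minimal sketch using the slide's own numbers:

# Rough CPU count implied by the capacity estimate on this slide.
required_capacity = 100e6   # total need, in SPECInt2000
per_cpu = 1_000             # ~1K SPECInt2000 for a 3 GHz Pentium IV

cpus_needed = required_capacity / per_cpu
print(f"~{cpus_needed:,.0f} CPUs needed")   # ~100,000, i.e. O(100k)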

12 July 2004 – Large and Distributed User Community
- CERN collaborators: > 6000 users from 450 institutes
  - Europe: 267 institutes, 4603 users
  - elsewhere: 208 institutes, 1632 users
- none has all the required computing; all have access to some computing
- Solution: connect all the resources into a computing grid

12 July 2004 – The LCG Project (and what it isn’t)
Mission: to prepare, deploy and operate the computing environment for the experiments to analyze the data from the LHC detectors.
Two phases:
- Phase 1: 2002–2005
  - build a prototype, based on existing grid middleware
  - deploy and run a production service
  - produce the Technical Design Report for the final system
- Phase 2: 2006–2008
  - build and commission the initial LHC computing environment
LCG is NOT a development project for middleware, but problem fixing is permitted (even if writing code is required).

12 July 2004 – LCG Time Line
(Timeline figure; the recoverable milestones, in sequence:)
- LCG-1 opens 15 Sept; testing, with simulated event productions
- LCG-2 – upgraded middleware, mgt. and ops tools; principal service for the LHC data challenges
- computing models; TDR* for the Phase 2 grid; second generation middleware prototyping, development
- LCG-3 – second generation middleware; validation of computing models
- Phase 2 service acquisition, installation, commissioning; experiment setup & preparation
- Phase 2 service in production; first data; physics
* TDR – technical design report

12 July 2004 – LCG Scale and Computing Model
(Figure: map of participating centres – RAL, IN2P3, BNL, FZK, CNAF, PIC, ICEPP, FNAL, USC, NIKHEF, Krakow, CIEMAT, Rome, Taipei, TRIUMF, CSCS, Legnaro, UB, IFCA, IC, MSU, Prague, Budapest, Cambridge – organised into Tier-1 centres, Tier-2 small centres, and desktops/portables.)
Tier-0:
- reconstruct (ESD)
- record raw and ESD
- distribute data to Tier-1s
Tier-1:
- data-heavy analysis
- permanent, managed grid-enabled storage (raw, analysis, ESD), MSS
- reprocessing
- regional support
Tier-2:
- managed disk storage
- simulation
- end user analysis
- parallel interactive analysis
Data distribution: ~70 Gbits/sec

12 July 2004 – LCG: a Collaboration
Building and operating the LHC Grid – a collaboration between:
- the physicists and computing specialists from the experiments
- the projects in Europe and the US that have been developing grid middleware:
  - European DataGrid (EDG)
  - US Virtual Data Toolkit (Globus, Condor, PPDG, iVDGL, GriPhyN)
- the regional and national computing centres that provide resources for LHC
  - some contribution from HP (Tier-2 centres)
- the research networks
Researchers – Software Engineers – Service Providers

12 July 2004 – LCG-2 Software
LCG-2_1_0 core packages:
- VDT (Globus2)
- EDG WP1 (Resource Broker)
- EDG WP2 (Replica Management tools)
  - one central RMC and LRC for each VO, located at CERN, Oracle backend
- several bits from other WPs (config objects, info providers, packaging…)
- GLUE 1.1 (information schema) + a few essential LCG extensions
- MDS-based Information System with LCG enhancements
Almost all components have gone through some re-engineering for:
- robustness
- scalability
- efficiency
- adaptation to local fabrics

12 July 2004 – LCG-2 Software
Authentication and Authorization:
- Globus GSI, based on X.509 certificates
- LCG established trust relationships between the CAs in the project
- Virtual Organization (VO) registration hosted at different sites
Data Management Tools (a toy sketch follows below):
- catalogues keep track of replicas (Replica Metadata Catalog, Local Replica Catalog)
- SRM interface for several MSS and disk pools
- wide-area transport via GridFTP
(Figure: LFN-1/LFN-2/LFN-3 map via the RMC to a GUID ("Not4Humans"); the GUID maps via the LRC to SURL-1/SURL-2, indexed by the RLI; e.g. lfn:Toto.ps vs. srm://host.domain/path/file.)
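To illustrate how the two catalogues fit together, here is a toy, in-memory model of the LFN → GUID → SURL resolution chain. It is a conceptual sketch only – these are not the real RMC/LRC service interfaces, and all names and values are made up:

# Toy model of the LCG-2 replica catalogues: the Replica Metadata Catalog
# (RMC) maps human-readable LFNs to a GUID, and the Local Replica Catalog
# (LRC) maps each GUID to its physical replicas (SURLs).
import uuid

rmc: dict[str, str] = {}          # LFN  -> GUID
lrc: dict[str, list[str]] = {}    # GUID -> [SURL, ...]

def register_file(lfn: str, surl: str) -> str:
    """Register a new logical file with one initial replica."""
    guid = str(uuid.uuid4())
    rmc[lfn] = guid
    lrc[guid] = [surl]
    return guid

def add_replica(lfn: str, surl: str) -> None:
    """Record an additional physical copy of an existing file."""
    lrc[rmc[lfn]].append(surl)

def list_replicas(lfn: str) -> list[str]:
    """Resolve an LFN to all known SURLs via its GUID."""
    return lrc.get(rmc.get(lfn, ""), [])

register_file("lfn:Toto.ps", "srm://host.domain/path/file")
add_replica("lfn:Toto.ps", "srm://other.host/path/file")
print(list_replicas("lfn:Toto.ps"))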

12 July 2004 – LCG-2 Software
Information System (see the query sketch below):
- Globus MDS based, for information gathering on a site
- LDAP + lightweight DB based system for collecting data from the sites
- the LCG-BDII solved the scalability problem of the Globus2 MDS (>200 sites tested)
- contains information on capacity, capability, utilization and state of services (computing, storage, catalogues…)
Workload Management Tools:
- match user requests with the resources available to a VO
  - requirements formulated in JDL (classads)
  - user-tunable ranking of resources
- use the RLS and the information system
- keep the state of jobs and manage:
  - extension of credentials (proxy renewal…)
  - input/output sandboxes
- interface to local batch systems via a gateway node
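As an illustration of querying such an LDAP-based information system, the sketch below asks a BDII for the computing elements it publishes under the GLUE schema. The host name is a placeholder; port 2170 and the mds-vo-name=local,o=grid base are the conventional BDII settings, and the ldap3 module is simply a convenient present-day client used for the example:

# Minimal sketch: query a BDII-style LDAP information system for
# computing elements and a couple of GLUE state attributes.
from ldap3 import ALL, Connection, Server

BDII_HOST = "bdii.example.org"          # placeholder host name
BDII_PORT = 2170                        # conventional BDII port
BASE_DN = "mds-vo-name=local,o=grid"    # conventional top-level suffix

server = Server(BDII_HOST, port=BDII_PORT, get_info=ALL)
conn = Connection(server, auto_bind=True)   # anonymous bind

conn.search(
    BASE_DN,
    "(objectClass=GlueCE)",
    attributes=["GlueCEUniqueID", "GlueCEStateFreeCPUs", "GlueCEStateWaitingJobs"],
)

for entry in conn.entries:
    print(entry.GlueCEUniqueID, entry.GlueCEStateFreeCPUs, entry.GlueCEStateWaitingJobs)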

12 July 2004 – LCG-2 Software: Job Status
(Figure: a job's life cycle and the Resource Broker internals.)
Job status sequence: submitted → waiting → ready → scheduled → running → done → cleared
(annotated in the figure as: arrived on RB, matching, job adapter, on CE, processed, output back, user done)
RB node components: Network Server, Workload Manager, Match-Maker/Broker, Job Adapter, Job Controller (CondorG), Log Monitor, Logging & Bookkeeping, RB storage (sandboxes); the broker consults the Replica Catalog and the Information Service for CE and SE characteristics & status.
Notes (a JDL sketch follows below):
- the Input Sandbox is what you take with you to the node
- the Output Sandbox is what you get back
- failed jobs are resubmitted
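To make the sandbox and matchmaking machinery concrete, here is a hedged sketch of composing a minimal JDL description on the user side. The attribute names follow the EDG/LCG-2 JDL convention, but the file names, requirement and rank expressions are illustrative only, and the submission command mentioned at the end is indicative rather than an exact invocation:

# Sketch: write a minimal JDL (classad) job description of the kind the
# Resource Broker matches against the information system.
JDL = """\
Executable    = "analysis.sh";
Arguments     = "run1234";
StdOutput     = "std.out";
StdError      = "std.err";
InputSandbox  = {"analysis.sh", "cuts.conf"};
OutputSandbox = {"std.out", "std.err", "summary.root"};
Requirements  = other.GlueCEStateStatus == "Production";
Rank          = other.GlueCEStateFreeCPUs;
"""

# The InputSandbox travels with the job to the worker node; the
# OutputSandbox is what comes back.  Requirements and Rank drive the
# matchmaking against the GLUE attributes published by each CE.
with open("analysis.jdl", "w") as fh:
    fh.write(JDL)

# Submission would then go through the broker's UI commands,
# e.g. something like `edg-job-submit analysis.jdl` on an LCG-2 UI.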

12 July 2004 – LCG Grid Deployment Area
Goal: deploy & operate a prototype LHC computing environment
Scope:
- integrate a set of middleware and coordinate and support its deployment to the regional centres
- provide operational services to enable running as a production-quality service
- provide assistance to the experiments in integrating their software and deploying it in LCG; provide direct user support
Deployment goals for LCG-2:
- production service for the Data Challenges in 2004
  - initially focused on batch production work
- gain experience in close collaboration between the Regional Centres
  - learn how to maintain and operate a global grid
- focus on building a production-quality service
  - focus on robustness, fault-tolerance, predictability, and supportability
- understand how LCG can be integrated into the sites’ physics computing services

12 July 2004 – LCG Deployment Organisation and Collaborations
(Organisation chart. The LCG Deployment Area comprises the Deployment Area Manager, the Grid Deployment Board with its task forces, the Certification, Deployment and Experiment Integration teams, the Testing, Security and Storage groups, the Operations Centre at RAL and the Call Centre at FZK. The Grid Deployment Board advises, informs and sets policy; the Regional Centres and LHC Experiments set requirements and participate; collaborative activities run through the JTB, HEPiX, GGF and grid projects such as EDG, Trillium and Grid3.)

12 July 2004 – Implementation
- a core team at CERN – the Grid Deployment group (~30)
- a collaboration of the regional centres – through the Grid Deployment Board (GDB)
- partners take responsibility for specific tasks (e.g. GOCs, GUS)
- focussed task forces as needed
- collaborative joint projects – via the JTB, grid projects, etc.
CERN deployment group:
- core preparation, certification, deployment, and support activities
- integration, packaging, debugging, development of missing tools
- deployment coordination & support, security & VO management
- experiment integration and support
GDB: country representatives for the regional centres
- address policy and operational issues that require general agreement
- brokered agreements on:
  - the initial shape of LCG-1 (via 5 working groups)
  - security
  - what is deployed

12 July 2004 – Operations Services
Operations service:
- RAL (UK) is leading the sub-project on developing operations services
- initial prototype:
  - basic monitoring tools
  - mail lists for problem resolution
  - GDB database containing contact and site information
- working on defining policies for operation and responsibilities (draft document)
- working on grid-wide accounting
Monitoring:
- GridICE (a development of the DataTag Nagios-based tools)
- GridPP job submission monitoring
- many more
Deployment and operation support: hierarchical model
- CERN acts as 1st-level support for the Tier-1 centres
- Tier-1 centres provide 1st-level support for their associated Tier-2s

12 July 2004 – User Support
Central model for user support:
- VOs provide 1st-level triage
- FZK (Germany) is leading the sub-project to develop user support services
  - web portal for problem reporting
  - experiment contacts send problems through the FZK portal
  - during the data challenges the experiments used a direct channel via the GD teams
Experiment integration support by a CERN-based group
- close collaboration during the data challenges
Documentation:
- installation guides (manual and management-tool based), see http://grid-deployment.web.cern.ch/grid-deployment/cgi-bin/index.cgi?var=releases
- rather comprehensive user guides

12 July 2004 – Security
LCG Security Group:
- LCG-1 usage rules (still used by LCG-2)
- registration procedures and VO management
  - agreement to collect only a minimal amount of personal data
- initial audit requirements are defined
- initial incident response procedures
- site security contacts etc. are defined
- set of trusted CAs (including the Fermilab online KCA)
- draft security policy (to be finished by the end of the year)
Web site: security/

12 July 2004 – Certification and Release Cycles
(Flow diagram: development & integration with unit & functional testing by the developers produces a dev tag. The certification & testing services integrate it, run the basic functionality tests, the C&T and site suites and the certification matrix, producing a release candidate tag and then a certified release tag. Application integration (HEP experiments, bio-med, other apps TBD) and deployment preparation (software installation) yield a deployment release tag; deployment to pre-production and production ends with the production tag.)

12 July 2004 – Milestones
Project Level 1 deployment milestones for 2003:
- July: introduce the initial publicly available LCG-1 global grid service
  - with 10 Tier-1 centres on 3 continents
- November: expanded LCG-1 service with resources and functionality sufficient for the 2004 Computing Data Challenges
  - additional Tier-1 centres, several Tier-2 centres – more countries
  - expanded resources at Tier-1s
  - agreed performance and reliability targets
2004: the “LHC Data Challenges”
- large-scale tests of the experiments’ computing models, processing chains, grid technology readiness, operating infrastructure
- ALICE and CMS data challenges started at the beginning of March
- LHCb and ATLAS started in May/June
- the big challenge for this year – data:
  - file catalogues (millions of files)
  - replica management
  - database access
  - integrating all available mass storage systems (several hundred TBytes)

12 July 2004 – History
- Jan 2003: GDB agreed to take VDT and EDG components
- March 2003: LCG-0
  - existing middleware, waiting for the EDG-2 release
- September 2003: LCG-1
  - 3 months late -> reduced functionality
  - extensive certification process
  - improved stability (RB, information system)
  - integrated 32 sites, ~300 CPUs
  - operated until early January; first use for production
- December 2003: LCG-2
  - full set of functionality for the DCs, but only “classic SE”, first MSS integration
  - deployed in January; data challenges started in February -> testing in production
  - large sites integrate their resources into LCG (MSS and farms)
- May 2004 -> now: releases with, incrementally:
  - improved services
  - SRM-enabled storage for disk and MSS systems

12 July 2004 – LCG-1 Experience (2003)
Integrate sites and operate a grid – problems and lessons:
- only 60% of the personnel; the software was late
- introduced a hierarchical support model (primary and secondary sites); worked well for some regions, less so for others
- installation and configuration was an issue:
  - only time to package the software for the LCFGng tool (problematic)
  - insufficient documentation (partially compensated by travel)
  - a manual installation procedure was documented when new staff arrived
- communication/cooperation between sites needed to be established
- deployed MDS + EDG-BDII in a robust way; redundant regional GIISes vastly improved the scalability and robustness of the information system
- upgrades, especially non-backward-compatible ones, took very long; not all sites showed the same dedication
- still some problems with the reliability of some of the core services
Overall: a big step forward

12 July 2004 – LCG-2 Production: Operating a Large-Scale Production Service
- started with 8 “core” sites, each bringing significant resources
  - sufficient experience to react quickly
  - weekly core-site phone conference
  - weekly meeting with each of the experiments
  - weekly joint meeting of the sites and the experiments (GDA)
- introduced a testZone for new sites
  - LCG-1 showed that ill-configured sites can affect all sites
  - sites stay in the testZone until they have been stable for some time
- further improved (simplified) the information system
  - addresses manageability; improves robustness and scalability
  - allows partitioning of the grid into independent views
- introduced a local testbed for experiment integration
  - rapid feedback on functionality from the experiments
  - triggered several changes to the RLS system

12 July 2004 – LCG-2
Focus on integrating local resources:
- batch systems at CNAF, CERN, NIKHEF already integrated
- MSS systems: CASTOR at several sites, Enstore at FNAL
Experiment software distribution:
- mechanism based on a shared file system with access for privileged users
- a tool to publish the installed software in the information system
- needs to be as transparent as possible (some work to be done)
Improved documentation and installation:
- sites have the choice to use LCFGng or to follow a manual installation guide
- a generic configuration description eases integration with local tools
- documentation includes simple tests, so sites join in a better state
- improved readability by going to HTML and PDF
- see the release page

12 July 2004 – Sites in LCG-2, July 2004
- 63 sites spread over many countries: 49 in Europe, 2 in the US, 5 in Canada, 6 in Asia, 1 HP
- coming: New Zealand, China, Korea; other HP sites (Brazil, Singapore)
- 6100 CPUs

12 July 2004 – Usage
Hard to measure:
- the VOs “pick” services and add their own components for job submission, file catalogs, replication…
- we have no central control of the resources
- accounting has to be improved
- file catalogues (only used by 2 VOs): ~2.5 million entries

12 July 2004 – Integrating Site Resources
The plan:
- provide defined grid interfaces to a grid site: storage, compute clusters, etc.
- integration with local systems is the site's responsibility
- middleware layered over existing system installations
But (real life):
- interfaces are not well defined (SRM maybe a first?)
- lots of small sites require a packaged solution
  - including fabric management (disk pool managers, batch systems)
  - that installs magically out of the box
- strive for the first view, while providing the latter
- but – “some assembly is required” – it costs effort
Constraints:
- packaging and installation integrated with some middleware
- complex dependencies of middleware packages
- the current software requires that many holes are punched into the sites' firewalls

12 July 2004 – Integrating Site Resources: Adding Sites
- the site contacts the deployment team or a T1 centre
- the deployment team sends a form for the contact DB and points the site to the release page
- the site decides, after consultation, on the scope and method of installation
- the site installs; problems are resolved via the mailing list and T1 intervention
- the site runs the initial certification tests (provided with the installation guides)
- the site is added to the testZone information system
- the deployment team runs certification jobs and helps the site fix problems
- tests are repeated and the status is published (GIIS and status pages); an internal web-based tool follows the history
- VOs add stable sites to their RBs
- sites are added to the productionZone
Most frequent problems:
- missing or wrong localization of the configuration
- firewalls are not configured correctly (see the sketch below)
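Because badly configured firewalls top the list above, a new site could run a quick reachability check of a few well-known service ports before requesting certification. This is a minimal sketch rather than an official site test: the host names are placeholders, the port map only covers the classic Globus services (GRAM gatekeeper 2119, MDS/GRIS 2135, GridFTP control 2811), and GridFTP data-channel port ranges are not covered:

# Minimal TCP reachability check for a few well-known grid service ports.
# Hypothetical host names; extend the map to match the local setup.
import socket

CHECKS = {
    ("ce.example-site.org", 2119): "GRAM gatekeeper",
    ("ce.example-site.org", 2135): "MDS / site GRIS",
    ("se.example-site.org", 2811): "GridFTP control channel",
}

def port_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

for (host, port), service in CHECKS.items():
    status = "OK" if port_open(host, port) else "NOT reachable"
    print(f"{service:25s} {host}:{port:<5d} {status}")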

12 July 2004 – Data Challenges
LHC experiments use multiple grids and additional services – integration and interoperability:
- experiments expect some central resource negotiation concerning queue length, memory, scratch space, storage, etc.
Service provision:
- planned to provide shared services (RBs, BDIIs, UIs, etc.)
- experiments need to augment the services on the UI and to define their super/subsets of the grid
  - individual RB/BDII/UIs for each VO (optionally on one node)
- scalable services for data transport are needed
  - DNS-switched access to GridFTP
Performance issues with several tools (RLS, information system, RBs):
- most are understood; workarounds and fixes are implemented and part of 2_1_0
Local Resource Managers (batch systems) are too smart for GLUE:
- the GlueSchema can't express the richness of batch systems (LSF etc.)
  - load distribution is not understandable for users (looks wrong)
  - the problem is understood and a workaround is in preparation

12 July 2004 – Interoperability
Several grid infrastructures for the LHC experiments: LCG-2/EGEE, Grid2003/OSG, NorduGrid, other national grids
- LCG/EGEE has explicit goals to interoperate
  - one of the LCG service challenges
- joint projects on storage elements, file catalogues, VO management, etc.
- most are VDT (or at least Globus) based
- Grid2003 & LCG use the GLUE schema
Issues:
- file catalogues, information schema, etc. at the technical level
- policy and semantic issues

12 July 2004 – Developments in 2004
General:
- LCG-2 will be the service run in 2004 – the aim is to evolve it incrementally
- the goal is to run a stable service
- service challenges (data transport – 500 MB/s for one week, jobs, interoperability)
Some functional improvements:
- extend access to MSS – tape systems and managed disk pools
- distributed vs. replicated replica catalogs – with Oracle back-ends
  - to avoid reliance on single service instances
Operational improvements:
- monitoring systems – move towards proactive problem finding, the ability to take sites on/offline; experiment monitoring (R-GMA), accounting
- control system
- a “cookbook” to cover planning, installation and operation
- activate regional centres to provide and support services
  - this has improved over time, but in general there is too little sharing of tasks
Address integration issues:
- with large clusters (on non-routed networks), with storage systems, with different OSs
- integrating with other experiments and apps

12 July 2004 – Changing Landscape
The view of grid environments has changed in the past year:
- from a view where all LHC sites would run a consistent and identical set of middleware
- to a view where large sites must support many experiments, each of which has grid requirements
National grid infrastructures are coming – catering to many applications, and not necessarily driven by HEP requirements.
We have to focus on interoperating between potentially diverse infrastructures (“grid federations”):
- at the moment these have the same underlying m/w
- but the modes of use and policies are different (IS, file catalogues, …)
- we need agreed services, interfaces and protocols
The situation is now more complex than anticipated.

12 July 2004 – LCG – EGEE
- LCG-2 will be the production service during 2004
  - it will also form the basis of the EGEE initial production service
  - it will be maintained as a stable service
  - it will continue to be developed
- expect in parallel a development service – Q2 2004
  - based on the EGEE middleware prototypes
  - run as a service on a subset of EGEE/LCG production sites
- the core infrastructure of the LCG and EGEE grids will be operated as a single service
  - LCG includes US and Asia; EGEE includes other sciences
  - the ROCs support Resource Centres and applications (similar to LCG primary sites)
  - some ROCs and LCG primary sites will be merged
- the LCG Deployment Manager will be the EGEE Operations Manager
  - and a member of the PEB of both projects
(Figure: overlapping scope of EGEE, LCG, geographical reach and applications.)

12 July 2004 – Summary: Deployment of Grid Services

12 July 2004 – Summary
- 2003 – first MC event production
- in 2004 we must show that we can handle the data, supporting simple computing models
  - this is the key goal of the 2004 Data Challenges
- targets for the end of this year:
  - basic model demonstrated using current grid middleware
  - all Tier-1s and ~25% of the Tier-2s operating a reliable service
  - validate the security model, understand the storage model
  - a clear idea of the performance, scaling, operations and management issues
(Timeline graphic: demonstrate core data handling and batch analysis – decisions on final core middleware – installation and commissioning – initial service in operation – first data.)

12 July 2004 – Summary II
- getting the grid services going in time for LHC will be even harder than we think today
- the service for LHC must be in permanent operation by September: CERN ↔ Tier-1s ↔ major Tier-2s
  - so we will spend the first part of 2006 in installation and commissioning
- the technology we use must be working (really working) by the end of 2006
- a year from now we will have to decide which middleware we are going to use
- from now until the end of 2006 we have to turn prototypes into pilot services
Prototype services → pilot services

12 July 2004 – Conclusion (the last “last” slide)
- there are still many questions about grids & data handling
- EGEE provides LCG with opportunities:
  - to develop an operational grid in an international, multi-science context
  - to influence the evolution of a generic middleware package
- but the LHC clock is ticking – deadlines will dictate simplicity and pragmatism
- LCG has long-term requirements – and at present EGEE is a two-year project
- LCG must encompass non-European resources and grids (based on different technologies)
- no shortage of challenges and opportunities