EGEE-II INFSO-RI-031688. EGEE and gLite are registered trademarks. The EGEE Production Grid. Ian Bird, EGEE Operations Manager. HEPiX, Jefferson Lab, 12th October 2006.

Presentation transcript:

The EGEE Production Grid
Ian Bird, EGEE Operations Manager
HEPiX, Jefferson Lab, 12th October 2006
Enabling Grids for E-sciencE
EGEE-II INFSO-RI. EGEE and gLite are registered trademarks.

EGEE-II INFSO-RI; JLab, 9th-13th October

Outline
Some history
  – What led up to where we are now?
  – The EGEE project
What is the EGEE grid infrastructure today?
  – What has been achieved?
  – How is it used?
  – How does it compare and relate to other production grids?
Outlook

Some history … LHC → EGEE Grid
1999: MONARC project
  – Early discussions on how to organise distributed computing for LHC
2000: growing interest in grid technology
  – HEP community was the driver in launching the DataGrid project
EU DataGrid project
  – Middleware & testbed for an operational grid
LHC Computing Grid (LCG)
  – Deploying the results of DataGrid to provide a production facility for LHC experiments
EU EGEE project, phase 1
  – Starts from the LCG grid
  – Shared production infrastructure
  – Expanding to other communities and sciences
EU EGEE-II
  – Building on phase 1
  – Expanding applications and communities
… and in the future
  – Worldwide grid infrastructure?
  – Interoperating and co-operating infrastructures?

The EGEE project
EGEE: €32 M
  – 1 April 2004 – 31 March 2006
  – 71 partners in 27 countries, federated in regional Grids
EGEE-II: €35 M
  – 1 April 2006 – 31 March 2008
  – 91 partners in 32 countries
  – 13 Federations
Objectives
  – Large-scale, production-quality infrastructure for e-Science
  – Attracting new resources and users from industry as well as science
  – Improving and maintaining “gLite” Grid middleware

The EGEE Infrastructure
Test-beds & Services
  – Certification testbeds (SA3)
  – Pre-production service
  – Production service
Support Structures
  – Operations Coordination Centre
  – Regional Operations Centres
  – Global Grid User Support
  – EGEE Network Operations Centre (SA2)
  – Operational Security Coordination Team
Security & Policy Groups
  – Operations Advisory Group (+NA4)
  – Joint Security Policy Group
  – EuGridPMA (& IGTF)
  – Grid Security Vulnerability Group
Infrastructure:
  – Physical test-beds & services
  – Support organisations & procedures
  – Policy groups

Certification & release preparation
The goal is to produce a middleware distribution that can be deployed widely
  – Not the same as middleware releases from development projects
  – More like a Linux distribution: bringing together many pieces from several sources
Extensive certification test-bed:
  – Close to 100 machines involved, CERN + partners
  – Emulates the main deployment environments
Certification testing:
  – Installation and configuration
  – Component (service) functionality
  – System testing (trying to emulate real workloads and stress testing)
  – Beginning to use virtualization to simplify the testing environment
Deployment into the pre-production system
  – Final step of certification: validation by real sites
  – Validation by applications, which also allows apps to be prepared for new versions

Pre-production service
Pre-production service (P-PS) is now ~20 sites
Provides access to some 500 CPUs
  – Some sites allow access to their full production batch systems for scale tests
Sites install and test different configurations and sets of services
  – Try to get a good feeling for the quality of the release or updates before general release to production
  – Feedback to: certification, integration, developers, etc.
P-PS is now used in the way it was intended
  – For some time it was acting as a second certification test-bed for the gLite-1.x branch
  – Some services may be demonstrated in this environment before going to production (or they may need more work)

Production service sites
Size of the infrastructure today:
  – 196 sites in 42 countries
  – ~ … CPU
  – ~3 PB disk, + tape MSS

Usage of the infrastructure
>50k jobs/day
~7000 CPU-months/month
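The two usage figures above can be related with a little arithmetic. The sketch below is only a back-of-envelope reading: it assumes a 30-day month and treats ">50k jobs/day" as exactly 50,000, neither of which comes from the talk, and the function names are made up for illustration. Under those assumptions, ~7000 CPU-months/month corresponds to about 7000 CPUs busy on average and a few CPU-hours per job.

```python
# Back-of-envelope reading of the usage figures on this slide.
# Assumptions (not from the talk): a 30-day month, and exactly
# 50,000 jobs/day even though the slide says ">50k".

HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30  # assumed


def average_busy_cpus(cpu_months_per_month: float) -> float:
    """CPU-months delivered per calendar month equals the average number of busy CPUs."""
    return cpu_months_per_month


def cpu_hours_per_job(cpu_months_per_month: float, jobs_per_day: float) -> float:
    """Average CPU time per job, in hours, implied by the two usage figures."""
    cpu_hours_per_month = cpu_months_per_month * DAYS_PER_MONTH * HOURS_PER_DAY
    jobs_per_month = jobs_per_day * DAYS_PER_MONTH
    return cpu_hours_per_month / jobs_per_month


if __name__ == "__main__":
    print("average busy CPUs:", average_busy_cpus(7000))
    print("CPU-hours per job:", round(cpu_hours_per_job(7000, 50_000), 2))
```

Because the real job rate is above 50k/day, the per-job figure is best read as an upper estimate.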

Non-LHC VOs
Workloads of the “other VOs” are starting to be significant, approaching 8-10K jobs per day and 1000 CPU-months/month. One year ago this was the overall scale of work for all VOs.

Use of the infrastructure
20k jobs running simultaneously

CPU Usage
[Figure: Virtual Organizations, Jan. ’06 to Sep. ’06]

Use for massive data transfer
Large LHC experiments now transferring ~1 PB/month each
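To give a feel for what ~1 PB/month means as a sustained rate, here is a small conversion sketch. It assumes a decimal petabyte (10^15 bytes) and a 30-day month; both are assumptions made for illustration, and the function names are hypothetical. Under those assumptions the rate works out to roughly 386 MB/s, or about 3 Gbit/s, per experiment.

```python
# Convert a monthly transfer volume into an average sustained rate.
# Assumptions (not from the talk): 1 PB = 10**15 bytes, 30-day month.

BYTES_PER_PB = 10**15
SECONDS_PER_MONTH = 30 * 24 * 3600


def sustained_rate_mb_per_s(pb_per_month: float) -> float:
    """Average rate in megabytes (10**6 bytes) per second."""
    return pb_per_month * BYTES_PER_PB / SECONDS_PER_MONTH / 10**6


def sustained_rate_gbit_per_s(pb_per_month: float) -> float:
    """Average rate in gigabits (10**9 bits) per second."""
    return pb_per_month * BYTES_PER_PB * 8 / SECONDS_PER_MONTH / 10**9


if __name__ == "__main__":
    print(round(sustained_rate_mb_per_s(1.0), 1), "MB/s")
    print(round(sustained_rate_gbit_per_s(1.0), 2), "Gbit/s")
```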

Applications on EGEE
More than 25 applications from an increasing number of domains
  – Astrophysics
  – Computational Chemistry
  – Earth Sciences
  – Financial Simulation
  – Fusion
  – Geophysics
  – High Energy Physics
  – Life Sciences
  – Multimedia
  – Material Sciences
  – …
Application types:
  – Simulation
  – Bulk Processing
  – Responsive Apps.
  – Workflow
  – Parallel Jobs
  – Legacy Applications

Simulation
Examples
  – LHC Monte Carlo simulation
  – Fusion
  – WISDOM: malaria/avian flu
Characteristics
  – Jobs are CPU-intensive
  – Large number of independent jobs
  – Run by few (expert) users
  – Small input; large output
Needs
  – Batch-system services
  – Minimal data management for storage of results
[Images: ATLAS, ITER]

Drug Discovery
WISDOM focuses on in silico drug discovery for neglected and emerging diseases.
Malaria, Summer 2005
  – 46 million ligands docked
  – 1 million selected
  – 1 TB data produced; 80 CPU-years used in 6 weeks
Avian Flu, Spring 2006
  – H5N1 neuraminidase
  – Impact of selected point mutations on eff. of existing drugs
  – Identification of new potential drugs acting on mutated N1
Fall 2006
  – Extension to other neglected diseases
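The malaria figures above imply a scale that is easy to work out. The sketch below assumes a 365-day year and that the 80 CPU-years were spread evenly over the 6 weeks and over all 46 million dockings; those are simplifying assumptions, not statements from the talk, and the function names are hypothetical. On that basis the run kept roughly 700 CPUs busy on average, at under a minute of CPU time per docking.

```python
# Rough scale of the WISDOM malaria run from the figures on this slide.
# Assumptions (not from the talk): 365-day year, even spread of work.

SECONDS_PER_DAY = 24 * 3600
DAYS_PER_YEAR = 365  # assumed


def average_concurrent_cpus(cpu_years: float, weeks: float) -> float:
    """Average number of CPUs busy if cpu_years of work finish in the given weeks."""
    return cpu_years * DAYS_PER_YEAR / (weeks * 7)


def cpu_seconds_per_docking(cpu_years: float, dockings: float) -> float:
    """Average CPU time per docking, in seconds."""
    return cpu_years * DAYS_PER_YEAR * SECONDS_PER_DAY / dockings


if __name__ == "__main__":
    print(round(average_concurrent_cpus(80, 6)), "CPUs on average")
    print(round(cpu_seconds_per_docking(80, 46e6), 1), "CPU-seconds per docking")
```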

Bulk Processing
Examples
  – HEP processing of raw data, analysis
  – Earth observation data processing
Characteristics
  – Widely-distributed input data
  – Significant amount of input and output data
Needs
  – Job management tools (workload management)
  – Meta-data services
  – More sophisticated data management

Responsive Apps. (I)
Examples
  – Prototyping new applications
  – Monitoring grid operations
  – Direct interactivity
Characteristics
  – Small amounts of input and output data
  – Not CPU-intensive
  – Short response time (few minutes)
Needs
  – Configuration which allows “immediate” execution (QoS)
  – Services must treat jobs with minimum latency

Responsive Apps. (II)
Grid as a backend infrastructure:
  – gPTM3D: interactive analysis of medical images
  – Bioinformatics via web portal
  – GATE: radiotherapy planning
  – DILIGENT: digital libraries
  – Volcano sonification
Characteristics
  – Rapid response: a human waiting for the result!
  – Many small but CPU-intensive tasks
  – User is not aware of “grid”!
Needs
  – Interfacing (data & computing) with non-grid application or portal
  – User and rights management between front-end and grid

Workflow
Examples
  – “Bronze Standard”: image registration
  – Flood prediction
Characteristics
  – Use of grid and non-grid services
  – Complex set of algorithms for the analysis
  – Complex dependencies between individual tasks
Needs
  – Tools for managing the workflow itself
  – Standard interfaces for services (i.e. web services)

Parallel Jobs
Examples
  – Climate modeling
  – Earthquake analysis
  – Computational chemistry
Characteristics
  – Many interdependent, communicating tasks
  – Many CPUs needed simultaneously
  – Use of MPI libraries
Needs
  – Configuration of resources for flexible use of MPI
  – Pre-installation of optimized MPI libraries

Legacy Applications
Examples
  – Commercial or closed source binaries
  – Geocluster: geophysical analysis software
  – FlexX: molecular docking software
  – Matlab, Mathematica, …
Characteristics
  – Licenses: control access to software on the grid
  – No recompilation → no direct use of grid APIs!
Needs
  – License server and grid deployment model
  – Transparent access to data on the grid

Grid management: structure
Operations Coordination Centre (OCC)
  – Management and oversight of all operational and support activities
Regional Operations Centres (ROC)
  – Providing the core of the support infrastructure, each supporting a number of resource centres within its region
  – Grid Operator on Duty
Resource centres
  – Providing resources (computing, storage, network, etc.)
Grid User Support (GGUS)
  – At FZK: coordination and management of user support, single point of contact for users

Grid Monitoring
Goal:
  – Proactively monitor operational state & performance of the grid
  – Trigger corrective actions at sites, ROCs, service managers
Many tools used:
  – Distributed responsibility for tools maintenance and operation
  – Operator portal, Info sys monitor, SFT/SAM, job monitors, etc.
Site Functional Tests (SFT) → Site Availability Monitor (SAM)
  – Framework to sample/test services at sites and publish results
  – Can include ad-hoc tests (e.g. VO-specific) in the framework or externally
  – Allows dynamic look-up by VO of sites that are currently OK for them
  – SAM: extends the concept to measure service availability
  – Web service access to the data
  – Intend to use this to generate trouble tickets and alarms
Primary tools of the operator on duty are
  – Information system monitoring and SFT/SAM
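The idea of sampling tests at sites, publishing results, and letting each VO look up the sites that are currently OK for it can be sketched as below. This is purely illustrative and is not the SFT/SAM code or its API: the `SampleResult` record, the `sites_ok_for_vo` and `availability` functions, and the test names are all hypothetical, and "availability" is assumed here to mean the fraction of samples that passed.

```python
# Illustrative sketch only; every name here is hypothetical and this is
# not the actual SFT/SAM implementation or data model.
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class SampleResult:
    site: str     # site where the test ran
    test: str     # name of the functional test
    passed: bool  # outcome of this sample


def sites_ok_for_vo(results: Iterable[SampleResult], critical_tests: set[str]) -> list[str]:
    """Sites where every test the VO marks as critical has run and none failed."""
    passed_by_site: dict[str, set[str]] = {}
    failed_sites: set[str] = set()
    for r in results:
        if r.test not in critical_tests:
            continue  # this VO does not care about the test
        if r.passed:
            passed_by_site.setdefault(r.site, set()).add(r.test)
        else:
            failed_sites.add(r.site)
    return sorted(
        site
        for site, passed in passed_by_site.items()
        if site not in failed_sites and passed == critical_tests
    )


def availability(samples: Iterable[bool]) -> float:
    """Fraction of samples that passed (assumed definition of availability)."""
    samples = list(samples)
    return sum(samples) / len(samples) if samples else 0.0


if __name__ == "__main__":
    results = [
        SampleResult("site-a", "job-submit", True),
        SampleResult("site-a", "replica-copy", True),
        SampleResult("site-b", "job-submit", True),
        SampleResult("site-b", "replica-copy", False),
    ]
    print(sites_ok_for_vo(results, {"job-submit", "replica-copy"}))
    print(availability([True, True, False, True]))
```

Letting each VO supply its own set of critical tests is one way to model the "currently OK for them" look-up; a real deployment would also need to handle result ageing and repeated samples.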

Site metrics - availability

Support - GGUS

The EGEE Network Operations Centre
Creating a “Network Support unit” in the EGEE operational model
Tasks:
  – Receive tickets from NRENs, and forward to GGUS if impact on grid
  – Receive tickets from GGUS if a network issue
  – Troubleshoot & follow up with sites or NRENs
[Diagram: Users, GGUS, Support Units, ENOC, NRENs, GÉANT2, EGEE Network]
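The ticket flow described in the tasks above can be written down as a small decision function. This is a sketch under assumptions: the `Ticket` fields, the `enoc_action` function, and the returned strings are hypothetical, and the two fall-through cases (an NREN ticket with no grid impact, a GGUS ticket that is not a network issue) are not covered by the slide and are filled in here only so the function is total.

```python
# Illustrative sketch of the ENOC ticket flow; all names are hypothetical.
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    NREN = "NREN"
    GGUS = "GGUS"


@dataclass
class Ticket:
    source: Source
    impacts_grid: bool = False      # relevant for tickets coming from NRENs
    is_network_issue: bool = False  # relevant for tickets coming from GGUS


def enoc_action(ticket: Ticket) -> str:
    """Return what the ENOC does with a ticket, following the tasks on the slide."""
    if ticket.source is Source.NREN:
        # From the slide: forward to GGUS if there is an impact on the grid.
        if ticket.impacts_grid:
            return "forward to GGUS"
        return "no grid action"  # assumption: not stated on the slide
    if ticket.source is Source.GGUS:
        # From the slide: the ENOC receives GGUS tickets that are network issues
        # and then troubleshoots and follows up with sites or NRENs.
        if ticket.is_network_issue:
            return "troubleshoot and follow up with sites or NRENs"
        return "leave with GGUS"  # assumption: not stated on the slide
    raise ValueError(f"unknown ticket source: {ticket.source!r}")


if __name__ == "__main__":
    print(enoc_action(Ticket(Source.NREN, impacts_grid=True)))
    print(enoc_action(Ticket(Source.GGUS, is_network_issue=True)))
```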

Interoperation
Interoperability and interoperation (or co-operation)
EGEE has interoperability activities with: (enabling the middlewares to work together)
  – Open Science Grid (U.S.): quite far advanced
  – NorduGrid (ARC): task in EGEE-II, 4 workshops and ongoing activity
  – UNICORE: task in EGEE-II
  – NAREGI (Japan): 1 workshop, continued activity
  – GIN (OGF): active in several areas
EGEE has interoperation activities with: (enabling the infrastructures to co-operate)
  – Open Science Grid: actually in use
  – Anticipated with NorduGrid (NDGF) for WLCG

Interoperating information systems
[Diagram: EGEE, OSG, NAREGI, TeraGrid, PRAGMA, NorduGrid]

Related infrastructure projects
DEISA, TeraGrid
Coordination in SA1 for: EELA, BalticGrid, EUMedGrid, EUChinaGrid, SEE-GRID
Interoperation with OSG, NAREGI
SA3: DEISA, ARC, NAREGI

Sustainability: Beyond EGEE-II
Need to prepare for permanent Grid infrastructure
  – Maintain Europe’s leading position in global science Grids
  – Ensure reliable and adaptive support for all sciences
  – Independent of short project funding cycles
  – Modelled on success of GÉANT
→ Infrastructure managed in collaboration with national grid initiatives

Summary of status
Today we have an operating production infrastructure
  – Probably the largest in the world, supporting many science domains
  – Relied upon by several as their primary source of computing
We have a managed operations process addressing most areas
  – Constantly evolving
Inter/Co-operation is a fact and is becoming more important very quickly
  – Several applications need to work across grids, and they need support for that
A large fraction of the value of the operations activity is in the intangibles: processes, structures, expertise, etc.
We recognise that there are many outstanding problems with the current state of things: reliability and robustness are the focus for the next year