EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE www.eu-egee.org EGEE and gLite are registered trademarks SA1: Grid Operations Maite Barroso (CERN)

SA1: Grid Operations
Maite Barroso (CERN)
EGEE III Transition meeting, CERN, 6-7 May

Enabling Grids for E-sciencE EGEE-III INFSO-RI EGEE III Transition meeting, CERN, 6-7 May ‘08
Overview
Goals
Changes in EGEE III
Towards a fully distributed operations model
Effort
Tasks
Deliverables and Milestones
Organization
Conclusion

EGEE III Operations Goals
The provision of a large-scale, production Grid infrastructure that interoperates at many levels, offering reliable services to a wide range of applications
  – Continuation of the present service
Set the groundwork for the migration to a distributed model based on coordination at the European level of National Grid Infrastructures
  – This is the challenge for the next 2 years: to do this without breaking the 1st goal (continuation of reliable service)
With the constraints:
  – 2 years
  – Significantly less effort

Changes in EGEE-III
No major content changes! Mostly in the organization
Move from central supervision to central coordination
All tasks distributed to ROCs, with OCC or one of the ROCs responsible for coordination
Improve coordination and working model so all this is possible and effective
Define clear interfaces: between OCC and ROCs (NGIs), between ROCs (NGIs) and sites
Test possible transitional organisational structures towards an NGI model

Centralized vs. distributed
What is our present (EGEE II) model?
Grid management
  – Central coordination for all of the tasks, in many cases localised at CERN (the team is called OCC: Operations Coordination Centre)
Grid operations and support
  – In general, problem monitoring (SAM) and reporting done centrally by the COD is not well integrated with the daily operations and monitoring carried out at each site
  – Best effort/informal coordination of operations tools and requirements gathering. Most tools are deployed centrally (main instance run in one region serving the whole infrastructure)
Support to VOs, Users, Applications
  – Central access point for user support, connected to all the ROCs
Grid security
  – Central coordination, with effort from all ROCs and a broad collaboration from sites

Centralized vs. distributed
What is our target model?
ROCs (NGIs) are responsible for day-to-day operations, without a central organization overseeing them. A set of operations tools supports this
Central body (OCC) responsible for coordination of cross-regional tasks
Clear interfaces/targets between OCC and ROCs (NGIs), between ROCs (NGIs) and sites
Sites with well-developed fabric tools that monitor local and grid services in a common way and trigger alarms directly, so most of the issues are solved at this level

Effort

TSA1.1: Grid Management (1191 PM)
Overall coordination of the Operations through the Operations Coordination Centre
ROC Management
Monitoring and enforcement of Service Level Agreements
Application – Resource Provider Coordination. The Resource Allocation Group is co-chaired by NA4 and SA1
Grid Accounting
Interoperability and collaboration
Operation of national or regional Certification Authorities and Registration Authorities where required, including overall “catch-all” authorities for EGEE
Quality assurance

Resource Allocation
Process of providing virtual organisations with access to compute and storage resources (known issue in EGEE II)
All JRUs/NGIs or partners in SA1 are required to commit a certain percentage of their resources to be used by new VOs (seed resources)
  – “catch-all” or regional VOs for new user groups in the region
Funding of 51,000€ to provide additional computing resources for new user communities not linked to the partners of the EGEE consortium:
  – Installed at a maximum of 3 sites that can guarantee access to the resources with a high level of service for new VOs according to a Service Level Agreement to be defined
  – The Service Level Agreement and selected sites will be subject to approval of the Project Management Board
The NA4 VO manager’s group will be responsible for identifying new VOs eligible for project support

TSA1.2: Grid operations and support PM
Grid Operator on Duty
Oversight and management of Grid operations
1st line support for operations problems
Run Grid services for production and pre-production services
Middleware deployment and support
  – Coordination of middleware deployment and support for problems
  – Regional certification of middleware releases (anticipated to be very rare and will require specific justification)
Interoperations: local, regional, international
Monitoring tools to support Grid operations: Operations Automation Team, responsible for the overall strategy to coordinate tool development

Operations tools
Task started and evolved following operation needs:
  – SAM, Gstat, CIC portal, GOCDB, etc.
No formal coordination
Central part of present operations model
Operations Automation Team
  – Improve site reliability by wider deployment of fabric management tools at sites
  – Devolve central systems, where possible, to regional systems
  – Create architecture for new shared infrastructure required to support the operational tools
  – Measure and improve the availability and reliability of the operational tools themselves
  – Design SLA compliance tools (availability and reliability calculation)
  – Collection of usage and accounting information for CPU/Disk/Network
  – Provide visualization of the state of infrastructure for site administration, regional operators and project managers
  – Provide reporting tools for the OCC and project management
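The slide does not define how availability and reliability would be calculated. As a rough illustration only, the sketch below assumes a commonly used convention: availability is time up divided by known time, and reliability additionally excludes scheduled downtime from the denominator. The `PeriodStatus` class, its field names and the example figures are hypothetical and are not taken from the presentation or from any EGEE tool.

```python
from dataclasses import dataclass


@dataclass
class PeriodStatus:
    """Hours a site spent in each state over one reporting period.

    Hypothetical data model, used only for this illustration.
    """
    up_hours: float
    unscheduled_down_hours: float
    scheduled_down_hours: float
    unknown_hours: float = 0.0

    @property
    def total_hours(self) -> float:
        return (self.up_hours + self.unscheduled_down_hours
                + self.scheduled_down_hours + self.unknown_hours)


def availability(status: PeriodStatus) -> float:
    """Assumed definition: up time / (total time - unknown time)."""
    known = status.total_hours - status.unknown_hours
    if known <= 0:
        raise ValueError("no known time in the reporting period")
    return status.up_hours / known


def reliability(status: PeriodStatus) -> float:
    """Assumed definition: up time / (total - unknown - scheduled downtime)."""
    expected = (status.total_hours - status.unknown_hours
                - status.scheduled_down_hours)
    if expected <= 0:
        raise ValueError("site was never expected to be up in this period")
    return status.up_hours / expected


# Made-up example: a 30-day month (720 h) with 24 h of announced
# maintenance and 12 h of unplanned outage.
month = PeriodStatus(up_hours=684, unscheduled_down_hours=12,
                     scheduled_down_hours=24)
print(f"availability = {availability(month):.3f}")
print(f"reliability  = {reliability(month):.3f}")
```

Under these assumed definitions, announced maintenance lowers availability but not reliability, so a site is not penalised in the reliability figure for downtime it scheduled in advance.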

TSA1.3: Support to VOs, users, applications (890 PMs)
GGUS management and tools
TPM and user support effort. This is staffed by effort from each of the ROCs as one of their mandatory core tasks
Support for middleware related issues is the responsibility of JRA1 and SA3
Dedicated LHC experiment support by the EIS team
Regional helpdesk
SA1 participation in site and user training. SA1 will work together with NA3 on developing material for online training for site administrators

TSA1.4: Grid security (305 PMs)
A security team responsible for coordinating all aspects of operational security, including responding to security incidents
A team dealing with security vulnerabilities in the middleware and deployment
Responsibility for developing and maintaining the Security Policy and procedures jointly with other Grids
Ensuring the continued existence of a federated identity trust domain, and encouraging the integration of national or community based authentication-authorisation schemes

TSA1.5: Activity Management PMs
Activity management
ROC coordination
Coordination with and participation in project technical bodies
Oversight and management of specific technical tasks within SA1
Federation reviews
Metrics and Quality Team. Ensures that the appropriate sets of metrics are gathered within the operation to monitor the quality of all aspects of the operation, for monitoring SLAs, and for reporting purposes. The partner reviews will be organised by this team
Contributions to general project tasks (conference preparation, reviews, etc.)
Production, editing, reviews of milestones and deliverables

Deliverables
ID | Description | Delivery date
DSA1.1 | Global Grid User Support (GGUS) Plan | 2
DSA1.2 | Assessment of production service status |
DSA1.3 | Report on the status of the Regional Operations Centres (ROCs) and national/regional grid integration | 14
DSA1.4 | Progress report on SLA implementation | 16
DSA1.5 | Operations Cookbook | 18

Milestones
ID | Description | Delivery date
MSA1.1 | Operations Automation Strategy | 1
MSA1.2 | Operations procedures in place | 1
MSA1.3 | Activity Quality Assurance and measurement plan | 2
MSA1.4 | Security assessment plan | 2
MSA1.5 | SLA Roadmap | 3
MSA1.6 | Assessment of the status of user support | 4
MSA1.7 | Assessment of infrastructure reliability | 6
MSA1.8 | Grid Security Vulnerability and Risk Analysis | 11
MSA1.9 | Status report on Interoperations | 12
MSA1.10 | Grid Computer Security Incident handling | 16
MSA1.11 | Security Policy Integration | 20

Organization
Activity coordination: OCC plus ROC managers plus task/subtask leaders (coordination of a subtask across ROCs)
  – Phone meetings every 2 weeks, to present plans, track progress, discuss issues
  – Face-to-face meetings (3-4 per year)
Weekly Operations Meeting and Operations Workshops
  – Deal with daily operations: information sharing, raising issues

Conclusion
Moving to a fully distributed model; we have some experience with this, as SA1 is partially distributed already
Challenge to do this with less effort and in 2 years; no place for duplication or loose initiatives
Collaboration is essential; we need an agreed vision as input to EGI, and we need to work together towards this vision
Site responsibility for daily operations is the best way of saving effort and simplifying operations at all levels!
  – We need to provide the tools to facilitate this
  – We need more site involvement
  – Site and ROC/NGI partnership should be reinforced