Deployment Summary GridPP11 Jeremy Coles 15th September 2004.



Overview
- What is deployment all about anyway?
- Who is doing it?
- Planning and metrics
- Issue 1: Communications
- Issue 2: Fabric management
- Where are we now?

Are the developers bailing out? Who is flying the plane? We have paying passengers – do we know where we are going? … oh, and can we keep it working, navigate, land and offer a real service?

Who is flying the plane?

Number  Position                     Status
General
1       Production manager           In place and fully engaged
1       Applications expert          Identified but not formally engaged
2       Tier-1/deployment expert     In place and fully engaged
4       Tier-2 coordinators          In place and fully engaged
0.5     VO management                Will be part time but not yet in place
9.0     Hardware support             Post allocated but not yet filled
Specialist
1       Data and storage management  Existing expert
1       Workload management          Existing expert
1       Security officer             Not yet recruited
1       Networking                   Starting in September

… introducing the err…. DTEAM + site system administrators + …

Deployment Board
Replaces GridPP1 Technical Board
Mandate
- Determine and oversee execution of tech plan
- Report to PMB
- Ensure GridPP-wide issues discussed/solved
- Provide forum for tech info exchange
- Oversee deployment and use of GridPP h/w
- Tier1 – Tier2 coordination/liaison
- Ensure integration of external tech developments

DB members
- Production Manager
- Tier1/A Manager
- 4 T2 Technical Coordinators
- HEP SYSMAN chair
- CERN T0/Deployment
- Applications Area Coordinator
- Middleware Area Coordinator
- Technical experts (invited by DB chair)
- UK NGS
- EGEE/Ireland
- DB chair
~18 people

DB relations (diagram; node labels only): PMB, DB, UB, T1AB, T2B, M/S/N, APPS, LCG/EGEE/CERN T0, UK NGS, GridPP DTEAM

What must deployment address?
- Core infrastructure services
  - Resource brokers
  - Information services
  - Data management services
  - Virtual Organisation management
  - Replica Location Service
  - BDII
- Grid monitoring
  - Monitor operational performance
  - Monitor operational state
  - Problem resolution + operations support tools
- Middleware deployment
  - Required local validation of common middleware
  - Feedback issues to LCG/EGEE
  - Continuous upgrade mechanism(s)
- Resource induction
  - New site joining procedures
  - Provide support for middleware installation
  - Advise on operational procedures
- Resource support
  - Respond to and coordinate resolution of fabric problems
  - Engage wider community to resolve new problems
- User support
  - Provide a support service for users (filter and distribute)
  - Monitor effectiveness of support
  - Provide training and induction courses
  - Documentation (and quality)

Areas (2)
- Communication
  - Representation within experiments
  - Procedures and mechanisms within community
- Applications
  - Ensuring local VOs receive support and guidance
  - Participate in testing and validation exercises
- Components
  - Workload management
  - Data management
  - Storage management
  - Information services
- Network services
  - Network performance monitoring
  - Demand (aggregate traffic) vs supply (performance)
  - Resource allocation/reservation
- Inter-grid collaboration
  - Participate in discussions to work closer with other Grids
  - Ensure interoperability of infrastructure and services
- Service-level agreements
  - Monitor Tier-2 compliance with MoUs
  - Access policies
- Security
  - Certification authority
  - Implement and monitor policy
  - Incident response
  - Policy management
- Operations planning
  - Understand usage patterns
  - Capacity planning
  - Monitoring problems log

Navigation
What does the planning environment look like so far?
- No clear plans within LCG for overall deployment – improving
- Some confusion about EGEE connections
- GridPP2 project plan is not complete and we have dependencies
- Currently developing in a “best guess” environment
- It is not always clear exactly where decisions get made
There are already pressing issues to be addressed:
- What is the UK stance regarding fabric management tools (LCFGng is being phased out)?
- How are we going to measure deployment and operations success – metrics?
- What is the communications plan, given that LCG-ROLLOUT has become a gossip column – support, news, problem reporting?

Are we communicating…?
Areas
- Grid news – no well defined broadcast route – e.g. middleware updates
- Site news – operational incidents on Grid, site updates
- Support – user, deployment
- Problems – as found by daily tests or discovered by users
Issues
- LCG-ROLLOUT is overloaded!
- Lack of visibility about what is happening at sites – upgrade, site problem
- Problems may generate many queries
- No tracking for support or logging of queries … and therefore poor ability to search for other experiences
Options
1) Set up a new news area based on RSS (new entries are placed in categories that people can register to receive updates from) – just use of GOC pages?
2) Establish support desk for GridPP – but there are concerns about expertise
3) DTEAM area & better documentation
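
Option 1 (an RSS news area where entries are placed in categories that people register for) could be sketched roughly as below. This is only an illustration, not a description of any GridPP or GOC service: the category names, the entries, the feed title and the example.org link are all hypothetical, chosen to mirror the four areas listed on the slide.

```python
import xml.etree.ElementTree as ET


def build_feed(title, link, entries):
    """Build a minimal RSS 2.0 document; each entry carries one category."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = "Categorised news entries"
    for entry in entries:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = entry["title"]
        ET.SubElement(item, "category").text = entry["category"]
    return ET.tostring(rss, encoding="unicode")


def items_for(feed_xml, subscribed):
    """Return the titles of items whose category the reader registered for."""
    root = ET.fromstring(feed_xml)
    return [
        item.findtext("title")
        for item in root.iter("item")
        if item.findtext("category") in subscribed
    ]


# Hypothetical entries, one per area named on the slide.
ENTRIES = [
    {"title": "Middleware update announced", "category": "grid-news"},
    {"title": "Site X upgrade in progress", "category": "site-news"},
    {"title": "Helpdesk rota changed", "category": "support"},
    {"title": "Daily test failure at site Y", "category": "problems"},
]

# The link is a placeholder, not a real news page.
FEED = build_feed("Deployment news (example)", "http://example.org/news", ENTRIES)

# A site administrator who registered only for site news and problems.
ADMIN_VIEW = items_for(FEED, {"site-news", "problems"})
```

The point of the category element is that one feed can serve all four areas while each reader sees only what they registered for, which is the visibility that a single overloaded mailing list does not give.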

An example [LCG-Problems] mail list has 2 members!

Are we going up or down? Metrics Work in progress!

Metrics (2) Work in progress!

Maintenance
Migration to SL3 is starting.
- Next public release of LCG supports SL3 WNs, certification complete.
  - Service nodes remain at RH7.3 for now.
  - LCFGng is not an option for SL3 nodes.
  - LCG supports one install method for SL3.
- Manual install technique (actually not very manual)
- Can be built into any framework already in use
  - Kickstart and scripts, Cfengine, NPACI Rocks, Quattor, stateless Linux or even LCFG
- This release is expected this month.

Quattor
- Community effort for Quattor installation of LCG2 nearing completion. 98% done.
- Quattor has similar architecture and concept to LCFG. LCFG effort not wasted.
Advantages
- CERN and the RAL Tier1/A will use Quattor for LCG.
  - Support and self help for others available.
- LCG M/W will not be tied to or released with Quattor.
Disadvantages
- A lot to learn before any payback.

Steve’s 5 questions
1. Once SL3 port is available is RH 7.3 still wanted anywhere?
2. Is an OS other than SL3 needed for GridPP sites and users?
3. Does any site have a conflict with proposed deployment of LCG into SL3?
4. Is there a site to work with RAL learning Quattor?
5. Should the UK use or at least favour one fabric management solution?
Responses, in the order they appear on the slide:
- Yes – probably Quattor
- Maybe on very few shared sites
- Need to ask experiments – perhaps if CERN upgrades soon
- No – most want to move off of RH 7.3
- Manchester?

Summary
- Smooth running. Easy and seamless deployments. Service quality
- The DTEAM!
- The plans (& metrics) are being developed – many dependencies
- LCFG will be phased out. Quattor on SLC3 is coming.
- LCG2 deployed CPUs
- LCG-ROLLOUT needs to migrate to news & helpdesk services