Grid Operations
John Gordon, CCLRC RAL
LCG Grid Deployment Board, FNAL, 9th October 2003



Outline
  Recent Progress
  Future work

Progress to Date
  Website
  Monitoring Activities
  Reporting
  Accounting

Website
  Main structure is in place
  Pages fully operational on
    – participating institutions
    – contact information
    – monitoring
  Marker pages for SLAs, News, Security and Meetings
  Uses GridSite for updating

Monitoring Activities
  Installed a variety of monitoring tools to gain experience of them on a Production Grid
    – Gppmon
    – MapCenter
    – GridICE
    – CE_Mon
    – RB_Mon
    – MonALISA

Gppmon
  Submits jobs every hour via globus and the CERN RB
  Coloured dots on map on GOC web
  Static list of sites
    – but easy to update; currently fully up to date
  Most useful at this stage for a quick check of the status of CE and RB
  Needs history
    – available in a later version but not yet implemented
  How to check all RBs?
    – Segmented dots? One map per RB?
    – Fewer sites/RB?
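The hourly check can be pictured as a small classification step: each site yields two results per round (direct globus submission, and submission through the RB), and the pair decides the colour of the dot on the map. The sketch below is illustrative only. The colour names, the `SiteResult` type and the rule itself are assumptions rather than Gppmon's actual logic, and the `History` class stands in for the history feature that the slide says is not yet implemented.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class SiteResult:
    """Outcome of one hourly test round for one site (hypothetical type)."""
    site: str
    globus_ok: bool  # direct globus job submission succeeded
    rb_ok: bool      # submission through the Resource Broker succeeded


def dot_colour(result: SiteResult) -> str:
    """Map a pair of test outcomes to a dot colour (assumed colour scheme)."""
    if result.globus_ok and result.rb_ok:
        return "green"
    if result.globus_ok or result.rb_ok:
        return "amber"
    return "red"


class History:
    """Keeps every round per site so that past status can be reviewed."""

    def __init__(self):
        self._rounds = defaultdict(list)

    def record(self, when: datetime, result: SiteResult) -> None:
        self._rounds[result.site].append((when, dot_colour(result)))

    def colours(self, site: str) -> list:
        # .get avoids creating an empty entry for an unknown site
        return [colour for _, colour in self._rounds.get(site, [])]
```

Checking several RBs would mean one `rb_ok` flag per RB, which is the "segmented dots or one map per RB" question raised on the slide.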

[Figure slide: GPPmon]

MapCenter
  Checks IP/UDP ports, no sensors
    – Set up with help from Franck Bonnassieux
  Static version running, breaks occasionally
  Difficult to update
    – tricky format, needs root
  Dynamic version added to website
    – but shows only services in MDS
    – These are MDSs, BDIIs, CEs and SEs
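"No sensors" means the service is probed from outside, with nothing installed on the monitored host. A minimal sketch of that idea follows, using a plain TCP connect as the probe. This is an assumption chosen for simplicity and not MapCenter's implementation; the service names and the `services` mapping are hypothetical.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_services(services: dict) -> dict:
    """Probe each named (host, port) pair and report reachability."""
    return {
        name: port_is_open(host, port)
        for name, (host, port) in services.items()
    }
```

A probe like this only shows that something is listening on the port; it says nothing about whether the service behind it works, which is why authenticated tests are also needed.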

[Figure slides: LCG Static MapCenter; LCG MapCenter (two slides)]

GridICE
  Running at CERN
  History of jobs run is useful
  Accurately shows gppmon jobs running every hour in dteam
  Shows several hundred Alice, Atlas, CMS and LHCb jobs submitted at end Sep in two batches
  Pattern in all 4 is the same, so presumably a test
  Mainly shown waiting
  No obvious real use of LCG1 observed yet

[Figure slides: GridICE (three slides)]

CE_Mon
  Attempts authentication at every CE every 10 mins (globusrun -authenticate-only)
  Permits reliability and availability to be calculated from the user perspective
  Intended to investigate suitability as an SLA test
  Now believed reliable enough to begin to extract availability and reliability figures
  Needs web output developing

RB_Mon
  Attempts job-list-match against every RB every 10 mins
  Permits reliability and availability to be calculated from the user perspective
  Intended to investigate suitability as an SLA test
  Not yet quite reliable enough to begin to extract availability and reliability figures
  Needs web output developing
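Both probes produce one pass/fail result per service every 10 minutes, so availability and reliability reduce to simple ratios over those results. The slides do not define the two terms. The sketch below assumes that availability is the fraction of all probes that succeeded, and that reliability is the same fraction with probes taken during announced downtime left out. The `Probe` type and its `scheduled_downtime` flag are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Probe:
    """Result of one 10-minute test against one CE or RB (hypothetical type)."""
    ok: bool
    scheduled_downtime: bool = False  # assumed flag: site had announced an outage


def availability(probes):
    """Fraction of all probes that succeeded; None if there are no probes."""
    if not probes:
        return None
    return sum(p.ok for p in probes) / len(probes)


def reliability(probes):
    """Fraction of successful probes, ignoring those in scheduled downtime."""
    counted = [p for p in probes if not p.scheduled_downtime]
    if not counted:
        return None
    return sum(p.ok for p in counted) / len(counted)
```

With six probes in an hour, of which four pass, one fails, and one fails during an announced outage, availability is 4/6 and reliability is 4/5.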

Monitoring Summary
  No single tool to do everything
  Probably need several tools for different circumstances
  Need to evaluate MonALISA
  Would like to add EDG WP7 tools
    – To non EDG sites
    – Requires R-GMA

[Figure slide: EDG-network monitoring]

EDG-WP7 Transition
  [Diagram: network and file transfer metrics, installed by EDG WP7, shown in three stages (Current, Phase 1, Phase 2). Labelled components: EDG Site, EDG/LCG Site and LCG Site; NM; EDG and LCG CE/SE running edg-ftlog2rgma; EDG MON and LCG MON; EDG and LCG Registry + Schema; EDG and LCG Archiver.]

Reporting
  RAL using the tools to monitor LCG1
  Summaries of gppmon, CE_Mon and RB_Mon sent to LCG-Rollout list twice a week
  So far have helped to diagnose several problems
    – need to set GLOBUS_TCP_PORT_RANGE env variable for globus submits
    – communication problems to Hungary
    – CE queue and site name inconsistencies
    – requirements for firewall to permit access to certain ports
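The first and last problems in that list are related: the submitting host has to restrict Globus to a known port range, and the firewall has to allow that same range. A minimal sketch of preparing such an environment for a submission process is below. The `globus_env` helper is hypothetical, the port numbers in the usage note are examples only, and the `min,max` value format is an assumption to check against the Globus documentation for the installed version.

```python
import os


def globus_env(port_min: int, port_max: int, base_env=None) -> dict:
    """Return a copy of the environment with GLOBUS_TCP_PORT_RANGE set.

    The "min,max" value format is assumed; verify it for your Globus version.
    """
    if not (0 < port_min <= port_max <= 65535):
        raise ValueError("port range must satisfy 0 < min <= max <= 65535")
    env = dict(os.environ if base_env is None else base_env)
    env["GLOBUS_TCP_PORT_RANGE"] = f"{port_min},{port_max}"
    return env
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when launching a submission command, for example `globus_env(20000, 25000)`, with the site firewall opened for the same range.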

Accounting
  Batch systems already accumulating batch records and/or process accounts in their local formats
  Define a schema for interchange of accounting data
  Develop two filters to convert from local accounts to schema (e.g. PBS and LSF)
  Pull data to a central repository (or two)
  Store in an accounting DB
  Display front-ends already exist
    – Release 1: information for VO
    – Release 2: information per user
  Planning and evaluation phase
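The filter step can be sketched as follows: read a local batch record, convert it to a common record, then aggregate per VO as Release 1 would. The interchange schema had not been defined at the time of this talk, so the `UsageRecord` fields are hypothetical. The sketch also assumes that the PBS group name identifies the VO and that job-end records follow the usual PBS accounting layout (`date;E;jobid;key=value ...`). An LSF filter would be a second converter producing the same record type.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    """One job in a hypothetical interchange schema."""
    site: str
    vo: str
    user: str
    job_id: str
    cpu_seconds: int
    wall_seconds: int


def _hms_to_seconds(text: str) -> int:
    """Convert an HH:MM:SS duration to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def pbs_to_record(line: str, site: str):
    """Convert one PBS accounting line; return None unless it is a job-end record."""
    fields = line.rstrip("\n").split(";", 3)
    if len(fields) != 4 or fields[1] != "E":
        return None
    _, _, job_id, message = fields
    attrs = dict(item.split("=", 1) for item in message.split() if "=" in item)
    return UsageRecord(
        site=site,
        vo=attrs.get("group", "unknown"),  # assumption: group name == VO
        user=attrs.get("user", "unknown"),
        job_id=job_id,
        cpu_seconds=_hms_to_seconds(attrs.get("resources_used.cput", "00:00:00")),
        wall_seconds=_hms_to_seconds(attrs.get("resources_used.walltime", "00:00:00")),
    )


def cpu_by_vo(records) -> dict:
    """Release 1 style summary: total CPU seconds per VO."""
    totals = defaultdict(int)
    for record in records:
        totals[record.vo] += record.cpu_seconds
    return dict(totals)
```

Grouping by `(vo, user)` instead of `vo` would give the per-user view planned for Release 2.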

SLAs
  Many aspects to an SLA
    – Schedule
    – Availability
    – Reliability
    – Performance
    – Throughput
  Tests already running for CE and RB
  Need script to extract reliability and availability
    – next are MDS servers
  Need discussion on performance and throughput indicators
  Work on agreed definition of SLA template

Security
  Policy drafting for GDB (with Security Group) complete
  Some GOC-related procedures remain to be drafted:
    – Procedures for Resource Administrators
    – Procedures for Site Self-Audit
    – Rules for Service Level Agreement

Local Ops and Admin
  Group to be set up (in November?) to discuss GOC operational procedures
  Draft ToR with GOC Steering Group

User Support Liaison
  Met with the GUS from Karlsruhe
  Agreed to use single Remedy at Karlsruhe
    – For GUS and GOC
    – Interchange schema later

GOC Rollout
  Plan called for second GOC soon
    – At level of a few staff
  Are we ready for this?
    – cf. EGEE with multiple ROCs
    – More staff and more duties
  Agreed there should be combined GUS/GOC if possible
    – What is the procedure to decide who?

GOC Steering Group
  Defined but has not yet met
    – Trevor Daniels, Cristina Vistoli, Markus Schulz
    – Rolf Rumler, Claude Wang, Eric Yen
    – Ian Fisk, Bruce Gibbard, John Gordon
  First phone conference 16th October
  Address Priorities
    – Accounting
    – Gap Analysis of Monitoring
    – Wider Operations Group? Forum for sysadmins?
    – Performance indicators for SLA

Future Work
  Web
  Monitoring

Web
  Integrate GOC with LCG web
  Educate people how to update their information
    – Demo of GridSite

Accounting
  Planning and evaluation phase
  Probably two months' work
    – Manual prototypes before then
    – Release 1: information for VO
    – Release 2: information per user

Monitoring
  Wider use of monitoring
  Leading to gap analysis
  And possible development
  Extend network monitoring from EDG WP7

Summary
  A lot of work has gone into a variety of GOC tools and infrastructure
  Now need to
    – engage the wider community
    – commission required developments