Accounting Update Dave Kant Grid Deployment Board Nov 2007.



Overview
  User Level Accounting
  VOMS Groups/Roles
  APEL Status
  Tier2 Accounting and Reporting
  Issues
  Suggestions

User Level Accounting
  User Level Accounting Delivered
    – UserDN captured from CE log files (grid-jobmap logs)
    – APEL uses the data to build accounting records
    – Data published to the GOC with on-the-fly encryption using the APEL public key (1024-bit RSA)
    – At the GOC, data are extracted from RGMA and stored in a Central Accounting Repository
    – Data decrypted using the APEL private key
      User Level summary table created
      On-the-fly encryption using the EGEE Portal certificate
    – Encrypted table pushed to the CESGA portal
    – Portal decrypts the data and provides SSL-based access to the summaries

User Level Accounting
  Testing
    – We encouraged sites to publish UserDN.
    – Sites had to manually configure APEL to perform on-the-fly encryption.
    – Our 2007 sample contains 1067 distinct UserDNs from 33 sites.
    – No problems were seen decrypting UserDNs at the GOC.
  Observations
    – NIKHEF is publishing its own encrypted UserDN strings
      Example LCGUserID: HPfh56sbc3AYKDn1Yusxgg
      Can only attribute usage to the VO

VOMS Groups and Roles
  UserFQAN
    – Capture UserFQAN from the grid-jobmap log on the CE
    – FQAN chain processed at the GOC to derive Group and Role from the primary part of the chain.
    – If UserFQAN is present, we can use the Group to derive the VO of the user-submitted job (otherwise we use the local unix group).
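The derivation of Group and Role from the primary FQAN can be sketched as below. This is a minimal illustration, not APEL's actual code: the function names, the ";" separator between FQANs in a chain, and the use of the first path component of the group as the VO are assumptions made for the example. The only format relied on is the usual VOMS form /vo[/subgroup...]/Role=<role>/Capability=<cap>.

```python
def parse_primary_fqan(fqan_chain):
    """Return (group, role) taken from the first FQAN in the chain.

    Assumption: FQANs in the chain are separated by ';'.
    A missing Role is reported as "NULL", as in the sample table.
    """
    primary = fqan_chain.split(";")[0].strip()
    group_parts = []
    role = "NULL"
    for part in primary.split("/"):
        if not part:
            continue  # skip the empty piece before the leading '/'
        if part.startswith("Role="):
            role = part[len("Role="):]
        elif part.startswith("Capability="):
            continue  # capability is not used for accounting here
        else:
            group_parts.append(part)
    return "/" + "/".join(group_parts), role


def vo_for_job(fqan_chain, unix_group):
    """Pick the VO from the FQAN group if present, else fall back.

    Assumption: the fallback simply returns the local unix group; a real
    system would still need to map that group to a VO name.
    """
    if fqan_chain:
        group, _role = parse_primary_fqan(fqan_chain)
        return group.split("/")[1]
    return unix_group
```

For instance, a chain whose primary FQAN is /atlas/soft-valid/Role=production/Capability=NULL would yield the group /atlas/soft-valid, the role production, and the VO atlas.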

UserFQAN Testing
    – Our 2007 sample shows the following Groups and Roles for ATLAS:

      | PrimaryGroup      | PrimaryRole     |
      | /atlas            | Role=lcgadmin   |
      | /atlas            | Role=NULL       |
      | /atlas            | Role=production |
      | /atlas/ca         | Role=NULL       |
      | /atlas/lcg1       | Role=NULL       |
      | /atlas/nl         | Role=NULL       |
      | /atlas/soft-valid | Role=NULL       |
      | /atlas/soft-valid | Role=production |
      | /atlas/usatlas    | Role=production |

    – Matrix looks reasonable.

APEL Status
  Production
    – New release of APEL: UPDATE 35 for gLite 3.0.
    – Main features are:
      UserDN and UserFQAN support
      Joins match the SpecInt2000 of the ClusterID to the ResourceIdentity of the CE.
      Additional accounting table to provide a high-level checksum view of the site's accounting database.
        – This is used by the GOC to verify whether the site has published all of its accounting data for SAM.

APEL Status
  Development
    – Bug fixes and improvements
    – Critical bug fixes
      Support for multiple SpecInt2000 per CE (Savannah Bug # 28593)
        – Tested at CERN and IN2P3
        – Impact:
          » Can lead to wrongly assigned SpecInt2000 in accounting
          » Sites with multiple CEs that re-evaluate their SI2K numbers
      Log rotation of grid-jobmap logs based on UTC (Bug # 28592)
        – Tested at CESGA and CERN
        – Impact:
          » Can lead to missing accounting data

APEL Status
  Development
    – Enhancements
      Bug # – Improved LSF Log Parser (Tested at CERN)
      Bug # – Identification of Gatekeeper logs
      Bug # – Joining data between Condor and grid-jobmap logs
      Bug # – Gap Publisher
    – These bug fixes have been committed to CVS, but:
      We have some issues to address concerning ETICS and SLC4
        – Bug #
        – Location of the log4j, bouncy-castle and mysql-connector-java libraries in APEL's build.xml files and the startup scripts.
      Consequently, we have not yet produced a new release tag or patch.

APEL Future Work
  New things in the pipeline (not implemented)
    – Accounting Local Usage
      What information do we want?
        – Attribute usage to the VO
        – Do we care about the local user's identity? Or normalised usage?
      For a high-level anonymous summary at VO level
        – Bug #
        – Evaluate a summary of all batch log data that did not get included in a grid join.
        – Attributes usage to the VO, but not at the user level
        – Summaries are evaluated on-the-fly every time the publisher executes, after the grid join process.
      Issues
        – Normalised Usage
          » Matching the local job to the SpecInt2000 may be problematic as there is no information about the CE in the batch log (?)
          » This is not a problem for sites that run the CE and batch server on the same node.
        – Mapping the local unix group to a known VO
          » Mapping table at the GOC … is there a better method available?
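The proposed anonymous VO-level summary of local usage can be sketched as below. This is only an illustration of the idea on the slide, not an APEL implementation: the record fields, the function name, and the contents of the group-to-VO mapping table are all hypothetical, and usage is summed without SpecInt2000 normalisation because of the matching issue raised above.

```python
from collections import defaultdict

# Hypothetical mapping table from local unix group to VO
# (the slide suggests such a table could live at the GOC).
GROUP_TO_VO = {"atlasusr": "atlas", "cmsusr": "cms"}


def summarise_local_usage(batch_records, grid_joined_job_ids,
                          group_to_vo=GROUP_TO_VO):
    """Summarise batch jobs that were NOT matched in the grid join.

    Usage is attributed to the VO only; no user identity is kept.
    Groups missing from the mapping table are reported as "unknown".
    """
    summary = defaultdict(
        lambda: {"jobs": 0, "cpu_seconds": 0, "wall_seconds": 0}
    )
    for rec in batch_records:
        if rec["job_id"] in grid_joined_job_ids:
            continue  # already accounted for as a grid job
        vo = group_to_vo.get(rec["unix_group"], "unknown")
        entry = summary[vo]
        entry["jobs"] += 1
        entry["cpu_seconds"] += rec["cpu_seconds"]
        entry["wall_seconds"] += rec["wall_seconds"]
    return dict(summary)
```

Running this after each grid join, as the slide proposes, would mean recomputing the whole summary on every publisher run; that keeps the logic simple at the cost of repeated work.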

APEL Future Work
    – Support for MPI Jobs
      Bug #
        – IN2P3 have sent some Torque/PBS logs for MPI jobs and verified that APEL does not support MPI.
          » The total CPU time is correct
          » The derived wall time is underestimated because PBS publishes this per CPU.
          » Efficiencies > 100%: APEL needs to take into account the number of CPUs to get the total WCT
      What information do we want to capture from an MPI job?
        – CPU and wall time
          » Only need to determine the number of CPUs to fix the WCT
        – Categorise Grid Jobs
          » Requires modifications to the accounting record in order to distinguish them from POGJ.
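The wall-time correction described above amounts to multiplying the per-CPU wall time by the number of CPUs. A minimal sketch, with hypothetical function names and made-up numbers chosen only to show the effect:

```python
def total_wall_seconds(per_cpu_wall_seconds, num_cpus):
    """Total wall clock time (WCT) for a job that ran on num_cpus CPUs."""
    return per_cpu_wall_seconds * num_cpus


def efficiency(cpu_seconds, wall_seconds):
    """CPU efficiency as the ratio of CPU time to wall clock time."""
    return cpu_seconds / wall_seconds


# Illustrative values only: a 4-CPU MPI job.
cpu_seconds = 7200            # total CPU time, summed over all CPUs
per_cpu_wall_seconds = 2000   # wall time as reported per CPU
num_cpus = 4

# Using the per-CPU wall time directly gives an efficiency above 100%.
naive = efficiency(cpu_seconds, per_cpu_wall_seconds)

# Scaling the wall time by the CPU count brings it back below 100%.
corrected = efficiency(
    cpu_seconds, total_wall_seconds(per_cpu_wall_seconds, num_cpus)
)
```

With these numbers the uncorrected efficiency is 360% and the corrected one is 90%, which matches the symptom reported from the IN2P3 logs.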

Resource Trees / MOU Pledges
  Tier2
    – Sue Soffano Spreadsheet
      Tree representing the Tier2 structure has been delivered.
      Report showing the “Tier2 MOU SI2K Pledge” against the actual usage delivered.

Issues
  User Level Accounting
    – On-the-fly encryption would be better controlled by YAIM (Bug # 31015)
      Specify options to publish UserDN in the site-info.def file.
      Not implemented.
    – Use of a service certificate for encryption.
      The APEL client uses an RSA public key, but not a certificate.
    – The UserDN decryption chain used in production should be implemented by CESGA for the PPS service.
      Work has started.
    – What happens if the user changes their UserDN?
      How does the user access their data if they no longer have the old certificate?
      Do we need a mechanism to track the UserDN history?
      Case study: changed institutes and the CA issued a new certificate when the old one expired.
        /c=uk/o=escience/ou=clrc/l=ral/cn=dave kant
        /c=uk/o=escience/ou=queenmarylondon/l=physics/cn=dave kant

Issues
  Tier2 MOU Pledges
    – We need to make sure that we can distinguish between Tier2s that pledge against all LHC VOs and those that pledge against a specified VO.
      Particularly important as some sites appear in multiple Tier2s because they pledge on a VO-by-VO basis.
      Assume that if the VO is not in the Tier2 name, then the pledge represents a total for the entire LHC.
    – Are there any MOU pledges for storage?
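The naming assumption above (no VO in the Tier2 name means the pledge covers all LHC VOs) can be expressed as a simple rule. The sketch below is hypothetical: the function name, the substring match, and the example Tier2 names are invented for illustration and do not come from the pledge spreadsheet.

```python
LHC_VOS = ("alice", "atlas", "cms", "lhcb")


def pledge_scope(tier2_name):
    """Return the VO a Tier2 pledge applies to, or "all-lhc".

    Assumption: a VO-specific Tier2 carries the VO name somewhere in
    its own name (case-insensitive substring match).
    """
    lowered = tier2_name.lower()
    for vo in LHC_VOS:
        if vo in lowered:
            return vo
    return "all-lhc"
```

A made-up name such as "XX-Example-ATLAS-T2" would be treated as an ATLAS-only pledge, while "XX-Example-T2" would be treated as a total across the LHC VOs. A plain substring match is fragile, so an explicit per-Tier2 flag in the spreadsheet would be a more robust alternative.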

Suggestions
  More Spreadsheets?
    – Can we have a Tier1 spreadsheet?
    – What about VO-specific spreadsheets? Clouds-of-Atlas?

Farewell
  Leaving the project at the end of the week.
  Thank you for all your help and support.
  Goodbye and Good Luck!