Accounting For Multicore Jobs
John Gordon, STFC, UK
Scientific Computing Department, APEL Team
MB, 17th March 2015

Overview
– Publishing of multicore accounting
– Prototype portal view
– Encourage remaining sites to start publishing
OMB, December

Multicore
Accounting of multicore jobs requires publishing to the EMI3 APEL database. All EGI sites have now migrated to EMI3 accounting clients. There are several ways of publishing:
– The APEL client.
– ARC CE. NDGF sites publish via SGAS. SGAS recently migrated to SSM2, so ncores and ncpus are published.
– Other ARC CEs use JURA, which publishes directly to APEL from each CE, so there is no site database. Ncores and ncpus are published.
– OSG are planning the change to SSM2 and will start to publish cores at the same time. Tests have been successful and they should start publishing in production soon.
– NIKHEF, who publish from their own accounting database, migrated in December.
– CERN, who publish from their own accounting database, migrated early this year.
– Italian sites have all(?) migrated from DGAS to the standard APEL client, so they (can) now publish cores.
– Other middleware stacks such as Globus, Unicore, DTG and QCG went straight to SSM2 (not relevant to WLCG).

Apel Client
– The APEL parser gathers data on the number of cpus and cores from the batch systems, provided an option is switched on: parallel=true
– This option is off by default, so multicore sites need reminding to turn it on.
– The first priority is to get sites publishing from now on.
– If they want to backdate their publishing they will need to re-parse their batch logs. APEL will provide detailed instructions on this.
– We plan to change the default setting to parallel=true in the next release, but this will only take effect for fresh installations. An update does not overwrite the local config, and you would not want it to.
– Currently the number of cpus cannot be retrieved from (S)GE.
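
Whether a site has switched the option on can be checked mechanically. The following is a minimal sketch, assuming the option sits in an INI-style parser configuration under a section called `batch`; the section name, the function name and the sample fragments are assumptions for illustration and are not taken from the slides.

```python
import configparser


def parallel_enabled(config_text, section="batch"):
    """Return True only if `parallel` is explicitly switched on.

    The section name "batch" is an assumption; adjust it to match the
    local parser configuration.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # A missing section or option is treated like the default, which is off.
    return parser.getboolean(section, "parallel", fallback=False)


# Hypothetical configuration fragments, for illustration only.
SAMPLE_ON = "[batch]\nparallel = true\n"
SAMPLE_DEFAULT = "[batch]\nenabled = true\n"

if __name__ == "__main__":
    print(parallel_enabled(SAMPLE_ON))
    print(parallel_enabled(SAMPLE_DEFAULT))
```

Treating a missing option as "off" mirrors the point above that the setting is off by default and that an update leaves the local config untouched.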

Portal Multicore View
devel.egi.eu/show.php?query=sum_normcpu&startYear=2015&startMonth=3&endYear=2015&endMonth=3&yrange=REGION&xrange=NUMBER+PROCESSORS&groupVO=lhc&chart=GRBAR&scale=LIN&localJobs=onlygridjobs
– The development portal now has a view including ncores (Processors) and ncpus (Nodes) for those sites which publish them.
– Views include Wallclock and Wallclock*ncores.
– Efficiency is based on Wallclock*ncores.
– Once everyone is publishing, this portal view will be complete.
– Ncores=0 highlights sites that have not set parallel=true.
– This view can also display data by SubmitHost, which shows which CEs at a site are publishing.
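
An efficiency based on Wallclock*ncores can be illustrated with a short sketch. It assumes efficiency means CPU time divided by wallclock time multiplied by ncores; the function name and the example numbers are hypothetical, and the portal's actual calculation may differ in detail.

```python
def multicore_efficiency(cpu_seconds, wallclock_seconds, ncores):
    """CPU time as a fraction of the wallclock * ncores that was occupied.

    Returns None when ncores is 0 (cores not published) or when the
    wallclock is not positive, since no meaningful ratio exists then.
    """
    if ncores <= 0 or wallclock_seconds <= 0:
        return None
    return cpu_seconds / (wallclock_seconds * ncores)


if __name__ == "__main__":
    # Hypothetical 8-core job: one hour of wallclock, 23040 s of CPU time.
    print(multicore_efficiency(23040, 3600, 8))
    # The same job with ncores unpublished cannot be scored per core.
    print(multicore_efficiency(23040, 3600, 0))
```

Dividing by plain wallclock instead would give this hypothetical job an efficiency of 6.4, which is why the multicore view scales the denominator by ncores.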

Status
– Multicore reporting is available for all production WLCG sites through the production APEL repository (OSG very soon).
– It cannot be made visible in the production portal until historical data can be integrated.
– Not all sites have configured their clients to send data on the number of cores.

Normalised CPU, March 2015: 70.5% single core, 14.6% not publishing cores, 12.3% 8 cores.

– Countries publishing cores from all sites: CZ, France, NDGF, Slovakia
– Whole countries not publishing multicore: Armenia, China, Hungary, Turkey
– Countries with a minority publishing: AfricaArabia, AsiaPacific, Spain/Portugal, Romania
– There are still a lot of sites that publish cores from only a subset of their CEs. If these are the CEs with multicore queues, then we are capturing that. Do we need every CE to publish?
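
The distinction between sites publishing cores from all, some, or none of their CEs can be sketched as follows. The record layout `(site, submit_host, ncores)` and the site and host names are hypothetical; the sketch only assumes that ncores=0 marks a record published without core information.

```python
from collections import defaultdict


def classify_sites(records):
    """Label each site "all", "subset" or "none" by how many CEs publish cores.

    `records` is an iterable of (site, submit_host, ncores) tuples; this
    layout is an assumption made for the example.
    """
    hosts_by_site = defaultdict(dict)
    for site, submit_host, ncores in records:
        seen = hosts_by_site[site].get(submit_host, False)
        # A CE counts as publishing if any of its records has ncores > 0.
        hosts_by_site[site][submit_host] = seen or ncores > 0

    labels = {}
    for site, by_host in hosts_by_site.items():
        flags = list(by_host.values())
        if all(flags):
            labels[site] = "all"
        elif any(flags):
            labels[site] = "subset"
        else:
            labels[site] = "none"
    return labels


# Hypothetical records, for illustration only.
SAMPLE_RECORDS = [
    ("SITE-A", "ce01.site-a.example", 8),
    ("SITE-A", "ce02.site-a.example", 1),
    ("SITE-B", "ce01.site-b.example", 8),
    ("SITE-B", "ce02.site-b.example", 0),
    ("SITE-C", "ce01.site-c.example", 0),
]

if __name__ == "__main__":
    for site, label in sorted(classify_sites(SAMPLE_RECORDS).items()):
        print(site, label)
```

A site labelled "subset" is the case raised above: if the publishing CEs are the ones with multicore queues the multicore usage is captured, but the single-core CEs still show up as ncores=0.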

Efficiency calculated from average cores

Drill Down to a Country

HEP&query=sum_normcpu&startYear=2014&startMonth=12&endYear=2014&endMonth=12&yrange=SubmitHost&xrange=NUMBER+PROCESSORS&groupVO=all&chart=GRBAR&scale=LIN&localJobs=onlygridjobs

Name and Shame
BEgrid-ULB-VUB, BEIJING-LCG2, BUDAPEST, CA-MCGILL-CLUMEQ-T2, CBPF, CESGA, CIEMAT-LCG2, EELA-UTFSM, GR-07-UOI-HEPLAB, HEPHY-UIBK, Hephy-Vienna, HG-04-CTI-CEID, ICN-UNAM, IFCA-LCG2, IFIC-LCG2, IFISC-GRID, IN-DAE-VECC-02, INAF-TS, INDIACMS-TIFR, INFN-CATANIA, INFN-LNL-2, INFN-NAPOLI-CMS, INFN-PISA, JP-KEK-CRC-02, Kharkov-KIPT-LCG2, KR-KISTI-GSDC-01, KR-KNU-T3, LCG_KNU, LIP-Coimbra, LIP-Lisbon, NCG-INGRID-PT, NCP-LCG2, NIHAM, RO-11-NIPNE, RO-14-ITIM, RO-16-UAIC, RRC-KI, Ru-Troitsk-INR-LCG2, SFU-LCG2, T2-TH-ALICE-NSTDA, T2-TH-CUNSTDA, T2-TH-SUT, T3_HU_Debrecen, TH-NECTEC-LSR, TOKYO-LCG2, TR-03-METU, TR-10-ULAKBIM, TW-NCUHEP, UA-BITP, UB-LCG2, UKI-LT2-UCL-HEP, UKI-SOUTHGRID-BHAM-HEP, UNI-SIEGEN-HEP, UPorto, USC-LCG2, WUT

What else does WLCG Need?
– MB and VOs can drill down into countries to see which sites are not yet publishing.
– Remind all sites to change all of their CEs, not just multicore queues.
– How to extract data for WLCG Reports.