Presentation on theme: "GridPP Monitoring & Accounting Dave Kant CCLRC, e-Science Centre."— Presentation transcript:
GridPP Monitoring & Accounting Dave Kant CCLRC, e-Science Centre
EGEE03, April Monitoring Overview` 1. Overview 2. How Many Jobs on the Grid? 3. LCG/EGEE Monitoring System 4. Putting it all together for GridPP 5. Future Plans
EGEE03, April How Many Jobs on the Grid? As a way to introduce the various tools that are in development in the LCG/EGEE Grid There are different sources for getting estimates about the number of Jobs. Information System Accounting System Resource Brokers
EGEE03, April How Many Jobs on the Grid? One source of information is the monitoring system based on R-GMA Tools which gather information and use the R-GMA backbone for data collection GIIS Monitor Apel Site Functional Tests Tools which create reports RB Logging&Bookkeeping data mining Accounting
EGEE03, April GIIS Monitor GIIS Monitor developed by GOC Taipei (Min Tsai) Tool to display and check information published by the site GIIS Sanity checks, fault detection of information system every 5 minutes Provides an instantaneous snapshot of the number of Jobs
EGEE03, April How Many Jobs on the Grid? Another source of information is the accounting, which as so many sources, is not complete, but covers most of the resources. This is not the case for GridPP resources. Accounting information is based on resource usage published by batch servers
EGEE03, April How Many Jobs on the Grid? Latest source is a data mining tool which can be used to examine RB Logging and Bookkeeping information (via R-GMA) at the user level. https://lxn1192.cern.ch:9443/~judit/job-monitor.cgi
EGEE03, April How Many Jobs on the Grid? A further source is based on the work by the EGEE QA Team They monitor several – but not all – resource brokers on LCG and create reports of their usage. JRA2/index.html JRA2/index.html Statisticts based on aggregated information Job Success and job throughput per VO and per RB Grid efficiency (Execution time vs Waiting Time)
EGEE03, April How Many Jobs on the Grid?
EGEE03, April How Many Jobs on the Grid? Job Duration showing a dominance of Dteam and LHCb jobs which are relatively short lived.
EGEE03, April Site Functional Tests Installation and configuration of a site is quite a complicated procedure. -When there is a new release, sites dont upgrade at the same time. -Some upgrades dont always go smoothly -Unexpected things happen (who turned of the power?) -Day-to-day problems; robustness of service under load? SFT framework consists of a number of tests which probe a site to determine the operational status. This includes all certified sites in EGEE/LCG infrastructure but also testing uncertified sites (for internal certification process performed by ROCs), monitoring sites that are part of gLite Pre- Production Service, and all other sites that are using LCG or gLite middleware
EGEE03, April SFT Site summaries and histories SFT used by ROCs for certification Grid–Ireland SFT SFT runs every 3 hours and writes test results to a database using R-GMA
EGEE03, April GridPP Monitoring Map Links hourly job submission test results to SFT, GSTAT, RSS Feeds and Accounting data GPPMon is a lightweight test which sends a simple job to GridPP resources every hour.
EGEE03, April Future Plans for GPPMon GPPMON - GridPP monitor to be switched off SFT2 runs every 3 hours and sites/ROCS can run these tests independently, so there is no real need for these jobs. Proposal is to link GridPP monitoring map to the monitoring data in the R-GMA and make use of changes to the grid M/W e.g. support for longitude and latitude in Glue Schema (LCG 2.6). Google Map
EGEE03, April Google Map
EGEE03, April Accounting Overview This is a summary of the status of Accounting & Reporting following its deployment in LCG2_6 1. Overview 2. APEL Design 3. Whats New? 4. LCG Accounting (OSG, NorduGrid, EGEE) 5. Issues
EGEE03, April Requirement Capture Originally a requirement of the LHC Computing Grid project. Requirements were originally captured through presentations to LCGs Grid Deployment Board Deployment Team. LHC experiments and the Tier1 centres are represented on the GDB.
EGEE03, April Requirements A historical record of grid usage to identify the use of individual sites by VOs as a function of time To demonstrate the total delivery of resources by that site to the Grid Aggregated views of the collected data grouped by: Virtual Organisation Country – a requirement of LCG which has a country-based structure EGEE Region – for use by EGEE Regional Operations Centre (ROC) A presentation front-end to the data to allow the selection on-demand of the views described above for different VOs and periods of time. To present the data as A graphical view for interpretation A tabular view for precision To support sites that already had their own methods of data collection by allowing arbitrary data collection techniques and insertion of the data in the standard schema into the central database.
EGEE03, April Requirements It was not an explicit requirement that user information be captured but we included this in the design as we were sure this would be a secondary requirement This is a reporting system, not a charging mechanism. The information is under the control of the site, so it does not meet the requirement of a charging system to be digitally signed and irrefutable. Information is gathered centrally, not under the control of the VO
EGEE03, April Design Information collected at each site from batch logs, gatekeeper logs etc Information joined at site level to select grid jobs and stored in database on R-GMA MON box at site. Information published through R-GMA and collected centrally in an R-GMA archive at GOC Web site presents various views of this data for presentation Structure of Grid taken from GOC DB – the grid configuration database. Only normalised cpu time collected
EGEE03, April Accounting Flow Diagram
EGEE03, April How APEL Works? PBS/LSF log processed daily on site CE to extract required data, filter acts as R-GMA DBProducer -> PbsRecords table Gatekeeper log processed daily on site CE to extract required data, filter acts as R-GMA DBProducer -> GkRecords table Message log processed daily on site CE to extract required data, filter acts as R-GMA DBProducer -> MessageRecords table Site GIIS interrogated daily on site CE to obtain SpecInt and SpecFloat values for CE, acts as DBProducer -> SpecRecords table, one dated record per day These three tables joined daily on MON to produce LcgRecords table. As each record is produced program acts as StreamProducer to send the entries to the LcgRecords table on the GOC site. Site now has table containing its own accounting data; GOC has aggregated table over whole of LCG. Interactive and regular reports produced by site or at GOC site as required.
EGEE03, April Join Processor Mapping grid users to the resource usage on local farms
Job Records In via RGMA RGMA MON SQL QUERY TO Accounting Server 1 Query / Hour On-Demand Accounting Pages based on SQL queries to summary data 1 Record per Grid Job (Millions of records expected) Summary data refreshed every hour (Max records about 100K per year) Home Page User queries Graphs GOC Reporting Web Pages
Accounting Home Page 107 Sites publishing data (Sep ) Over 3.3 Million Job records ~ 100K records per week (period June 1 st – mid Aug 2005) /
EGEE03, April Whats New? Added GridPP View to the reporting interface Requirements driven by GridPP –Global view of entire organisation –Tier-2 Summaries –Detailed view at Site level –CSV download of information –Toggle between Normalised / Un-normalised Datasets
EGEE03, April GridPP Input GridPP Metrics and Deployment Document (J.Coles) Metric 10:Number of sites publishing accounting data at the end of the last quarter Metric 11:KSI2K hours of CPU processing delivered (per VO) over the last quarter We are looking for meaningful plots that allow important conclusions to be drawn without misleading people Is Job Efficiency meaningful? Sites treat their data in different ways:- At Tier-1 WCT are scaled because of the scheduler At other sites, only system time is scaled What about Hyper threading? Perhaps we need to provide descriptive text against each plot to warn of such problems? Spot potential problems in resource allocation Identify trends
GridPP View Screen Shots
Atlas and LHCb dominating KSI2K delivered per Tier1/Tier2 per VO Atlas dominates in Tier1 Job Efficiency = CPUT/WCT Why is atlas EFF at 60%? Why is DZERO EFF for MANHEP > 1 ?
Tier2 View (NorthGrid)
Site View (Lancaster) Breakdown of data per Vo per month showing Njobs, CPUt, WCT, record history Total CPU Usage per VO Gantt Chart NB:Gaps across all VOs consistent with scheduled downdowns in GocDB
EGEE03, April APEL IN LCG 2.6 New version with better documentation APEL supports PBS and LSF Consists of a number of components Core module contains functionality common to all components Plugin components provide log parsing functionality for PBS and LSF job managers.
EGEE03, April Monitoring Apel SFT Checks the Accounting Service Apel is not considered a critical service Two Step process Is archiver listening ? o GOC Flexible archiver service listens for accounting producers rgma -c "select count(*) from LcgRecords" If tests fail Archiver Down warning Has site published accounting data recently? o How many records published by site in the last day? rgma -c "select count(*) from LcgRecords where ExecutingCE=' ' and MeasurementDate>=' ' If tests fail Site Apel Down warning
EGEE03, April Issues 4. Which Log Files Should Site Administrators Backup? To build accounting records, we need to process data from THREE log file sources. This is a mandatory requirement in order to reconstruct what has been done during the 2004 period. /var/log/globus-gatekeeper* o Match between grid-user dn to GramScriptJobId /var/spool/pbs/server_priv/accounting/* o Local jobID and details of resources consumed o No distinction between grid jobs and non-grid jobs. /var/log/messages* o Map GramScriptJobID to local JobID This is how we separate grid jobs from local user jobs which run on the local fabric. If the site has deleted its messages files, we may be able to work around this by matching local unix groups in the batch logs. Accounting records formed in this way will not contain the dn of the grid-user.
EGEE03, April Issues 5. siteName Changes Recent problem with presenting data from the French ROC where CCIN2P3 was renamed to IN2P3-CC via GOCDB portal All records associated with the site are updated in order for SQL queries to match the new siteName. 6. Namespace Convention? Naming scheme to identify data belonging to large sites which provide services for different communities etc. NIKHEF: lcgprod.nikhef.nl, lcg2prod.nikhef.nl, edgapptb.nikhef.nl *SiteName* is a bad choice because we get multiple hits o *IC-LCG2* gives multiple matches PIC-LCG2 and IFIC-LCG2 Request sites stick to the convention *.SiteName o h1.desy.de, zeus.desy.de
EGEE03, April Issues 7. Normalisation We want to perform a reasonably sensible first order estimate to account for the differences in worker node performance. Homogeneous vs Heterogeneous PBS Job Records dont have any information about the worker node benchmarks, so we must insert one manually PBS Farms setup in different ways; can lead to an error in the normalisation calculation (Blindman vs internal normalisation) Histories - What SpecInts do we use in order to process archived Job Records? LSF Job Records have a CPU_FACTOR (1 - 4) in the Job Record. o What does a value of 1 correspond to? o Different calibration value at each site o Conversion table? o Can the site publish a weighted specInt2000 for the farm?
EGEE03, April Issues 8. Service Reliability & Hardening If flexible archiver is down, sites unable to publish data to GOC Update: /43 Apel core checks if flexible archiver service is available before attempting to publish data. GOC publishes a test record every 5 minutes to check the service is alive: automatic service recovery mechanism now in place Investigate running multiple flexible archiver services 1 per GOC or 1 per ROC? At the moment, the archiver service listens for all producers rather than producers belonging to a ROC. Single point of failure if registry is down? Multiple registry replicas supported in the RC1 (gLite) release? Update: Multiple registries supported in LCG2_4_0 ?
EGEE03, April Future Plans 1. Interoperate 2. CERN Courier / LCG News 3. Wiki Pages
EGEE03, April APEL and gLite Is APEL integrated in g-Lite? Work currently in progress. We have ported the APEL code into the gLite CVS repository but need to understand functional differences e.g. WMS and use of Condor What about its development plan? Future unclear given presence of DGAS in gLite Areas of possible development: Condor (easy or complicated) Reporting Tool (GridICE will most likely provide this)
EGEE03, April LCG Accounting Project involves combining results from all three infrastructures and presenting an aggregated view Peer Infrastructures in LCG Open Science Grid (Ruth Pordes, Philippe Canal, Matteo Melani) Nordugrid (Per Oster) EGEE Currently, LHCView filters LHC VO data from EGEE accounting data.
EGEE03, April Requirements Combine results from all three infrastructures … Ideally: Distributed queries to multiple databases Each peer manages an accounting database LHC VO filtering provided through a web services interface Initial Implementation: Centralised Collection Peers publish data into a global database WebServices or direct MySql inserts Common Problem: Different Grid infrastructures may use different Schemas. GGF define a schema, but quite flexible. May need translators to convert from one schema to another. (already exist)