Grid Accounting Update. John Gordon (for Dave Kant), CCLRC e-Science Centre, UK. LCG Grid Deployment Board, NIKHEF, October.


1 Grid Accounting Update
John Gordon (for Dave Kant), j.c.gordon@rl.ac.uk
CCLRC e-Science Centre, UK
LCG Grid Deployment Board, NIKHEF, October 2004

2 Accounting
An accounting package for LCG has been developed by the GOC at RAL. There are two main parts:
–the accounting data-gathering infrastructure, based on R-GMA, which brings the data to a central point
–a web portal that provides on-demand reports for a variety of players.

3 Accounting Flow Diagram
[Diagram. Labels at the LCG site: data sources (batch log, GK log, messages), filter, CE, site GIIS, MON. Transport: R-GMA. Labels at the GOC site: MON, raw accounting data, data aggregation (per VO, per ROC), accounting service, on-demand reports.]

4 Accounting Information
1. Gatekeeper records contain the DN, the GramScriptJobID and the job manager type (lcgpbs, fork, lcglsf). Gatekeeper logs are used to distinguish jobs submitted through the grid (grid jobs) from jobs submitted locally on the fabric (non-grid jobs).
2. Messages logs contain the mapping between the GramScriptJobID and the LocalJobID of the batch system. Batch logs on their own do not distinguish between grid jobs and non-grid jobs.
3. Batch logs: the “E” (PBS) or “JOB_FINISH” (LSF) records contain LocalJobID, LocalUser, LocalGroup, StartTime, StopTime, ExecutingHost, CPUTime, MemoryUsage, Exit Status, …
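The way the three sources combine can be sketched in a few lines of Python. This is only an illustration, not the GOC implementation: the dictionary layout, the `JobManager` and `GridJob` keys, the function name and all sample values are assumptions made for the example. Only the identifier names DN, GramScriptJobID and LocalJobID come from the slide.

```python
def build_accounting_records(gatekeeper, messages, batch):
    """Attach the grid DN to each batch job, where a mapping exists.

    gatekeeper: dicts with DN, GramScriptJobID, JobManager (assumed layout)
    messages:   dicts with GramScriptJobID, LocalJobID
    batch:      dicts with LocalJobID plus usage fields
    """
    gk_by_gram = {g["GramScriptJobID"]: g for g in gatekeeper}
    gram_by_local = {m["LocalJobID"]: m["GramScriptJobID"] for m in messages}

    records = []
    for job in batch:
        gram_id = gram_by_local.get(job["LocalJobID"])
        gk = gk_by_gram.get(gram_id)
        record = dict(job)
        # A batch job with no gatekeeper match is treated as a non-grid job.
        record["GridJob"] = gk is not None
        record["DN"] = gk["DN"] if gk else None
        record["JobManager"] = gk["JobManager"] if gk else None
        records.append(record)
    return records


# Hypothetical sample data: one grid job and one locally submitted job.
gatekeeper = [{"DN": "/C=UK/O=eScience/CN=example user",
               "GramScriptJobID": "gram-001", "JobManager": "lcgpbs"}]
messages = [{"GramScriptJobID": "gram-001",
             "LocalJobID": "1234.ce.example.org"}]
batch = [
    {"LocalJobID": "1234.ce.example.org", "LocalUser": "atlas001", "CPUTime": 605},
    {"LocalJobID": "1235.ce.example.org", "LocalUser": "localuser", "CPUTime": 42},
]
records = build_accounting_records(gatekeeper, messages, batch)
```

The batch log drives the loop because it holds the usage figures. The messages and gatekeeper records only decide whether a DN can be attached to a job.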

5 Accounting Issues
1. The accounting suite requires the R-GMA infrastructure. Each site is required to install an R-GMA MON node, where local site accounting information is stored. Sharing a MON box between sites is not recommended, because that kind of setup is complicated.
2. The batch systems supported are PBS (lcgpbs, pbspro, vanilla PBS, openpbs, torque) and BQS. These cover over 95% of all job managers in LCG. We are working to support LSF, but have problems, mostly with the variable format of the batch records and the need to identify fields using regular expressions. A common batch log record would simplify this task.
3. We need to process batch logs, gatekeeper logs and system messages to build a full accounting record. Most sites discard messages after 9 weeks because of log rotation. Without messages, we cannot map the grid DN in the gatekeeper records to the local batch jobs.
4. The VO associated with a user’s DN is not available in the batch or gatekeeper logs. The group ID used to execute user jobs, which is available, will be assumed to be the same as the VO name. This needs to be acknowledged as an LCG requirement. We believe that use of VOMS proxies would solve this.
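As an illustration of issues 2 and 4, the sketch below parses a PBS-style “E” record with a regular expression and takes the local group as the VO name. It assumes the usual PBS accounting layout (`timestamp;type;jobid;key=value ...`). The sample line, host names and function names are made up for the example, and the parser actually used by the accounting package may differ.

```python
import re

# Header of a PBS accounting line: timestamp;record type;job id;message text
PBS_RECORD = re.compile(
    r"^(?P<timestamp>[^;]+);(?P<type>[A-Za-z]);(?P<jobid>[^;]+);(?P<fields>.*)$"
)


def hms_to_seconds(value):
    """Convert 'HH:MM:SS' to seconds; returns None if the field is missing."""
    if value is None:
        return None
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def parse_pbs_end_record(line):
    """Return a dict for an 'E' (job end) record, or None for anything else."""
    match = PBS_RECORD.match(line.strip())
    if match is None or match.group("type") != "E":
        return None
    # Simplification: assumes no field value contains whitespace.
    fields = dict(
        pair.split("=", 1) for pair in match.group("fields").split() if "=" in pair
    )
    return {
        "LocalJobID": match.group("jobid"),
        "LocalUser": fields.get("user"),
        "LocalGroup": fields.get("group"),
        # Assumption from issue 4: the group ID is the same as the VO name.
        "VO": fields.get("group"),
        "StartTime": int(fields["start"]) if "start" in fields else None,
        "StopTime": int(fields["end"]) if "end" in fields else None,
        "ExecutingHost": fields.get("exec_host"),
        "CPUTime": hms_to_seconds(fields.get("resources_used.cput")),
        "ExitStatus": int(fields["Exit_status"]) if "Exit_status" in fields else None,
    }


# Hypothetical sample record, not taken from a real site log.
sample = (
    "10/12/2004 14:13:27;E;1234.ce.example.org;user=atlas001 group=atlas "
    "jobname=STDIN queue=short start=1097589600 end=1097590407 "
    "exec_host=wn01/0 Exit_status=0 resources_used.cput=00:10:05 "
    "resources_used.mem=20480kb resources_used.walltime=00:13:27"
)
record = parse_pbs_end_record(sample)
queued = parse_pbs_end_record("10/12/2004 14:00:00;Q;1234.ce.example.org;queue=short")
```

A parser like this depends on a stable record layout, which is why variable batch record formats are a problem and why a common batch log record would help.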

6 Accounting Issues
5. The global jobID assigned by the Resource Broker is not available in the batch or gatekeeper logs, so it cannot appear in the accounting reports. The RB Events Database contains it, but that database is neither accessible nor designed to be easily processed.
6. At present the logs provide no means of distinguishing sub-clusters of a CE whose nodes differ in processing power. Changes to the information logged by the batch system will be required before such heterogeneous sites can be accounted properly. At present it is believed that all sites are homogeneous.
7. The information from the gatekeeper, messages and batch logs must be joined to build a full accounting record for grid jobs. We reported to LCG that join performance was poor. After optimisation, however, this process takes seconds (without optimisation, database joins can take hours!).
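The slides do not say which optimisation was applied to the join in issue 7. Indexing the join keys is one common way to speed up a join of this kind, and the sketch below shows it with Python's built-in `sqlite3` module. The table names, column names and sample rows are hypothetical and do not reflect the R-GMA schema.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory database with three hypothetical log tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE gatekeeper (dn TEXT, gram_script_job_id TEXT);
        CREATE TABLE messages (gram_script_job_id TEXT, local_job_id TEXT);
        CREATE TABLE batch (local_job_id TEXT, cpu_seconds INTEGER);
        -- Indexes on the join keys let each lookup avoid a full table scan.
        CREATE INDEX idx_messages_gram ON messages (gram_script_job_id);
        CREATE INDEX idx_batch_local ON batch (local_job_id);
        """
    )
    conn.execute(
        "INSERT INTO gatekeeper VALUES (?, ?)",
        ("/C=UK/O=eScience/CN=example user", "gram-001"),
    )
    conn.execute(
        "INSERT INTO messages VALUES (?, ?)", ("gram-001", "1234.ce.example.org")
    )
    conn.executemany(
        "INSERT INTO batch VALUES (?, ?)",
        [("1234.ce.example.org", 605), ("1235.ce.example.org", 42)],
    )
    return conn


def join_grid_jobs(conn):
    """Join gatekeeper -> messages -> batch; only grid jobs survive the join."""
    return conn.execute(
        """
        SELECT g.dn, b.local_job_id, b.cpu_seconds
        FROM gatekeeper AS g
        JOIN messages AS m ON m.gram_script_job_id = g.gram_script_job_id
        JOIN batch AS b ON b.local_job_id = m.local_job_id
        ORDER BY b.local_job_id
        """
    ).fetchall()


rows = join_grid_jobs(make_demo_db())
```

The locally submitted job (`1235.ce.example.org`) has no gatekeeper or messages entry, so the inner join leaves it out of the result.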

7 GOC Accounting Services
http://goc.grid-support.ac.uk/gridsite/accounting/index.html
On-demand services to the EGEE community:
–BaseCpuSeconds aggregated across EGEE: each site, per VO, per month; each region, per VO, per month
–Simple interface to customise views of the data: VO, time frame and region (default = EGEE)
–Other distributions: normalised CPU, number of jobs

8 Accounting Release Dates
1. The GOC Accounting web pages are under development. The accounting service provides BaseCpuSeconds views per site, per ROC, per VO, per month. http://goc.grid-support.ac.uk/gridsite/accounting/index.html Data are available as a CSV dump. (Demo)
2. The package was sent to the C&T team in August 2004 (Zdenek Sekera, Di Qing). We have been informed that accounting will be released with the SLC3 bundle.
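A per-site, per-VO, per-month view with a CSV dump can be sketched as follows. This is a minimal illustration with several assumptions: BaseCpuSeconds is taken to be the plain sum of job CPU seconds, a job is assigned to the month of its StopTime in UTC, and the record layout, site names and column headings are invented for the example.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime, timezone


def aggregate_cpu(records):
    """Sum CPU seconds per (site, VO, month); month is taken from StopTime (UTC)."""
    totals = defaultdict(int)
    for rec in records:
        stop = datetime.fromtimestamp(rec["StopTime"], tz=timezone.utc)
        totals[(rec["Site"], rec["VO"], stop.strftime("%Y-%m"))] += rec["CPUTime"]
    return dict(totals)


def to_csv(totals):
    """Render the aggregated totals as CSV text, sorted by site, VO and month."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Site", "VO", "Month", "BaseCpuSeconds"])
    for (site, vo, month), cpu in sorted(totals.items()):
        writer.writerow([site, vo, month, cpu])
    return buffer.getvalue()


# Hypothetical job records; the timestamps fall in October and November 2004.
jobs = [
    {"Site": "SITE-A", "VO": "atlas", "StopTime": 1097589600, "CPUTime": 600},
    {"Site": "SITE-A", "VO": "atlas", "StopTime": 1097589600, "CPUTime": 5},
    {"Site": "SITE-A", "VO": "cms", "StopTime": 1100181600, "CPUTime": 42},
]
totals = aggregate_cpu(jobs)
dump = to_csv(totals)
```

Grouping by ROC instead of site would only change the first element of the key, provided each record also carried its region.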

9 Summary
The accounting information-gathering infrastructure has been developed. It has been through the C&T cycle and should be deployed in the next release.
A web portal for displaying this information has been developed
–and will continue to be developed in the light of feedback
This is an EGEE deliverable (DSA1.3).
The display infrastructure can be deployed for other information.
–See monitoring talk

