Job monitoring and accounting data visualization


1 Job monitoring and accounting data visualization
Enrico Fattibene, INFN-CNAF, enrico.fattibene<at>cnaf.infn.it
Scuola per utenti INFN della Grid (School for INFN Grid users), Bologna, 28 November 2007

2 Outline
- Grid Monitoring
  - GridICE
    - Overview and architecture
    - Job activity analysis
- Grid Accounting
  - HLRmon
    - Personal Grid usage analysis

3 Grid resource awareness
A large Grid system must provide its users with precise and reliable information about the status and usage of the available resources.
The efficient distribution of this information enables VOs to:
- Optimize their utilization strategies
  - How are CPUs distributed among sites?
  - What is the status of the Grid sites?
- Complete the planned computations
  - How long do my jobs take to run at a site?
  - What is the CPU/wall time of my jobs?
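The "CPU/wall time" question above is usually answered with a simple ratio. The following is a minimal sketch, not taken from the presentation: the function name and the sample job records are hypothetical, and the ratio is assumed to be CPU seconds divided by wall-clock seconds.

```python
def cpu_efficiency(cpu_time_s: float, wall_time_s: float) -> float:
    """Return CPU time divided by wall-clock time, or 0.0 if no wall time was recorded."""
    if wall_time_s <= 0:
        return 0.0
    return cpu_time_s / wall_time_s


# Hypothetical job records: (site, CPU seconds, wall-clock seconds).
jobs = [
    ("site-a", 3400.0, 3600.0),
    ("site-b", 900.0, 3600.0),
]

for site, cpu_s, wall_s in jobs:
    print(f"{site}: CPU/wall efficiency {cpu_efficiency(cpu_s, wall_s):.2f}")
```

A low ratio at one site suggests that jobs spend most of their time waiting (for example on I/O) rather than computing there.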

4 What helps?
Grid Monitoring tools help in the detection of:
- Faulty situations
- Status of available resources
- VO activity
- Usage of the available resources

5 GridICE: overview
- Distributed monitoring tool for Grid systems
  - Started in late 2002 (EU-DataTAG project)
  - Is evolving in the context of EU-EGEE and many other EU Grid projects
  - 100% open source
- Fully integrated with the gLite Middleware
  - Metering and publishing of data can be configured via gLite standard installation mechanisms
- Installed servers are monitoring Grid resources in the scope of: EGEE, EGEE-SWE, RDIG, EGEE-SEE, Grid.it, GILDA, CMS, ATLAS, EUMedGrid, EUChinaGrid, EUIndiaGrid, BalticGrid, INFNGrid, EELA, BeGrid

6 GridICE: overview
- Based on the gLite Information System
  - Periodic discovery of new GRISes (once a day)
  - Periodic queries to the discovered GRISes (at minute-scale intervals)
- Standard GRISes (CE, SE, Site BDII)
  - Information published on the Top BDII
- Extended GRIS (a GridICE-specific service)
  - Host information (daemon monitoring)
  - Job monitoring
  - Summary info for computing resources from the LRMS
- Information is collected in a central DB on a server and shown in a Web interface
- Very useful help pages
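The two periodic activities on this slide (daily discovery, more frequent queries) can be sketched as a small scheduling check. This is an illustrative sketch only, not GridICE code: the function name is hypothetical, and the query interval used in the usage lines is a placeholder because the transcript does not give the actual value.

```python
DISCOVERY_INTERVAL_S = 24 * 60 * 60  # once a day, as stated on the slide


def due_actions(now_s, last_discovery_s, last_query_s, query_interval_s,
                discovery_interval_s=DISCOVERY_INTERVAL_S):
    """Return the actions whose interval has elapsed at time now_s (seconds)."""
    actions = []
    if now_s - last_discovery_s >= discovery_interval_s:
        actions.append("discover")  # look for new GRISes
    if now_s - last_query_s >= query_interval_s:
        actions.append("query")     # poll the already discovered GRISes
    return actions


# Placeholder value, NOT from the presentation.
ASSUMED_QUERY_INTERVAL_S = 5 * 60

print(due_actions(now_s=600, last_discovery_s=0, last_query_s=0,
                  query_interval_s=ASSUMED_QUERY_INTERVAL_S))
print(due_actions(now_s=86400, last_discovery_s=0, last_query_s=86340,
                  query_interval_s=ASSUMED_QUERY_INTERVAL_S))
```

Keeping the two intervals separate reflects the slide: discovery is rare because the set of GRISes changes slowly, while the published data changes quickly.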

7 GridICE: architecture

8 GridICE: added value
- The information provided includes:
  - Grid summary info
  - Computing/Storage resources
  - VO activity
  - Job submission
- The information is accessible from:
  - Web: context drill-down navigation
  - XML documents: data exchange with other applications
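As an illustration of how another application might consume monitoring data exchanged as XML documents, here is a minimal sketch. The XML layout, element names, and attribute names are entirely hypothetical; the presentation does not describe the real GridICE XML schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical document layout, invented for this example only.
SAMPLE_XML = """
<sites>
  <site name="site-a">
    <vo name="vo1" running="12" waiting="3"/>
    <vo name="vo2" running="5" waiting="1"/>
  </site>
  <site name="site-b">
    <vo name="vo1" running="4" waiting="0"/>
  </site>
</sites>
""".strip()


def running_jobs_per_vo(xml_text):
    """Sum the 'running' attribute over all sites, grouped by VO name."""
    root = ET.fromstring(xml_text)
    totals = {}
    for site in root.findall("site"):
        for vo in site.findall("vo"):
            name = vo.get("name")
            totals[name] = totals.get(name, 0) + int(vo.get("running"))
    return totals


print(running_jobs_per_vo(SAMPLE_XML))
```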

9 Monitoring for different users
- Users are required to have a valid certificate issued by a CA
- Users are identified through the digital certificate installed in their browser (the DN is retrieved)
- The secure https protocol is used on the server side
- Two ways to access data:
  - "Standard users": only information about the user's own jobs is provided
  - "High-level users" can ask to be registered on the GridICE web site with a specific role:
    - VO manager
    - Site manager
    - ROC (Regional Operation Centre) manager

10 Grid summary info
- Geographical composition
  - Geo view showing where sites are located, with the actual job load
- Resources availability
  - Site view to get downtime info
  - Site view to spot possible problems on:
    - Grid Information Service
    - Grid services running on host machines
- Resources inventory
  - VO view where computing and storage resources are aggregated per VO

11 End-user activity /1
Job section to track VO users' activity in order to:
- Search among a huge number of jobs
- Inspect jobs' resource consumption
- Personal jobs info (next release)

12 End-user activity /2

13 End-user activity /3

14 Ongoing and future work
- Data quality
  - A recent data quality analysis (performed in the production environment) confirmed the expected level of correctness
- Data access
  - Cooperation with external applications
    - Integration of GridICE data into the Experiment dashboard
    - Batch system data generated by the job monitoring sensors
- Web redesign
  - The next user experience will gain from Web 2.0 principles, guidelines and technologies

15 Accounting
There is a need to know who used the resources and how many resources have been used:
- ROC managers: how are the Grid resources used, and by whom?
- Site managers: who used my resources?
- VO managers: how many resources did my VO use?
- Users: how many resources did I use?
A good accounting system should provide answers to these questions while taking care of all the security and privacy issues.

16 HLRmon: overview
- Signed access
  - Authorization/authentication by the user's digital certificate
- Grid role based
  - Information is provided according to the scope of the role
- 4 different roles
  - ROC manager, VO manager, Site admin, VO user
- Local aggregation
  - Daily aggregated activity is stored locally
- Graphical or textual
  - Job CPU/WallTime usage per Site and VO

17 HLRmon: architecture

18 HLRmon: data presentation
- Report data aggregated per site/VO/day
  - Charts created according to user needs
  - Possibility to enlarge graphs (next release)
  - Interactive table that can be exported in Excel format
- Information about Grid utilization
  - A user can see the jobs submitted by users within his visibility scope
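The per site/VO/day aggregation mentioned above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not HLRmon's implementation: the record fields (`site`, `vo`, `day`, `cpu_s`, `wall_s`) and the sample values are hypothetical.

```python
from collections import defaultdict

# Hypothetical per-job accounting records.
records = [
    {"site": "site-a", "vo": "vo1", "day": "2007-11-27", "cpu_s": 1200, "wall_s": 1500},
    {"site": "site-a", "vo": "vo1", "day": "2007-11-27", "cpu_s": 300, "wall_s": 600},
    {"site": "site-b", "vo": "vo1", "day": "2007-11-27", "cpu_s": 50, "wall_s": 100},
]


def aggregate(job_records):
    """Group jobs by (site, VO, day) and sum job count, CPU time and wall time."""
    totals = defaultdict(lambda: {"jobs": 0, "cpu_s": 0, "wall_s": 0})
    for rec in job_records:
        key = (rec["site"], rec["vo"], rec["day"])
        totals[key]["jobs"] += 1
        totals[key]["cpu_s"] += rec["cpu_s"]
        totals[key]["wall_s"] += rec["wall_s"]
    return dict(totals)


for key, row in sorted(aggregate(records).items()):
    print(key, row)
```

Each aggregated row corresponds to one line of a table or one point in a chart such as CPUTime/Day or JobsNum/Site.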

19 4 different roles
- ROC Manager
  - Report on all sites and VOs that used the Grid
  - Information on resource usage by Grid users
- Site Manager
  - Report on all VOs that used the site
  - Information on all users' jobs
- VO Manager
  - Report on usage at all the sites accepting the VO
  - Information on Grid usage by VO members
- VO User
  - Report on his own resource usage
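The visibility scope of the four roles above can be expressed as a filter over accounting records. This is a minimal sketch under assumptions: the role identifiers, the record fields, and the sample DNs are hypothetical and are not taken from HLRmon.

```python
def visible_records(records, role, user_dn=None, site=None, vo=None):
    """Return only the records that fall within the scope of the given role."""
    if role == "roc_manager":
        return list(records)  # all sites and all VOs
    if role == "site_manager":
        return [r for r in records if r["site"] == site]
    if role == "vo_manager":
        return [r for r in records if r["vo"] == vo]
    if role == "vo_user":
        return [r for r in records if r["user_dn"] == user_dn]
    raise ValueError(f"unknown role: {role}")


# Hypothetical records; the DNs are made up for the example.
sample = [
    {"site": "site-a", "vo": "vo1", "user_dn": "/C=IT/O=Example/CN=Alice"},
    {"site": "site-a", "vo": "vo2", "user_dn": "/C=IT/O=Example/CN=Bob"},
    {"site": "site-b", "vo": "vo1", "user_dn": "/C=IT/O=Example/CN=Bob"},
]

print(len(visible_records(sample, "roc_manager")))
print(len(visible_records(sample, "site_manager", site="site-a")))
print(len(visible_records(sample, "vo_user", user_dn="/C=IT/O=Example/CN=Alice")))
```

Raising an error for an unknown role, rather than returning everything, keeps the default on the side of privacy.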

20 VO-manager viewpoint
Report on aggregated job activity in different formats:
- Graphs
  - JobsNum/Site
  - CPUTime/Day
  - WallTime/Day
  - CPUTime/Site
  - WallTime/Site
- Table
  - End-user jobs detailed info

21 End-user viewpoint
Report on personal job activity in different formats and aggregations:
- Graphs
  - JobsNum/Site
  - CPUTime/Site
  - WallTime/Site
  - CPUTime/Day
  - WallTime/Day
  - JobsNum/VO
- Table
  - Number of jobs, CPUTime and WallTime per Site and VO

22 Conclusions
- GridICE and HLRmon can show you the information you are interested in
- With authentication based on the personal certificate, data privacy is always guaranteed
- The GridICE and HLRmon Web presentations can be accessed by VO end users to obtain information about Grid resource usage and availability

23 References
- GridICE dissemination Web Site: http://grid.infn.it/gridice
- GridICE server for Italian Grid
- HLRmon for Italian Grid
- W3C Standards - evergreen hint

24 Disclaimer This presentation is based on materials provided and authorized by the EGEE project and is freely available to download and use according to the terms of the following license:

