Brief Overview of GridICE and the Ticketing System
Giuseppe Misurelli
giuseppe.misurelli <at> cnaf.infn.it
LCG/gLite User Tutorial Bologna, 02 March 2006
Goals
- Get an idea of how important the Grid monitoring activity is
- Use the GridICE monitoring tool to understand your Grid composition in terms of resources
- Get to know the User and Virtual Organization support infrastructure adopted in EGEE
LCG/gLite User Tutorial Bologna, 03 March 2006
Monitoring a Grid
Grid monitoring should provide knowledge of the type, state and features of the resources constituting the Grid by means of:
- Grid Resources Inventory
- Grid Resources Behavior
- Grid Resources Availability
Grid Resources Inventory
Instantaneous picture of the Grid resources, to get an idea of how they are shared among sites
- Number of Computing Elements (CE), Worker Nodes (WN) and Storage Elements (SE)
- Number of jobs running and waiting in the whole Grid, for a particular Virtual Organization (VO)
Grid Resources Behavior
Measuring a set of evolving data to investigate historical/statistical aspects of a Grid
- Percentage of jobs aborted at a site for a particular VO in a certain period of time
- Duration of a fault condition for a particular grid service
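The first metric above can be sketched as a simple ratio over job records. This is only an illustration: the `JobRecord` fields, the `"Aborted"`/`"Done"` status labels and the site names are assumptions made for the example, not the GridICE data model.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass
class JobRecord:
    """Hypothetical job record; field names are assumed for illustration."""
    site: str
    vo: str
    finished_at: datetime
    status: str  # assumed labels: "Done" or "Aborted"


def aborted_percentage(jobs: Iterable[JobRecord], site: str, vo: str,
                       start: datetime, end: datetime) -> Optional[float]:
    """Percentage of jobs aborted at `site` for `vo` with start <= finished_at < end.

    Returns None when no job matches, so "no data" is not confused with 0%.
    """
    selected = [j for j in jobs
                if j.site == site and j.vo == vo and start <= j.finished_at < end]
    if not selected:
        return None
    aborted = sum(1 for j in selected if j.status == "Aborted")
    return 100.0 * aborted / len(selected)


# Made-up sample data: four matching jobs in the window, one of them aborted.
window_start = datetime(2006, 3, 1)
window_end = datetime(2006, 3, 2)
sample_jobs = [
    JobRecord("site-a", "gilda", datetime(2006, 3, 1, 9), "Done"),
    JobRecord("site-a", "gilda", datetime(2006, 3, 1, 10), "Aborted"),
    JobRecord("site-a", "gilda", datetime(2006, 3, 1, 11), "Done"),
    JobRecord("site-a", "gilda", datetime(2006, 3, 1, 12), "Done"),
    JobRecord("site-a", "gilda", datetime(2006, 3, 3, 9), "Aborted"),   # outside window
    JobRecord("site-b", "gilda", datetime(2006, 3, 1, 9), "Aborted"),   # other site
    JobRecord("site-a", "other", datetime(2006, 3, 1, 9), "Aborted"),   # other VO
]

rate = aborted_percentage(sample_jobs, "site-a", "gilda", window_start, window_end)
```

Returning `None` for an empty selection is a design choice of this sketch; a monitoring tool may prefer to report such a period differently.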
Grid Resources Availability
Evaluating the accessibility of the main Grid services at Regional, Site and VO level to improve Grid usage
- Grid services currently down (e.g. CE, WN, SE)
- Current job load at a given site
- Current Min/Max free slots where you can submit jobs
The GridICE tool: Brief Introduction
GridICE:
- Is a distributed monitoring tool for grid systems
- Integrates with local monitoring systems
- Is fully integrated in the LCG-2 middleware (monitoring for free)
- Offers a web interface for publishing monitoring data at the Grid level
Monitoring the GILDA Grid
GridICE server installed and configured to highlight the monitoring of the GILDA Grid for dissemination activities (http://alifarm7.ct.infn.it:50080/gridice)
GridICE Utilization /1
General picture of your Grid
- How many sites compose the Grid and where they are located
- How many resources are available (e.g. CPU#, WN)
GridICE Utilization /2
Resources available for a selected VO (e.g. gilda)
- Computing Elements where users can submit jobs
- Storage Elements where users can store/retrieve data
User Support in EGEE
Help request
EGEE Support Infrastructure
Wherever they are located and whatever the problem is, users expect a set of services from a support infrastructure:
- Few access points for support
- A portal with well-structured sources of information and documentation
- Correct, complete and responsive support
- Assistance during production use of the Grid
- Integrated interface with local support systems
Users: Site admin, VO application, VO manager
- They are different but with a lot of overlap
The GGUS Solution
The Global Grid User Support (GGUS) infrastructure attempts to meet most of the users’ current support expectations:
- It is able to process up to 200 requests per day
- Regional support with central coordination
Users can submit a support request to:
- Central GGUS helpdesk
- Regional Operation Center (ROC) helpdesk
Central GGUS Helpdesk
Acts as a portal for all users who don’t know where to send their requests
All requests can be entered directly in the GGUS system via:
- Ticketing System (web form submission at www.ggus.org)
- Mailing lists:
  <your_vo_name>-user-support@ggus.org
  helpdesk@ggus.org
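As a small illustration of the `<your_vo_name>-user-support@ggus.org` pattern above, the sketch below builds the address for a given VO name. The helper name, the lowercasing and the character check are assumptions of this example; whether a list actually exists for a given VO is not verified here.

```python
import re

GGUS_DOMAIN = "ggus.org"              # domain taken from the slide
GENERIC_HELPDESK = "helpdesk@ggus.org"  # generic address taken from the slide


def vo_support_address(vo_name: str) -> str:
    """Build '<your_vo_name>-user-support@ggus.org' for a VO name.

    The normalisation (strip + lowercase) and the allowed characters are
    assumptions made for this sketch, not rules published by GGUS.
    """
    vo = vo_name.strip().lower()
    if not re.fullmatch(r"[a-z0-9][a-z0-9._-]*", vo):
        raise ValueError(f"unexpected VO name: {vo_name!r}")
    return f"{vo}-user-support@{GGUS_DOMAIN}"


# "gilda" is used only because it is the example VO named in this tutorial.
address = vo_support_address("gilda")
```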
Support Workflow (steps)
- Every request generates a ticket
- First-line problem analysis by:
  - ROC experts called generic Ticket Processing Managers (TPM)
  - VO experts called VO Ticket Processing Managers (TPM)
- TPM experts provide a solution or escalate the ticket to a more specialized support unit (e.g. network, middleware)
- Tickets are followed to make sure that users receive an adequate answer:
  - Possibility to examine the evolving state
  - Notification of solved tickets (closed tickets)
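The steps above can be read as a small state machine. The sketch below is one possible model under stated assumptions: the state names (`new`, `in_tpm_analysis`, `escalated`, `solved`) and the `Ticket` class are invented for illustration and are not the status values or API of the GGUS system.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Allowed transitions, following the slide: a ticket is created, analysed by
# a TPM, then either solved directly or escalated to a specialised unit.
# State names are illustrative assumptions.
ALLOWED = {
    "new": {"in_tpm_analysis"},
    "in_tpm_analysis": {"solved", "escalated"},
    "escalated": {"solved"},
    "solved": set(),
}


@dataclass
class Ticket:
    description: str
    state: str = "new"
    support_unit: str = "TPM"
    history: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Record the initial state so users can examine the evolving state.
        self.history.append(self.state)

    def move_to(self, new_state: str, support_unit: Optional[str] = None) -> None:
        """Apply a transition, rejecting any that the workflow does not allow."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"cannot go from {self.state} to {new_state}")
        self.state = new_state
        if support_unit is not None:
            self.support_unit = support_unit
        self.history.append(new_state)


# Example: a ticket the TPM cannot solve is escalated to a middleware unit.
ticket = Ticket("Job submission fails")
ticket.move_to("in_tpm_analysis")
ticket.move_to("escalated", support_unit="middleware")
ticket.move_to("solved")
```

Keeping the full `history` list mirrors the slide's point that users can follow a ticket's evolving state; a notification step on reaching `solved` could be added at the end of `move_to`.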
Support Workflow (picture)
For VO users and VO-specific problems
[Diagram labels: Help request; Solves, Classifies, Monitors; Ticket Creation; TPM; Grid+VO experts; VO-specific; Central; Application (GGUS); VO Support Units; Middleware Support Units; Deployment; Operations Support; ROC; Network]
The Italian ROC helpdesk
- Acts as a portal for all users who don’t know where to send their requests
- Fast helpdesk for problems related to the Italian Grid
- Integrated interface with GGUS
- Access allowed to registered members at https://grid-it.cnaf.infn.it/checklist
  - VO application users: create tickets describing problems or suggestions
  - Supporters: fix problems related to local Grid sites or redirect them to the central GGUS helpdesk
  - Site Admins: support for a given regional center or GGUS ticket
References
GridICE
- Dissemination web site: http://grid.infn.it/gridice
- GridICE server for LCG-Grid: http://gridice2.cnaf.infn.it:50080/gridice
- GridICE server for INFNGrid: http://gridice4.cnaf.infn.it:50080/gridice
EGEE Ticketing System
- GGUS web portal: www.ggus.org
- Italian ROC web portal: https://grid-it.cnaf.infn.it/checklist