Update of SAM Implementation ALICE TF Meeting 18/10/07.



Major Upgrades
New alarm system
  – SAM
  – ML
Larger verbosity in the SAM messages in case of errors
Review of the code
Review of the sites
New implementation in ML

New Alarm System
SAM implementation
  – The SMS notification procedure has been eliminated
  – For the notification, all support lists provided by the GOC DB have been included
      All sites now receive notifications in case of errors
  – The message in the e-mails has to be improved by including the link to the specific test
      SAM developers are working on it (end of October)
  – The list of sites in scheduled downtime will also be provided to avoid failure notifications
      Ready today although not yet certified
MonALISA implementation
  – See Costin's comments
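The notification rule described on this slide (notify the GOC DB support lists of failing sites, but skip sites in scheduled downtime) can be sketched as follows. This is only an illustration under stated assumptions: the function name, the data shapes, and the site and test names are hypothetical and do not reflect the actual SAM code or the GOC DB interface.

```python
def select_notifications(failed_tests, support_lists, sites_in_downtime):
    """Return (site, recipients, failed_test_names) for each site to notify.

    failed_tests      -- iterable of (site, test_name) pairs (hypothetical shape)
    support_lists     -- dict mapping site -> list of support addresses,
                         assumed to have been taken from the GOC DB
    sites_in_downtime -- set of sites currently in scheduled downtime
    """
    grouped = {}
    for site, test_name in failed_tests:
        # Sites in scheduled downtime are expected to fail: no notification.
        if site in sites_in_downtime:
            continue
        grouped.setdefault(site, []).append(test_name)

    notifications = []
    for site, tests in grouped.items():
        recipients = support_lists.get(site, [])
        # A site without a registered support list cannot be notified.
        if recipients:
            notifications.append((site, recipients, tests))
    return notifications


if __name__ == "__main__":
    # All names below are placeholders, not real sites or addresses.
    failed = [("SITE-A", "proxy-registration"), ("SITE-B", "software-area")]
    lists = {"SITE-A": ["support@site-a.example"],
             "SITE-B": ["support@site-b.example"]}
    for site, recipients, tests in select_notifications(failed, lists, {"SITE-B"}):
        print(site, recipients, tests)
```

Grouping the failures per site means each support list would receive a single message listing all failing tests, which is one possible design choice rather than something stated on the slide.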

Upgrades in the test suite
The error messages provided by the SAM interface are now more detailed
  – They also include the command that fails, so that it can be cut and pasted for testing purposes
      The User Proxy Registration not working. Failed to execute vobox-proxy --vo alice --force. The User is not allowed to register his proxy within the VOBOX
Upgrade of the WMS test created by Stefano Bagnasco
  – Checks the status of the submitted job agents
      Information system
      Too large a number of aborted, scheduled and waiting jobs
      The test fails if the information is not available in a file: $HOME/.alien/SAMTestCache
        Most probably the query to the IS fails
Update of the software area test
  – Touch and rm of a file in that area
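Two of the checks above are simple enough to sketch: the software area test (touch and rm of a file) and the check that the cache file exists. The sketch below is a rough illustration, not the actual SAM test code. The function names and the probe file name are assumptions, and only the cache path $HOME/.alien/SAMTestCache comes from the slide.

```python
import os
import tempfile
import uuid


def software_area_writable(area):
    """Create ("touch") and then delete ("rm") a scratch file in `area`.

    Returns True only if both operations succeed.
    """
    # Hypothetical probe name; a random suffix avoids clashing with real files.
    probe = os.path.join(area, ".sam_probe_" + uuid.uuid4().hex)
    try:
        with open(probe, "w"):
            pass               # equivalent of `touch`
        os.remove(probe)       # equivalent of `rm`
    except OSError:
        return False
    return True


def cache_file_present(path=None):
    """Check that the cache file read by the WMS test exists.

    Only the existence of the file is checked; its format is not
    described on the slide, so the content is not parsed here.
    """
    if path is None:
        path = os.path.expanduser("~/.alien/SAMTestCache")
    return os.path.isfile(path)


if __name__ == "__main__":
    # A temporary directory stands in for the real software area.
    with tempfile.TemporaryDirectory() as area:
        print("software area writable:", software_area_writable(area))
    print("cache file present:", cache_file_present())
```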

Review of the sites
Several sites were not published in SAM
  – The VOBOXES were not registered in the GOC DB or the monitoring flag was set to “NO”
  – This has been changed by all sites, but it still seems not to be enough

ML interface

Test history

Last test result from SAM site

General site availability

Alerts published as RSS feed, and toolbar notifications