Farm monitoring - Massimo Biasotto, LNL (LNL CMS, Padova, 24 April 2002)


1 Farm monitoring
Massimo Biasotto - LNL

2 Local Farm Monitoring
LNL experiences with local farm monitoring
July 2001, we first started with MRTG: a lot of problems
– heavy footprint on the server
– unreliable (processes hanging with unreachable hosts)
– not scalable
November 2001, remstats: improvements
– lighter and more robust than MRTG
– more flexibility in graph display and alarm management
– still scalability problems (it works in sequential mode)
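
The two problems noted above (pollers hanging on unreachable hosts, and sequential polling that does not scale) can be illustrated with a minimal Python sketch. This is not how MRTG or remstats work internally; it only shows the general idea of giving every probe a timeout and running the probes concurrently. The names `probe` and `poll_all`, the default port and the worker count are all assumptions made for the illustration.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def probe(host, port=22, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`.

    The timeout keeps one unreachable host from blocking the poller
    indefinitely. Port 22 is only an illustrative default.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def poll_all(hosts, check=probe, max_workers=32):
    """Run `check` on every host concurrently and return {host: result}.

    With a thread pool, the total time is roughly that of the slowest
    probes rather than the sum of all of them, as it would be in a
    sequential loop.
    """
    hosts = list(hosts)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs hosts with results.
        return dict(zip(hosts, pool.map(check, hosts)))
```

A call such as `poll_all(["node01", "node02"])` (hypothetical host names) would return a dictionary mapping each host to `True` or `False`.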

3 Remstats example

4 Remstats example

5 Remstats vs MRTG

6 Ganglia
March 2001, ganglia: many advantages
– much greater resolution: metrics sampled every 15 sec instead of every 5 min
– scalability: based on a distributed architecture, with data exchange via a multicast channel
– single-host metrics easily integrated to produce cumulative overview graphs
– the tool still needs customization (adding more metrics, customizing web pages, etc.)
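
As a rough illustration of two of the points above, the Python sketch below computes how many samples per hour each sampling interval yields, and sums single-host metrics into a farm-wide total of the kind shown in a cumulative overview graph. It is not Ganglia code: the function names, the metric names (`load`, `cpus`) and the host names are hypothetical.

```python
def samples_per_hour(interval_seconds):
    """Number of samples collected in one hour at a given sampling interval."""
    return 3600 // interval_seconds


def cluster_overview(per_host):
    """Sum each metric over all hosts.

    `per_host` maps a host name to a {metric_name: value} dictionary.
    The result maps each metric name to its total across the farm.
    """
    totals = {}
    for metrics in per_host.values():
        for name, value in metrics.items():
            totals[name] = totals.get(name, 0.0) + value
    return totals


# Hypothetical readings from two farm nodes.
readings = {
    "node01": {"load": 0.5, "cpus": 2},
    "node02": {"load": 1.5, "cpus": 2},
}
overview = cluster_overview(readings)
```

Here `samples_per_hour(15)` is 240 and `samples_per_hour(300)` is 12, so a 15-second interval gives 20 times the resolution of a 5-minute one.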

8 Ganglia example

9 Ganglia example

10 Ganglia example

11 Netsaint
During our survey of the existing monitoring tools, Netsaint was considered and discarded
– Main reason: it didn't monitor host performance metrics such as % CPU, load, network traffic, etc. (at least, not without heavy customization). The necessary plugins may have been added since.
– It didn't have a database to record historical data
It monitors the status of the hosts (up or down) and of some network services
It provides a log of all relevant events (hosts/services going up or down, etc.)
It probably has other features, but I've never investigated the tool deeply
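
The difference between status monitoring with an event log and performance monitoring can be sketched in Python as follows. This is not Netsaint's implementation or its log format; `state_changes` and the sample data are hypothetical. The sketch only shows how a sequence of up/down observations reduces to a log of state transitions.

```python
def state_changes(observations):
    """Reduce up/down observations to a list of state-change events.

    `observations` is an iterable of (timestamp, target, is_up) tuples in
    time order, where `target` names a host or a service. The first
    observation of a target only sets its baseline; an event is recorded
    whenever a later observation differs from the previous one.
    """
    last_state = {}
    events = []
    for timestamp, target, is_up in observations:
        if target in last_state and last_state[target] != is_up:
            events.append((timestamp, target, "UP" if is_up else "DOWN"))
        last_state[target] = is_up
    return events


# Hypothetical observations: (time in seconds, target, is_up).
history = [
    (0, "node01", True),
    (60, "node01", True),
    (120, "node01", False),
    (180, "node01", True),
]
events = state_changes(history)
```

For this sample, `events` holds one DOWN event at 120 s and one UP event at 180 s. Unchanged observations produce no log entries, which keeps the data volume far smaller than per-metric time series.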

12 Grid monitoring
Grid monitoring is different from local farm monitoring
– you cannot monitor over a WAN all the performance metrics of all the farm nodes (and you probably don't want to)
Currently, Netsaint is used on the DataGrid Testbed to monitor the status of the testbed nodes and their grid services
(infn-tb/guest)
Is this useful for CMS?
Can other useful features be added?

13 Netsaint example

14 Adapting CMS monitoring to Grid
What are the CMS requirements for Grid monitoring?
What do we want to monitor and why?
Once these questions have been addressed, we can decide whether Netsaint fulfills the requirements
Integrating Netsaint into existing CMS farms shouldn't be difficult
– the main issue is probably the setup (and maintenance) of the central repository
But it should be done only if there is a real need, not just for the sake of it

