Alerts and Monitors w/ SNMP


1 Alerts and Monitors w/ SNMP
Symantec OpsCenter Alerts and Monitors w/ SNMP

2 Covered
OpsCenter Monitoring
SNMP
Veritas Operations Manager Monitors
Other monitors: Command Central ???

3 Why Monitor?
Notifications of “critical” failures
Aggregation of data (graphs, performance)
Causal analysis (what failed when/where)
Historical performance
Recurring failures (e.g. “X on a Tuesday does this”)

4 What Should I Monitor?
K.I.S.S. (to start)
Depends on agency requirements
Critical services at the very least
More monitoring means more load
What constitutes a failure (or need for notification)?
Some things are department-driven
Director of Y needs to know “blah happened”.

5 SNMP v. Email
Can use both
SNMP software can generate alerts, tickets, etc.
SNMP software gathers all data in one place for performance analysis
SNMP software can separate out into support groups
Aggregation of data, alerts, and failures.

6 Which SNMP Software to Use
Try before buying (all have online trials)
Know your learning curve
Nagios (Nagmin, Icinga)
Orion (SolarWinds)
Net-SNMP (backend)
OpenNMS

7 OpsCenter

8 OpsCenter

9 OpsCenter

10 OpsCenter

11 OpsCenter

12 OpsCenter
Job
High Job Failure Rate: An alert is generated when the job failure rate exceeds the specified rate.
Hung Job: An alert is generated when a job for a selected policy/client hangs for a specified period.
Job Finalized: An alert is generated when a job of a specified type for the specified policy/client ends in the specified status.
Incomplete Job: An alert is generated when a job of a specified type for the specified policy/client moves to an incomplete state.
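The High Job Failure Rate condition above can be sketched as a simple threshold check. This is an illustration only, not how OpsCenter computes it: the `Job` record, the function names, the clients `mfp02` and `hsappserver3`, and the choice to treat any non-zero exit status as a failure are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Job:
    policy: str
    client: str
    exit_status: int  # assumption: 0 means success, anything else is a failure


def failure_rate(jobs):
    """Return the fraction of jobs with a non-zero exit status (0.0 if no jobs)."""
    if not jobs:
        return 0.0
    failed = sum(1 for job in jobs if job.exit_status != 0)
    return failed / len(jobs)


def should_alert(jobs, threshold):
    """True when the failure rate is strictly greater than the threshold."""
    return failure_rate(jobs) > threshold


# Two failures taken from the traps in this deck, plus two hypothetical successes.
jobs = [
    Job("Solaris_OS-Deduped", "hsappserver2", 58),
    Job("Windows_OS-File-Users-Deduped", "mfp01", 40),
    Job("Windows_OS-File-Users-Deduped", "mfp02", 0),   # hypothetical client
    Job("Solaris_OS-Deduped", "hsappserver3", 0),       # hypothetical client
]
print(should_alert(jobs, threshold=0.25))
```

A site that counts partially successful jobs as acceptable would need a different failure test than `exit_status != 0`.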

13 OpsCenter High Job Failure Rate

14 OpsCenter

15 OpsCenter Hung Job

16 OpsCenter Job Finalized

17 OpsCenter
Media
Frozen Media: An alert is generated when any of the selected media is frozen.
Suspended Media: An alert is generated when any of the selected media is suspended.
Exceeded Max Media Mounts: An alert is generated when a media exceeds the threshold number of mounts.
Media Required for Restore: An alert is generated when a restore operation cannot run because the required media is unavailable.
Low Available Media: An alert is generated when the number of available media falls below the predefined threshold value.
High Suspended Media: An alert is generated when the number of suspended media exceeds the predefined threshold value.
High Frozen Media: An alert is generated when the number of frozen media exceeds the predefined threshold value.

18 OpsCenter
Catalog
Catalog Space Low: An alert is generated when the space available for catalogs falls below the threshold value.
Catalog Not Backed Up: An alert is generated when no catalog backup takes place for a predefined time period.
Catalog Backup Disabled: An alert is generated when the catalog backup is disabled.
Tape Mount Request: An alert is generated when a media mount request is pending.
No Cleaning Tape: An alert is generated when no cleaning tapes are left.
Zero Cleaning Left: An alert is generated if a cleaning tape has zero cleanings left.

19 OpsCenter
Disk
Disk Pool Full: An alert is generated when one or more disk pools are full.
Disk Volume Down: An alert is generated when the selected disk volume(s) is down.
Low Disk Volume Capacity: An alert is generated when a disk volume capacity is running low.
Drive is Down: An alert is generated when a drive in a specified robot/media server in the selected view goes down.
High Down Drives: An alert is generated when the number of down drives exceeds the predefined threshold value.

20 OpsCenter
Host
Agent Server Communication Break: An alert is generated if communication between agent and server is broken.
Master Server Unreachable: An alert is generated when OpsCenter loses contact with the master server.
Lost Contact with Media Server: An alert is generated when OpsCenter loses contact with the media server.
Service Stopped: An alert is generated when the selected services stop on any of the servers in the selected view.
Symantec ThreatCon: An alert is generated when the ThreatCon level is equal to or above the threshold value.
Job Policy Change: An alert is generated when one or more job policies change.

21 OpsCenter

22

23 OpenNMS
Little configuration
Fast learning curve
XML based (needs mib2opennms converter)

24 OpenNMS

25 OpenNMS

26 OpenNMS

27 OpenNMS
Received unformatted enterprise event (enterprise: generic:6 specific:3). 12 args:
="nmsserver03"
="2457 Active Job Completed with Exit Status 58"
="Alert Raised on: May 6, :38 PM .Job: Tree Type : Server .Tree Name : ALL MASTER SERVERS .Nodes : hsmasterserver .Job Policy: Solaris_OS-Deduped .Exit Status: 58 (can't connect to client) .Client: hsappserver2 .New State: Done .Alert Policy: Job Failed .OpsCenter Server: vsf02 ."
="Job Failed"
=""
=""
="vsf02"
=""
=""
=""
="Warning"
="Sun May 06 18:38:04 MST 2012"

28 OpenNMS
Need MIB
/opt/SYMCOpsCenterServer/config/snmp/VERITAS-REG.mib
/opt/SYMCOpsCenterServer/config/snmp/VERITAS-TC.mib
/opt/SYMCOpsCenterServer/config/snmp/VRTS-cc.mib
/opt/SYMCOpsCenterServer/config/snmp/cc_trapd.conf

29 OpenNMS
/opt/SYMCOpsCenterServer/config/snmp $ head /opt/SYMCOpsCenterServer/config/snmp/VRTS-cc.mib
--
-- defines VERITAS-COMMAND-CENTRAL-MIB MIB
-- Copyright (C) by VERITAS SOFTWARE Corporation.
-- All rights reserved.
VERITAS-COMMAND-CENTRAL-MIB DEFINITIONS ::= BEGIN

30 OpenNMS Command Central? I thought we were using OpsCenter

31 OpenNMS
/opt/SYMCOpsCenterServer/config/snmp $ head /opt/SYMCOpsCenterServer/config/snmp/VRTS-cc.mib
--
-- defines VERITAS-COMMAND-CENTRAL-MIB MIB
-- Copyright (C) by VERITAS SOFTWARE Corporation.
-- All rights reserved.
VERITAS-COMMAND-CENTRAL-MIB DEFINITIONS ::= BEGIN

32 OpenNMS
/opt/SYMCOpsCenterServer/config/snmp $ head /opt/SYMCOpsCenterServer/config/snmp/VRTS-cc.mib
--
-- defines VERITAS-OPSCENTER-MIB MIB
-- Copyright (C) by VERITAS SOFTWARE Corporation.
-- All rights reserved.
VERITAS-COMMAND-CENTRAL-MIB DEFINITIONS ::= BEGIN

33 OpenNMS Much Better!

34 OpenNMS
-- Trap Variables
ccTrapVarsGroup OBJECT-GROUP
    OBJECTS { alertRecipients, alertSummary, alertDescription, policyName,
              objectType, collectorName, ccHost, sourceId, ccObject,
              sampleData, ccAlertSeverity, ccAlertTime }
    STATUS current
    DESCRIPTION "Group for CC Trap VarBinds"
    ::= { ccTrapDefinitionsBranch 100 }

ccTrapVarsBranch OBJECT-IDENTITY
    STATUS current
    DESCRIPTION "Branch of the CC MIB for VarBind Definitions"
    ::= { ccTrapDefinitionsBranch 1 }
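To check which varbinds a trap should carry before converting the MIB, the OBJECTS list of a group can be pulled out with a small regular expression. This is a rough sketch rather than a real MIB parser: the function name `group_objects` is made up for the example, and the pattern assumes the OBJECTS clause directly follows the OBJECT-GROUP keyword, as it does in the excerpt above.

```python
import re

# Excerpt of the OBJECT-GROUP definition shown on the slide.
MIB_TEXT = """
ccTrapVarsGroup OBJECT-GROUP
    OBJECTS { alertRecipients, alertSummary, alertDescription, policyName,
              objectType, collectorName, ccHost, sourceId, ccObject,
              sampleData, ccAlertSeverity, ccAlertTime }
    STATUS current
    DESCRIPTION "Group for CC Trap VarBinds"
    ::= { ccTrapDefinitionsBranch 100 }
"""


def group_objects(mib_text, group_name):
    """Return the object names listed in the OBJECTS clause of one OBJECT-GROUP."""
    pattern = re.compile(
        re.escape(group_name) + r"\s+OBJECT-GROUP\s+OBJECTS\s*\{([^}]*)\}"
    )
    match = pattern.search(mib_text)
    if match is None:
        return []
    return [name.strip() for name in match.group(1).split(",") if name.strip()]


names = group_objects(MIB_TEXT, "ccTrapVarsGroup")
print(len(names), names[0], names[-1])
```

The group lists twelve objects, which lines up with the "12 args" reported in the unformatted enterprise event earlier in the deck.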

35 OpenNMS
/opt/opennms/etc/events
Command Central: /opt/opennms/etc/events/VRTS-cc.events.xml
OpsCenter: /opt/opennms/etc/events/VRTSopsctr.events.xml
VOM: /opt/opennms/etc/events/VRTSsfm.events.xml

36 OpenNMS
/opt/opennms/etc/events $ diff /opt/opennms/etc/events/VRTS-cc.events.xml /opt/opennms/etc/events/VRTSopsctr.events.xml
2c2
< <!-- Start of auto generated data from MIB: VERITAS-COMMAND-CENTRAL-MIB -->
---
> <!-- Start of auto generated data from MIB: VERITAS-OPSCENTER-MIB -->
11c11
< <mevalue>0</mevalue>
---
> <mevalue>6</mevalue>
18,19c18,19
< <uei>uei.opennms.org/mib2opennms/ccCritical</uei>
< <event-label>VERITAS-COMMAND-CENTRAL-MIB defined trap event: ccCritical</event-label>
---
> <uei>uei.opennms.org/mib2opennms/opsCritical</uei>

37 OpenNMS
<events>
<!-- Start of auto generated data from MIB: VERITAS-OPSCENTER-MIB -->
<event>
  <mask>
    <maskelement>
      <mename>id</mename>
      <mevalue> </mevalue>
    </maskelement>
    <maskelement>
      <mename>generic</mename>
      <mevalue>6</mevalue>
    </maskelement>
    <maskelement>
      <mename>specific</mename>
      <mevalue>1</mevalue>
    </maskelement>
  </mask>
  <uei>uei.opennms.org/mib2opennms/opsCritical</uei>
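The mask in an event definition is what ties an incoming trap to a UEI: a trap matches when every mask element equals the corresponding trap field. The sketch below illustrates that idea on a cut-down events document. It is not OpenNMS code. The function names are invented, the enterprise `id` element is left out because its value is blank in the slide, and the second event (`opsWarning` with specific 3) is an assumption inferred from the Warning trap shown earlier, not copied from the generated file. Real event files may also declare an XML namespace, in which case the tag lookups would need it.

```python
import xml.etree.ElementTree as ET

# Cut-down sample. Only the opsCritical entry is taken from the slide;
# the opsWarning entry (specific 3) is an assumption for illustration.
EVENTS_XML = """
<events>
  <event>
    <mask>
      <maskelement><mename>generic</mename><mevalue>6</mevalue></maskelement>
      <maskelement><mename>specific</mename><mevalue>1</mevalue></maskelement>
    </mask>
    <uei>uei.opennms.org/mib2opennms/opsCritical</uei>
  </event>
  <event>
    <mask>
      <maskelement><mename>generic</mename><mevalue>6</mevalue></maskelement>
      <maskelement><mename>specific</mename><mevalue>3</mevalue></maskelement>
    </mask>
    <uei>uei.opennms.org/mib2opennms/opsWarning</uei>
  </event>
</events>
"""


def load_masks(xml_text):
    """Return a list of (mask dict, uei) pairs, one per <event>."""
    root = ET.fromstring(xml_text)
    table = []
    for event in root.iter("event"):
        mask = {}
        for element in event.iter("maskelement"):
            mask[element.findtext("mename")] = element.findtext("mevalue")
        table.append((mask, event.findtext("uei")))
    return table


def match_uei(table, trap_fields):
    """Return the UEI of the first event whose mask elements all match the trap."""
    for mask, uei in table:
        if all(trap_fields.get(name) == value for name, value in mask.items()):
            return uei
    return None


table = load_masks(EVENTS_XML)
print(match_uei(table, {"generic": "6", "specific": "3"}))
```

A trap with no matching mask returns `None` here, which corresponds to the "unformatted enterprise event" case seen before the MIB was converted.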

38 OpenNMS
opsWarning trap received
alertRecipients=vsf03p
alertSummary=2587 Active Job Completed with Exit Status 40
alertDescription=Alert Raised on: May 25, :13 AM .Job: Tree Type : Server .Tree Name : ALL MASTER SERVERS .Nodes : wayback .Job Policy: Windows_OS-File-Users-Deduped .Exit Status: 40 (network connection broken) .Client: mfp01 .New State: Done .Alert Policy: Job Failed .OpsCenter Server: vsf02 .
policyName=Job Failed
objectType=
collectorName=
opsHost=vsf02
sourceId=
opsObject=
sampleData=
opsAlertSeverity=Warning
opsAlertTime=Fri May 25 01:13:34 MST 2012
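The alertDescription varbind packs several "Key: Value" pairs into one string, separated by " .". If a notification script needs the client or exit status on its own, the string can be split roughly as follows. This is a sketch under the assumption that the separator and field names always look like the sample above; `parse_description` and `exit_status` are names invented for the example.

```python
import re

# alertDescription text as it appears in the trap above (including the
# date fragment, which is incomplete in the transcript).
ALERT_DESCRIPTION = (
    "Alert Raised on: May 25, :13 AM .Job: Tree Type : Server "
    ".Tree Name : ALL MASTER SERVERS .Nodes : wayback "
    ".Job Policy: Windows_OS-File-Users-Deduped "
    ".Exit Status: 40 (network connection broken) .Client: mfp01 "
    ".New State: Done .Alert Policy: Job Failed .OpsCenter Server: vsf02 ."
)


def parse_description(text):
    """Split on ' .' and then on the first ':' of each segment."""
    fields = {}
    for segment in re.split(r"\s\.", text):
        key, sep, value = segment.partition(":")
        if sep:  # skip the empty segment left by the trailing ' .'
            fields[key.strip()] = value.strip()
    return fields


def exit_status(fields):
    """Return (code, message) from the 'Exit Status' field, or None if absent."""
    match = re.match(r"(\d+)\s*\((.*)\)\s*$", fields.get("Exit Status", ""))
    if match is None:
        return None
    return int(match.group(1)), match.group(2).strip()


fields = parse_description(ALERT_DESCRIPTION)
print(fields["Client"], exit_status(fields))
```

As transcribed, the "Job" segment runs straight into "Tree Type", so the parser stores "Tree Type : Server" under the key "Job"; a production script would want to handle that case explicitly.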

39 Description
A Warning alert trap from Symantec OpsCenter.
alertRecipients hsphxvsf03p;
alertSummary 2587 Active Job Completed with Exit Status 40;
alertDescription Alert Raised on: May 25, :13 AM .Job: Tree Type : Server .Tree Name : ALL MASTER SERVERS .Nodes : wayback .Job Policy: Windows_OS-File-Users-Deduped .Exit Status: 40 (network connection broken) .Client: mfp01 .New State: Done .Alert Policy: Job Failed .OpsCenter Server: vsf02 .;
policyName Job Failed;
objectType ;
collectorName ;
opsHost hsphxvsf02;
sourceId ;
opsObject ;
sampleData ;
opsAlertSeverity Warning;
opsAlertTime Fri May 25 01:13:34 MST 2012;

40 OpenNMS Set up Alerts

41 VOM
No need to add recipients
Can call custom scripts in response to event notifications
SNMP
Can call commands in response to events

42 VOM

43 VOM Lots of options

44 VOM There are easier ways, but we’ll look at this anyway

45 VOM

46 VOM

47 VOM A Little Easier (maybe too late?)

48 VOM MIB /opt/VRTSsfmcs/config/snmp/VRTSsfm.mib

49 VOM
sfmAlertVarGrp OBJECT-GROUP
    OBJECTS { alertTime, ruleName, alertTopic, alertSeverity, alertSource,
              alertMessage, alertDescription, recommendedAction,
              classificationName, alertUserDefinedData }
    STATUS current
    DESCRIPTION "Trap Variable group"
    ::= { viptraps 3 }

50 VOM
Major
6/13/12 16:00:46
159.36.2.202
uei.opennms.org/vendor/symantec/traps/sfmAlertFS
Edit notifications for event
sfmAlertFS trap received
alertTime=
ruleName=High Usage on Filesystem
alertTopic=event.alert.vom.vm.fs.highusage.warn
alertSeverity=3
alertSource=hsdev11
alertMessage=File system /CIST/u02 violated high usage warn threshold
alertDescription=File system /CIST/u02 violated high usage warn threshold
recommendedAction=none
classificationName=fs
alertUserDefinedData={"fssize": ,"volsize": ,"mountpoint":"/CIST/u02","dgname":"CIST_dg","volname":"CIST_u02","fsid":"{6b356ea8-aa1c-11df-bc ee42}","fstype":"vxfs"}

51 VOM
sfmAlertFS trap received
alertTime=
ruleName=High Usage on Filesystem
alertTopic=event.alert.vom.vm.fs.highusage.warn
alertSeverity=3
alertSource=dev11
alertMessage=File system /CIST/u02 violated high usage warn threshold
alertDescription=File system /CIST/u02 violated high usage warn threshold
recommendedAction=none
classificationName=fs
alertUserDefinedData={"fssize": ,"volsize": ,"mountpoint":"/CIST/u02","dgname":"CIST_dg","volname":"CIST_u02","fsid":"{6b356ea8-aa1c-11df-bc ee42}","fstype":"vxfs"}
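When the log message arrives as one run of name=value pairs, the values contain spaces, so splitting on whitespace does not work. One workable approach, sketched here as an illustration, is to split on the varbind names already known from sfmAlertVarGrp. The function name `split_varbinds` is made up for the example, and the sample message leaves out alertUserDefinedData because its JSON is incomplete in the transcript.

```python
import re

# Varbind names from the sfmAlertVarGrp OBJECT-GROUP in VRTSsfm.mib.
VARBIND_NAMES = [
    "alertTime", "ruleName", "alertTopic", "alertSeverity", "alertSource",
    "alertMessage", "alertDescription", "recommendedAction",
    "classificationName", "alertUserDefinedData",
]

# Log message from the trap above, without the alertUserDefinedData field.
LOG_MESSAGE = (
    "alertTime= ruleName=High Usage on Filesystem "
    "alertTopic=event.alert.vom.vm.fs.highusage.warn alertSeverity=3 "
    "alertSource=dev11 "
    "alertMessage=File system /CIST/u02 violated high usage warn threshold "
    "alertDescription=File system /CIST/u02 violated high usage warn threshold "
    "recommendedAction=none classificationName=fs"
)


def split_varbinds(message, names):
    """Cut the message at each known 'name=' marker and return a dict of values."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, names)) + r")=")
    matches = list(pattern.finditer(message))
    fields = {}
    for index, match in enumerate(matches):
        # A value runs until the next known name, or to the end of the message.
        end = matches[index + 1].start() if index + 1 < len(matches) else len(message)
        fields[match.group(1)] = message[match.end():end].strip()
    return fields


fields = split_varbinds(LOG_MESSAGE, VARBIND_NAMES)
print(fields["ruleName"], "|", fields["alertSource"])
```

This breaks if a value itself contains one of the names followed by "=", so it suits quick triage scripts better than anything that must be exact.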

52 VOM
Description
SFM alert FS
alertTime 1339628438;
ruleName High Usage on Filesystem;
alertTopic event.alert.vom.vm.fs.highusage.warn;
alertSeverity 3; critical(1) error(2) warning(3) info(4)
alertSource dev11;
alertMessage File system /CIST/u02 violated high usage warn threshold;
alertDescription
recommendedAction none;
classificationName fs;
alertUserDefinedData {"fssize": ,"volsize": ,"mountpoint":"/CIST/u02","dgname":"CIST_dg","volname":"CIST_u02","fsid":"{6b356ea8-aa1c-11df-bc ee42}","fstype":"vxfs"};
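The numeric fields in this description are easier to read once decoded. The sketch below maps alertSeverity through the enumeration shown above and converts alertTime to a timestamp. It assumes alertTime is a Unix epoch value in seconds, which the slide does not state; `describe_alert` is a name invented for the example.

```python
from datetime import datetime, timezone

# Enumeration as printed in the description: critical(1) error(2) warning(3) info(4).
SEVERITY_NAMES = {1: "critical", 2: "error", 3: "warning", 4: "info"}


def describe_alert(alert_time, alert_severity):
    """Return (ISO 8601 UTC time, severity name) for the raw varbind strings."""
    # Assumption: alertTime counts seconds since 1970-01-01 UTC.
    when = datetime.fromtimestamp(int(alert_time), tz=timezone.utc)
    name = SEVERITY_NAMES.get(int(alert_severity), "unknown")
    return when.isoformat(), name


print(describe_alert("1339628438", "3"))
# ('2012-06-13T23:00:38+00:00', 'warning')
```

23:00:38 UTC is 16:00:38 in MST (UTC-7), a few seconds before the 16:00:46 time on the OpenNMS event, which supports the epoch-seconds reading.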

53 Others Command Central

54 Others
Command Central
/opt/VRTSccs/VRTSamccs/SNMP/CC/VERITAS-REG.mib
/opt/VRTSccs/VRTSamccs/SNMP/CC/VERITAS-TC.mib
/opt/VRTSccs/VRTSamccs/SNMP/CC/VRTS-cc.mib

55 Others
Learn your alert reporting tools
Learn your SNMP software, even if you’re not the administrator of it.

56 Questions

