Mark Jones, Senior Product Manager. How Automation Can Help You: Use Cases for NetIQ Aegis™


2 Mark Jones, Senior Product Manager. How Automation Can Help You: Use Cases for NetIQ Aegis™

3 Our Vision For IT Process Automation: 3 Years In The Making
3 years ago NetIQ had a vision for converging our systems & security management products to support consolidated incident & event handling.
But customers said, help us connect to our other tools as well.
"We’re the Noah’s Ark of tools – we have two of everything." (VP of Operations at a major Financial Institution)
So we altered our plan to give customers greater control of the tools they’ve already invested in by creating a strategy for heterogeneous IT Process Automation (ITPA).

4 Introducing NetIQ® Aegis™: The Control & Automation Platform for IT Processes
NetIQ Aegis is a software platform that models, automates, measures and improves run books and ITIL-based processes, bringing control and automation to IT Operations.
Diagram: a cycle of Model, Automate, Measure, Improve, applied to ITIL processes (macro) and run books (micro).

5 Use Case #1: Sympathetic Event Correlation
1. SQL Server down event
2. AppManager receives event
   From the agent on the server
3. AppManager event triggers an Aegis workflow
   Correlation engine begins listening for sympathetic events that match rules
4. AppManager receives sympathetic access failure events
   From application and web servers
5. Aegis’ correlation engine sees the sympathetic events
   And matches them to pre-defined rules
6. Aegis closes the sympathetic events
   Reducing the volume of AppManager events to be dealt with
   Update comments in the original event accordingly
Additional correlation examples:
   Suppress machine down events from hosts on attached subnets when a router fails
   Identify root cause from multiple events, e.g. a congested network segment identified by a combination of Network ResponseTime events and high queue lengths on some Exchange servers
Diagram components: NetIQ Aegis, NetIQ AppManager, Database Server, Web Server, Application Server
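The correlation steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not Aegis or AppManager code: the `Event` class, the `SYMPATHETIC_RULES` table and the event kind names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    host: str
    kind: str
    open: bool = True
    comments: list = field(default_factory=list)


# Hypothetical rule table: root-cause event kind -> kinds treated as sympathetic.
SYMPATHETIC_RULES = {"sql_server_down": {"db_access_failure"}}


def correlate(root, later_events, rules=SYMPATHETIC_RULES):
    """Close events that match the rules for `root` and note them on the root event."""
    matching_kinds = rules.get(root.kind, set())
    closed = []
    for ev in later_events:
        if ev.open and ev.kind in matching_kinds:
            ev.open = False
            ev.comments.append(f"Closed as sympathetic to {root.kind} on {root.host}")
            closed.append(ev)
    if closed:
        hosts = ", ".join(e.host for e in closed)
        root.comments.append(f"{len(closed)} sympathetic event(s) closed: {hosts}")
    return closed


root = Event("db01", "sql_server_down")
events = [
    Event("web01", "db_access_failure"),
    Event("app01", "db_access_failure"),
    Event("web01", "high_cpu"),  # unrelated, stays open
]
closed = correlate(root, events)
```

Only the two access-failure events are closed; the unrelated event stays open for normal handling, and the root event carries a comment recording what was suppressed.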

6 Use Case #2: Managing Maintenance Modes
1. Application owner sends an email request to set maintenance mode
   Using an Outlook form
2. Aegis receives the email and parses it
   Identifies the resource to set maintenance mode on and the time window
3. Aegis sends a reminder email before the start of maintenance
   With an opportunity to cancel via email
4. Aegis sets the maintenance mode in AppManager
   On the right machine at the right time
5. Administrator performs maintenance
6. Aegis sends a reminder email before the expiration of maintenance
   With an opportunity to “snooze” or extend via email
7. Aegis stops maintenance mode
   On time with no further approval
8. Aegis sends email confirming maintenance stoppage
Diagram components: NetIQ Aegis, NetIQ AppManager, Application Owner, Outlook Form
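As a rough sketch of steps 2 through 8, the request can be parsed into a resource and a time window and then turned into a timed action plan. The email layout (`Resource:`, `Start:`, `End:` lines), the 15-minute reminder lead and the function names are assumptions made for illustration; the slide does not specify the Outlook form's fields.

```python
from datetime import datetime, timedelta

REMINDER_LEAD = timedelta(minutes=15)  # assumed lead time, not from the slide


def parse_request(body):
    """Extract the resource and time window from a hypothetical 'Key: value' email body."""
    fields = {}
    for line in body.splitlines():
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep:
            fields[key.strip().lower()] = value.strip()
    return {
        "resource": fields["resource"],
        "start": datetime.strptime(fields["start"], "%Y-%m-%d %H:%M"),
        "end": datetime.strptime(fields["end"], "%Y-%m-%d %H:%M"),
    }


def schedule(request, lead=REMINDER_LEAD):
    """Return (time, action) pairs in the order the workflow would run them."""
    start, end = request["start"], request["end"]
    return [
        (start - lead, "email reminder with option to cancel"),
        (start, "set maintenance mode"),
        (end - lead, "email reminder with option to snooze or extend"),
        (end, "stop maintenance mode"),
        (end, "email confirmation of stoppage"),
    ]


body = """Resource: SQLPROD01
Start: 2024-01-06 22:00
End: 2024-01-06 23:30"""
plan = schedule(parse_request(body))
```

A real workflow would also have to handle cancel and extend replies, which would shift or drop the later entries in the plan.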

7 Use Case #3: Low Disk Space Response
1. Available disk space falls below threshold
   Likely caused by temp file growth
2. AppManager detects condition
   AppManager Knowledge Script generates event
3. Aegis requests disk usage analysis from AppManager
   Identify top N culprits by folder, file type, age
   Extra attention on known temp file storage areas
4. Aegis sends email to admin requesting approval to clean up
   Embed results of disk usage analysis & link to Aegis web site
5. Administrator approves partial cleanup through Aegis (or by replying to email)
   Admin can select individual folders or file types for deletion, archiving or user attention
6. Aegis commands AppManager to perform cleanup
   Delete approved files and analyze new disk space status
7. Aegis sends confirmation email to admin
   Identify files deleted and new disk space status
Diagram components: NetIQ AppManager, NetIQ Aegis, Admin, AppManager Agent, Archive, Trash
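The "top N culprits" analysis in step 3 might look something like the following, using only the Python standard library. It is a generic sketch rather than what an AppManager Knowledge Script does; `disk_usage_report` is a made-up name, and it ranks by folder and file type only (ranking by age is left out).

```python
import os
import tempfile
from collections import Counter


def disk_usage_report(root, top_n=3):
    """Return the top_n folders and file extensions under `root`, ranked by bytes used."""
    by_folder = Counter()
    by_type = Counter()
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            by_folder[folder] += size
            ext = os.path.splitext(name)[1].lower() or "(none)"
            by_type[ext] += size
    return by_folder.most_common(top_n), by_type.most_common(top_n)


# Small demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "temp"))
    with open(os.path.join(root, "temp", "a.tmp"), "wb") as f:
        f.write(b"x" * 5000)
    with open(os.path.join(root, "notes.txt"), "wb") as f:
        f.write(b"x" * 100)
    top_folders, top_types = disk_usage_report(root)
```

The two ranked lists are the kind of result that could be embedded in the approval email in step 4, so the admin can pick individual folders or file types.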

8 Use Case #4: VM Dynamic Performance Management
1. Detect poor performance on VM-hosted service
   Performance problem detected by AppManager ResponseTime
2. Identify VM host with spare capacity
3. Gain approval to provision new VMs
   Send email to admin with proposed changes, requesting approval to automatically respond
4. Provision new VM guest
   Clone VM, configure LAN settings, etc. & boot
5. Apply post-image updates per corp standard
   Patches, configuration updates since VM image was created
6. Configure applications
   Machine-specific settings required on guest and other machines in business service
7. Validate application function
   Verify proper application function before bringing into production
8. Bring new guest into production rotation
   Configure load balancer, application controller or similar
9. Verify improved service performance
   Repeat as necessary for up to 3 new guests total
Diagram components: NetIQ Aegis, NetIQ AppManager, VMware Virtual Center, VMware ESX Hosts, Attachmate WinInstall, Load Balancer or Controller, Admin, Critical Business Service
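The control loop implied by steps 1, 3 and 9 (detect, get approval, provision, re-check, stop after 3 new guests) can be sketched as below. The callables `is_healthy`, `approve` and `provision_guest` are placeholders for whatever the monitoring, approval and provisioning tools actually expose, and the response-time numbers in the simulation are invented.

```python
MAX_NEW_GUESTS = 3  # "up to 3 new guests total", per step 9


def scale_out(is_healthy, approve, provision_guest, max_guests=MAX_NEW_GUESTS):
    """Add guests one at a time until the service is healthy, approval is denied, or the cap is hit."""
    added = []
    while not is_healthy() and len(added) < max_guests:
        if not approve(len(added) + 1):
            break  # admin declined the proposed change
        added.append(provision_guest())  # stands in for steps 4 through 8
    return added


# Simulated service: response time falls as guests are added (made-up numbers).
state = {"guests": 1}


def is_healthy():
    return 900 / state["guests"] <= 400  # milliseconds


def provision():
    state["guests"] += 1
    return f"guest-{state['guests']}"


added = scale_out(is_healthy, lambda n: True, provision)
```

In this simulation two guests are enough to bring the service back under the threshold, so the loop stops before reaching the cap.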

9 Use Case #5: Web Server Sequential Restart
1. Admin applies a patch to all web servers
   Reboot needed to finalize
2. Admin initiates “Restart Web Farm” Runbook
   Customized runbook automated by Aegis
3. Aegis blocks new sessions to first server
   Uses NetIQ AppManager to configure load balancer
4. Aegis commands AppManager to monitor for server to reach zero active sessions
   Users “bleed” off as they end their sessions on their own; AppManager sends event when zero sessions remain
5. Aegis commands AppManager to restart the web server
   Aegis waits for notification that reboot is complete
6. Aegis commands AppManager to test basic functionality
   Verify that web server properly performs expected duties
7. Aegis enables new sessions to the server
   Uses NetIQ AppManager to configure load balancer
8. Aegis verifies web site health
   Users are accessing the rebooted server successfully and no Response Time or other errors reported on the web farm
9. Send progress notification to Admin
   Include % remaining & ETA for completion
10. Go to Step 3 for next server
   Iterate until all servers completed
Diagram components: NetIQ AppManager, NetIQ Aegis, Admin, Attachmate WinINSTALL, Active Sessions, Web Servers, Load Balancer
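Steps 3 through 10 amount to a rolling restart: drain one server, restart and test it, put it back, report progress, move on. A minimal sketch follows, assuming a load balancer object with the hypothetical methods shown; `FakeLoadBalancer` merely simulates users ending their sessions, and the ETA from step 9 is omitted.

```python
class FakeLoadBalancer:
    """Stand-in for a real load balancer; sessions on blocked servers bleed off over time."""

    def __init__(self, sessions):
        self.sessions = dict(sessions)
        self.blocked = set()

    def block_new_sessions(self, server):
        self.blocked.add(server)

    def allow_new_sessions(self, server):
        self.blocked.discard(server)

    def active_sessions(self, server):
        return self.sessions[server]

    def wait(self):
        # Simulate one polling interval: one session ends on each blocked server.
        for server in self.blocked:
            self.sessions[server] = max(0, self.sessions[server] - 1)


def rolling_restart(servers, lb, restart, smoke_test, notify):
    """Restart servers one at a time, never taking more than one out of rotation."""
    done = []
    for i, server in enumerate(servers, start=1):
        lb.block_new_sessions(server)            # step 3
        while lb.active_sessions(server) > 0:    # step 4
            lb.wait()
        restart(server)                          # step 5
        if not smoke_test(server):               # step 6
            notify(f"{server} failed its functional test; stopping")
            break
        lb.allow_new_sessions(server)            # step 7
        done.append(server)
        remaining = 100 * (len(servers) - i) // len(servers)
        notify(f"{server} restarted; {remaining}% remaining")  # step 9
    return done


lb = FakeLoadBalancer({"web1": 2, "web2": 0, "web3": 1})
messages = []
done = rolling_restart(
    ["web1", "web2", "web3"],
    lb,
    restart=lambda s: None,
    smoke_test=lambda s: True,
    notify=messages.append,
)
```

Stopping on a failed test leaves the failed server out of rotation and the remaining servers untouched, which is one reasonable design choice; the slide itself does not say how failures are handled.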

10 Use Case #6: Incident Management
1. Incident occurs
   Performance problem detected by AppManager ResponseTime
2. Collect related events from other data sources
   Changes, tickets, intrusions, etc. during same time period
   Broaden scope to other machines in business service and correlate
3. Create helpdesk ticket
   Apply proper classifications
   Embed link to web page with related incidents
4. Helpdesk staff works ticket
   Relevant information already collected & presented with ticket
5. Monitor existing incident management workflow
   Support ticketing workflow with Aegis Investigation Assistance
   Wait for ticket to be resolved (not closed)
6. Initiate Incident Probation Period
   Verify proper service restoration, record in ticket
   Search all tools for unanticipated downstream impacts, reopen ticket if found
7. Coordinate post-incident review for Problem Management
   Request explanatory info from stakeholders, e.g. how well was incident handled, how to prevent recurrence
   Produce unified report for management
Diagram components: NetIQ AppManager, NetIQ Aegis, Helpdesk, Ticketing, Incident Stakeholders, Management, Other Sources (RFCs, CMDB, NetIQ Change Guardian, etc.)
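Step 2, collecting events from other sources in the same time period, is essentially a time-window filter across several feeds. The sketch below assumes each source can be read as a list of `(timestamp, summary)` pairs and uses an arbitrary 30-minute window; both are illustrative assumptions, not details from the slide.

```python
from datetime import datetime, timedelta


def related_events(incident_time, sources, window=timedelta(minutes=30)):
    """Return (time, source, summary) tuples that fall within `window` of the incident, oldest first."""
    lo, hi = incident_time - window, incident_time + window
    hits = []
    for source_name, events in sources.items():
        for when, summary in events:
            if lo <= when <= hi:
                hits.append((when, source_name, summary))
    return sorted(hits)


incident_time = datetime(2024, 1, 6, 10, 0)
sources = {
    "change audit": [
        (datetime(2024, 1, 6, 9, 45), "config file modified"),
        (datetime(2024, 1, 5, 9, 0), "patch applied"),  # a day earlier, outside the window
    ],
    "helpdesk": [
        (datetime(2024, 1, 6, 10, 10), "users report slow logins"),
    ],
}
hits = related_events(incident_time, sources)
```

The resulting list is what would be attached to the helpdesk ticket in step 3, so staff start with the related changes and reports already in hand.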

11 Use Case #7: Change Management
1. Change is requested & approved via existing “Request for Change” process
2. Detect approved change request
   Monitor Remedy or other change management system
3. Provision access in change control tool
   Managed by NetIQ Change Administrator
4. Change Requester executes change per approved ticket
   Actions bounded by change control tool
5. Change audit tool detects actual config changes
   Tripwire or NetIQ Change Guardian
6. Reconcile audited changes to the approved RFC
   Group audited changes by time, machine, individual
   Request review of changes: auth or unauth, relevant ticket ID, etc.
   Update ticket and CMDB with related changes
7. Perform system health check
   After change, verify proper service levels
8. Correlate changes to impacts
   Search other tools for downstream impacts from change such as performance problems, new vulnerabilities, etc.
9. Coordinate Post-Change Review
   Change is “completed” but not “closed” until the CAB has completed review
Diagram components: NetIQ Aegis, AppManager, “Request for Change” Process, Change Requester, Management, Administrator, NetIQ Change Administrator, CMDB, Incident Stakeholders, Tripwire, NetIQ Change Guardian, etc., All Data Sources (Net. Mgmt, etc.)
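The reconciliation in step 6 can be pictured as matching each audited change against the approved RFC's machines, requester and time window, and flagging everything else for review. The dictionary fields and the matching criteria below are assumptions chosen for illustration; the slide leaves the final authorized-or-unauthorized decision to a human reviewer.

```python
from datetime import datetime


def reconcile(rfc, audited_changes):
    """Split audited changes into those matching the RFC and those needing review."""
    matches, needs_review = [], []
    for change in audited_changes:
        in_scope = change["machine"] in rfc["machines"]
        in_window = rfc["start"] <= change["time"] <= rfc["end"]
        by_requester = change["user"] == rfc["requester"]
        if in_scope and in_window and by_requester:
            matches.append(change)
        else:
            needs_review.append(change)
    return matches, needs_review


rfc = {
    "id": "RFC-1",  # hypothetical ticket ID
    "machines": {"web1"},
    "requester": "alice",
    "start": datetime(2024, 1, 6, 22, 0),
    "end": datetime(2024, 1, 6, 23, 0),
}
audited = [
    {"machine": "web1", "user": "alice", "time": datetime(2024, 1, 6, 22, 10), "what": "httpd.conf edited"},
    {"machine": "db1", "user": "alice", "time": datetime(2024, 1, 6, 22, 20), "what": "service restarted"},
    {"machine": "web1", "user": "bob", "time": datetime(2024, 1, 7, 1, 0), "what": "registry key changed"},
]
matches, needs_review = reconcile(rfc, audited)
```

Changes in `matches` could be written back to the ticket and CMDB, while those in `needs_review` would go to a reviewer to be marked authorized or unauthorized.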

12 Use Case #8: Vulnerability Management
1. Initiate vulnerability & policy violation scan
   Or scan on an existing schedule
2. Identify resulting vulnerabilities
3. Request permission to remediate via existing Change Management process (RFC)
   Group by machine, service, vulnerability class, etc.
4. Monitor for approved RFC
5. Initiate remediation
   Using provisioning tools such as WinINSTALL, SMS, etc. or by assigned administrator
6. Initiate vulnerability scan to verify remediation
   Verify that vulnerability was indeed remediated
7. Perform system health check
   After change, verify that remediation did not impact service levels
8. Relate changes to impacts
   Search other tools for downstream impacts from change such as performance problems, new vulnerabilities, etc.
9. Close change request
   Or escalate if impacts are found
Diagram components: NetIQ Aegis, AppManager, Remedy, Secure Configuration Manager, Administrator, Patch Manager, WinINSTALL, SMS, etc., All Data Sources (VM, SM, etc.)
