Presentation on theme: "Change Management Practitioners Forum Thursday Nov 29, 2012 Jon Dowell Shawn McKenzie."— Presentation transcript:
Change Management Practitioners Forum Thursday Nov 29, 2012 Jon Dowell Shawn McKenzie
Agenda: Housekeeping; Introductions; Assessing change risk: “Is that really the worst that can happen?”; Responsiveness versus control: “How quick is too quick?”; Mitigating change risk: “Just say NO?”; Next Steps
Housekeeping Welcome to Gibsons Energy Fire Alarms & Washrooms Katharina Stephens
Introductions Name, Company, & Experience Jon Dowell
Facilitators Jon Dowell Senior Consultant with KSLD Consulting. 15 years of experience solving I.T. mysteries. Facilitation and critical thinking during: o Major Incidents o Problem investigations o Project quality assessments prior to go-live o Project warranty periods Training and mentoring o Critical thinking o Root cause analysis o Impact assessments o Potential risks associated with requests for change KSLD Consulting specializes in I.T. Problem Management and problem solving for today’s busy world.
Facilitators Shawn McKenzie Founding partner of SignalFire Inc, an independent ITSM and Organizational Change Management consulting practice Over 12 years' experience implementing IT, OCM and Process Optimization solutions for clients in: o Oil & Gas o Telecommunications o Government o Mining o Utilities Certified ITIL ver.2 Master and ver.3 Expert Prosci Change Management certified - Sorrento, Italy COBIT Practitioner certified Focus on process before tools and technology ITIL Foundations instructor for over 250 students with 99% pass rate itSMF board member for the last 5 years
Objectives of the Practitioner Forum To facilitate sharing of information and experiences between like-minded practitioners To provide an opportunity for networking To grow the level of knowledge of participants
How we operate We try to meet quarterly as a group The group drives future agenda based on interest levels We respect the difference in our level of experiences We participate and share freely.
Assessing change risk “Is that really the worst that can happen?” Jon Dowell
Typical Answers… “Is that really the worst that can happen?” Server may not reboot clean: require hard boot. Software installation may not work on first try: reboot & retry reinstall. Users may not log off at start of change window: force disconnect users.
Typical Answers… “Is that really the worst that can happen?” Entire network could go down in a cascade failure: Millions lost!!! Server suffers a major hardware failure during reboot and never restarts: Millions lost!!! Ice storm takes out all our power & data centres for months: Millions lost!!!
Real Answers… “Is that really the worst that can happen?” Human error (fat-fingered change); Second set of eyes (buddy system); Pre-build script in copy/paste; All related assets not identified; Minimal number of services impacted; Have teams on stand-by; Rollback required (restore to pre-change stage); Time to restore
True Risk Understanding… “Is that really the worst that can happen?” “What could happen?” Key Steps: 1. Identify Potential Problems 2. Identify Likely Causes 3. Take Preventative Action 4. Plan Contingent Actions (Kepner Tregoe – Potential Problem Analysis)
What could happen… (Kepner Tregoe – Potential Problem Analysis)
State Action… (Change Short Description)
Columns: Potential Problems; Assess Threat (Probability / Severity); Likely Causes; Preventative Action; Contingent Action
What could happen… (Kepner Tregoe – Potential Problem Analysis)
State Action… Network Core Switch Replacement
Potential Problems, with Assess Threat (Probability / Severity):
Users may remain logged in during replacement: HIGH / LOW
Entire network cascade failure: V. LOW / HIGH
New switch does not work: MED / HIGH
Applications do not work properly after change: MED / MED
What could happen… (Kepner Tregoe – Potential Problem Analysis)
State Action… Network Core Switch Replacement
Potential Problem: New switch does not work. Threat (Probability / Severity): MED / HIGH
Likely Causes: No power bar outlets preventing start up; Misconfigured OS causing switch to not communicate; Network loop causing infinite loop / broadcast storm
Preventative Action: Check rack prior to start of change; Test in lab. Copy config to text file and paste into production; Second set of eyes during cabling
Contingent Action: Roll back to old switch by re-installing into rack, re-cabling, & turning on. (1 hr)
Potential Problem: Applications do not work properly after change. Threat (Probability / Severity): MED / MED
Likely Causes: Port mismatch; Routing incorrect; Response too quick for application to process
Preventative Action: Second set of eyes. Test in lab. Copy/paste config; Test in lab; Second set of eyes. Test in lab. Copy/paste config
Contingent Action: Use DRP application service to complete critical work (8 hrs)
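The worksheet above can also be kept as structured data, which makes it easy to sort potential problems by threat before deciding where to spend preventative effort. The following is a minimal sketch only: the class name PotentialProblem, the function rank_by_threat, and the numeric values given to V. LOW / LOW / MED / HIGH are assumptions made for illustration and are not part of the Kepner Tregoe material on the slides. Multiplying probability by severity is one common convention, not the method the presenters state.

```python
from dataclasses import dataclass, field
from typing import List

# Assumed ordinal scale for the slide's ratings (hypothetical numeric values).
LEVELS = {"V. LOW": 1, "LOW": 2, "MED": 3, "HIGH": 4}


@dataclass
class PotentialProblem:
    """One row of a Potential Problem Analysis worksheet (hypothetical shape)."""
    description: str
    probability: str  # key into LEVELS
    severity: str     # key into LEVELS
    likely_causes: List[str] = field(default_factory=list)
    preventative_actions: List[str] = field(default_factory=list)
    contingent_action: str = ""

    def threat_score(self) -> int:
        # Assumption: threat = probability x severity on the ordinal scale.
        return LEVELS[self.probability] * LEVELS[self.severity]


def rank_by_threat(problems: List[PotentialProblem]) -> List[PotentialProblem]:
    """Return the problems ordered from highest to lowest threat score."""
    return sorted(problems, key=lambda p: p.threat_score(), reverse=True)


# The two problems analysed in full on the slide, with their stated ratings.
problems = [
    PotentialProblem(
        description="Applications do not work properly after change",
        probability="MED",
        severity="MED",
        likely_causes=["Port mismatch", "Routing incorrect"],
        contingent_action="Use DRP application service (8 hrs)",
    ),
    PotentialProblem(
        description="New switch does not work",
        probability="MED",
        severity="HIGH",
        likely_causes=["No power bar outlets", "Misconfigured OS", "Network loop"],
        contingent_action="Roll back to old switch (1 hr)",
    ),
]

for problem in rank_by_threat(problems):
    print(problem.threat_score(), problem.description)
```

With this assumed scale, "New switch does not work" ranks ahead of "Applications do not work properly after change", which matches the MED / HIGH versus MED / MED ratings on the slide. A different scale or weighting would be equally defensible; the point is only that the ratings are recorded per problem and compared consistently.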
Responsiveness versus control “How quick is too quick?” Shawn McKenzie
Change Management is Schizophrenic * Balance of control and responsiveness. Control: to manage and control changes to the live environment using standard methods and process... while minimizing the risk and impact of change-related issues. Responsiveness: to efficiently and promptly handle changes...
Building consensus on The Balance How can we communicate the importance of risk control without creating a fear of approval "Red Tape"? Responsiveness to Business need is important, but how would a business-hours Change-related outage affect Operations? Raising the Spectre of a Black Monday
Building consensus on The Balance Would having Change Implementations 6 days a week during business hours create too much uncertainty? How would going to planned weekly deployment windows change the culture of IT Operations? How about every other week? Do Change Risk assessments imply Tiers of Risk/Reward?
Next Steps Future Sessions - Change: January 2013, April 2013, Oct 2013 (Align with conference?) Future Sessions - Problem: February 2013, May 2013, Sept 2013, November 2013 (Align with conference?) Future Sessions - ???: March 2013 Future Change Practitioner Topics? Process Review and challenges along the way… Profile of a Change Manager – what makes a good change manager Supporting Tools? (ITSM Suite, CMDB) Other???
itSMF Upcoming Events! Problem Management Practitioner Event Jorge Wong & Harry Contos Thursday, Dec 6 Gibsons Energy
Real Problems: Lack of co-ordination between Change and Configuration management; Inadequate risk and impact assessment undertaken; Inaccurate / missing configuration information in CMDB, creating a risk of wrong decisions; Incomplete scope of assessment (security, business impact, availability, capacity, continuity); Staff complacency – manipulate the system; Urgent changes not being appropriately tested; Process seen as bureaucratic and burdensome; Instituting process control over contractor support personnel or specific segments / applications; Lack of tools to efficiently track changes; Scope too wide for resources available to handle