What’s the problem with Problem Management?


1 What’s the problem with Problem Management?
Karen Thorpe, Business Connexion. Why do we need effective governance in IT and in our businesses? We all know that in IT, SHIFT happens, at least twice on your keyboard! Ask the audience: what was the traffic like on the way in to the conference this morning? Any problems on the roads? Were those problems or incidents?

2 Reactive Problem Management
Difference between an Incident and a Problem. How many Incidents do you have? How many Problems?

3 What’s the problem with Problem Management?
Incident / Problem. Problem Management provides the solution or workaround to assist in the restoration of Services, and the Root Cause Analysis of Problems, together with recommendations to fix. Difference between an Incident and a Problem. How many Incidents do you have? How many Problems?

4 Toolkit
Toolkit: Service Desk Tool; Procedures/Work instructions; KEDB (Known Errors, Workarounds). Service Desk tool: whether you go for a top-of-the-range option such as ServiceNow or ITSM, or for a ‘free’ option such as Spiceworks, there is no need to manage the desk by means of an Excel spreadsheet or an Access database. How do you restore a service, or enable the user to work again? Do you have procedures and/or work instructions, provided by the SLM and Support teams? What about a KEDB? It is managed by PM, but the content is provided by the support teams. Do you have workarounds? Are your workarounds linked to problem records which are linked to incident records? There needs to be a strong link between the Incident and Problem Management processes. If not, the desk won’t be aware of workarounds and known errors in order to efficiently manage incidents, and

5 What’s the problem with Problem Management
Known Error Database. Do you have a KEDB? Is it contained within your SM Tool? Is usage audited? Is the effective usage of your KEDB an indication of the health of your environment?
The way information about the Incident is logged, together with historical information, is critical to the success of the Problem Management process. If common incidents are logged in many different ways, there will be little chance of identifying recurring incidents and proactively identifying problems. There needs to be consistency. How effective and efficient can the Service Desk and the Incident Management process be without Problem Management? Is there a KEDB maintained and supported by SMEs? Does the Service Desk make use of the KEDB? Is that KEDB of quality? Let me ask you, the audience, a question: do you have a KEDB that is contained within your SD SM toolset? Is it readily available to your SD and Incident Management? How well utilised is it? If we don’t have a well-managed KEDB supporting the desk, then we are less likely to be able to support the required levels of availability on a consistent basis. Our services then run the risk of appearing unreliable. What about the damage to our brand? Whether you run an insourced IT organisation or are an outsource partner, your brand is important. So let me ask you this question. The KEDB gives Incident Management and the desk the ability to resolve incidents efficiently, to keep the services running and to ensure that you achieve SLA. But if your KEDB is well used, is this an indication of a healthy infrastructure?
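To illustrate the point about consistent logging, here is a minimal sketch, not taken from the presentation, of how recurring incidents might be counted once free-text entries are normalised. The field names `service` and `summary`, the `threshold` value and the sample records are all assumptions made for illustration; a real Service Desk tool will have its own schema and categorisation.

```python
from collections import Counter


def normalise(text):
    """Lowercase and collapse whitespace so equivalent entries compare equal."""
    return " ".join(text.lower().split())


def recurring_incidents(incidents, threshold=3):
    """Return (service, summary) pairs logged at least `threshold` times.

    `incidents` is a list of dicts with hypothetical keys 'service' and
    'summary'; both are normalised before counting.
    """
    counts = Counter(
        (normalise(i["service"]), normalise(i["summary"])) for i in incidents
    )
    return {key: n for key, n in counts.items() if n >= threshold}


# Hypothetical sample data: the same fault logged three different ways.
incidents = [
    {"service": "Email", "summary": "Mailbox full"},
    {"service": "email ", "summary": "mailbox  FULL"},
    {"service": "EMAIL", "summary": "Mailbox Full"},
    {"service": "Payroll", "summary": "Report timeout"},
]

print(recurring_incidents(incidents))
```

Without the normalisation step the three email incidents would be counted as three separate one-off entries, and no candidate problem would surface. Normalising case and spacing only goes so far; agreed categories at logging time are what make the trend reliable.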

6 What’s the problem with Problem Management?
Root Cause Analysis. Facilitation of a Root Cause Analysis by Problem Management; documented Root Cause of recurring Incidents, Major Incidents & Problems; documented solutions / options; recommendations; acceptance by the customer.
Facilitated by PM: what level of skill is required? Who participates? The PM or PA requires a good technical understanding. SMEs take part in a room or in a virtual team, from both the ops and the EA/SA space, using methods such as Kepner and Fourie, Ishikawa, pain value analysis, brainstorming, etc., or whichever method your organisation has selected as the recommended method, supported by your Problem Management process and policies. The reason for selecting a method is to ensure that all investigations are performed in a consistent manner. Documented: that word is critical. How many of you have an SLA to produce a fully documented RCA? Is it reasonable to have an SLA of 5 days, for example, to produce a fully documented RCA? Realistically, is it possible? A fully documented RCA of quality takes time. Surely the SLA should be on restoring the service rather than on documenting an RCA within a short period of time? Are our RCAs being accepted by our customer, the business? Sign-off and acceptance by the customer is critical. How many of you don’t get sign-off? What we should be producing is a Major Incident report or Service Failure Report within an agreed target period. Then we should be confirming the root cause with our subject matter experts, both internal and external, with sign-off of the Root Cause. Then we should be identifying a solution in conjunction with the SMEs, the EAs and the SAs to ensure that our recommended solution ties back to the business and IT strategy. If not, we are not providing business-focused problem management but instead may be recommending a quick fix to get the customer off the Operations team’s back.

7 Reactive Problem Management Outputs
Reactive Problem Management Outputs: Known Errors; Workarounds; Root Cause Analysis; Requests for change. Ultimately Reactive Problem Management manages Problem Control and Error Control.
The PM Process Manager is accountable for the quality of the outputs and yet often he or she doesn’t have the technical skill to confirm that what has been documented is correct. Known errors and workarounds: are we able to support the required levels of availability on a consistent basis? Are they unreliable? What about the damage to our brand? Whether you run an insourced IT organisation or are an outsource partner, your brand is important; therefore the quality of the KEDB and its contents is important to protect the organisation as well as the brand. There needs to be a strong link between the Incident and Problem Management processes.

8 Measuring Reactive Problem Management
Measuring Reactive Problem Management. Is your reactive Problem Management process adding value? Is it business focused? Or has Reactive Problem Management become an extended Incident Management process?
How do we measure reactive problem management? Key metrics should show: a reduction in the number of incidents logged; a reduction in the number of recurring incidents; an improvement in availability of services, specifically those which are mission critical; and the number of RCAs accepted and approved for Change by the business*. That last metric would show alignment to business strategy and IT strategy, and integration with risk management. Does anyone here measure the number of RCAs accepted, and the changes implemented as a result of those RCAs? Think about why this would be a good metric for RPM. Why does business not accept the RCA recommendations? Is it because they don’t understand them? If so, whose responsibility is it to ensure that they do? Is it a financial consideration? Or is it because our RCA has not been considered in line with the business and IT strategy? Are we missing the plot somewhere? Are we providing ROI information on our recommendations? Is it business focused or only ops focused? If it’s the latter, has PM become an extension of the Incident Management process?

9 The other ‘Face’ of Problem Management
Proactive Problem Management How many of you are performing proactive problem management? How many of you have found it difficult to sell the concept of Proactive Problem Management within your organisation? Is this because the reactive process is not effective / in place? Proactive Problem Management – where does it belong? Is it an Ops process or does it belong in CSI? Park that thought.

10 Proactive Problem Management
Proactive Problem Management. Reliant on trending of Incident & Event Management data. Reliant on service owners and managers proactively identifying a Problem where the reporting data is inadequate or trending is not occurring. Reviewing of changes: attendance at the CAB? Who?
We underestimate the value of Problem Management. Proactive Problem Management is totally reliant on data analysis of Incident and Event Management data and on reactive problem management data. Trending is therefore heavily reliant on the right Incident and reactive Problem Management metrics being in place, as well as effective Event metrics. So if you’re not doing reactive properly, and not managing Incidents effectively, it is going to be challenging. Are we trending on the right things? From a trending perspective, have the monitoring and alerting parameters been confirmed by capacity management and provided to event management? Are those parameters reviewed, updated and communicated on a regular basis to the other process managers? There’s always a debate about who should be trending the data: Incident Management, Problem Management or the Service Owners/Managers. I don’t believe there is a definitive answer, but if no one is trending and providing the information to Problem Management, ultimately it is PM’s accountability and responsibility. So the PM should be engaging with the relevant process managers, service managers/owners and reporting tools to identify trends and what we should be looking for. There needs to be strong integration between Incident and Problem Management.
A word of caution though: it is much easier to trend on technology components and end users than on services, so make sure that whoever is analysing the data has a good understanding of the services critical to the business, the components which make up those services and the users who make use of those services, so that you can effectively trend across the appropriate criteria. Where there is limited or no data, we rely on the service managers or owners identifying recurring incidents and formally logging proactive problems. Often a change is logged without a formal Problem Investigation, and we therefore miss the opportunity to ensure that the change which has been recommended meets the strategic and tactical requirements of the organisation. How do we address this? By attending the CAB to review changes and asking for an RCA when we believe that it is required. Often, too, we will meet with an attitude of ‘if it’s not broken, don’t fix it’. So how do we persuade the business to change its attitude? What about ensuring that we know what the organisation’s governance obligations are, and what the risks of not implementing our recommendation would be?

11 What’s the problem with Problem Management?
Proactive Problem Management. Resources require the following skills and competencies. They should be familiar with, or have access to: the strategic and tactical plans of the organisation; Enterprise and Solution Architecture; governance requirements / obligations. They should keep up to date with new technology and innovation, must be able to relate the ROI of Proactive Problem Management, and must be strategic thinkers.
Proactive PM skill requirements are a different skill set from reactive or operational PM skills. Knowledge of: the strategic and tactical plans of the org; EA & SA (technical literacy); governance requirements; ROI, and the ability to relate recommendations back to the business by means of ROI. Often these skills are not available in the ops space or, if they are, with an ineffective PM process in place there is so much firefighting that no one thinks long term because of the pressure to restore service and document an RCA within a few days. So your proactive PM may be a senior business-focused manager rather than a technical manager, perhaps someone in CSI. If you have multiple customers, as the big outsource partners do, you may have a PM per customer who will need the same skills for that customer or customer base, or you may have someone in a CSI role who can fulfil this as part of that role.

12 Proactive Problem Management – Where does it belong?
Proactive Problem Management: where does it belong? Service Operations, or Continual Service Improvement?
There is great debate about where in the service management lifecycle Proactive Problem Management belongs. Because after all, isn’t CSI focussing on service improvements? Isn’t Problem Management key to that? Have we underestimated the value of the Problem Management process? If you look at COBIT 5 and its goal for problem management, that goal is all about continual improvement. And given the earlier discussion we’ve had about reactive problem management and the fact that it may be perceived as an extended incident management process, maybe proactive PM should be a CSI responsibility? Business today wants to see return on investment. But in many IT organisations the people at the coal face often haven’t had sight of the IT strategy, so they make recommendations based on the here and now, not on where we want to be. CSI is all about where we want to be in relation to where we are now, and can effectively join up the dots between ops and the strategic and tactical requirements of the business. In large organisations, reactive problem management teams are often so busy with operational PM that they don’t see the trends. Often the information is not there, especially if our reactive Problem Management process has become an extension of the Incident Management process. Then there is little or no formal proactive problem management.

13 Why Problem Management?
Why Problem Management? Enterprise Governance: COBIT is the business framework for enterprise IT management and governance (COBIT 5). COBIT 5 has only one goal for Problem Management, and that goal is focused on continual improvement: IT-related problems are resolved so that they do not reoccur. Risk Management. Continual Service Improvement.
Requirements for PLCs to have Enterprise Governance in place. Introduction of governance frameworks: COBIT is the business framework for enterprise IT management and governance, and the latest version is COBIT 5. Business has an obligation to manage IT risks as seriously as business risks. Business has become so dependent on IT that it has to manage IT risk to protect the business. The goal of problem management is focused on continual improvement: that IT-related problems are resolved so that they don’t reoccur. Risk Management and Continual Service Improvement. So what about Proactive Problem Management: where does it belong? Is it an Ops process or does it belong in CSI?

14 Effective Problem Management Metrics
Effective Problem Management Metrics. COBIT 5 has specified 5 metrics around the goal: (1) decrease in number of recurring incidents caused by unresolved problems; (2) % of major incidents for which problems were logged; (3) % of workarounds defined for open problems; (4) % of problems logged as proactive; (5) number of problems for which a satisfactory resolution addressing root causes was found.
5 metrics to support the goal have been identified. COBIT tells you the what; ITIL is the how. These metrics are business focussed.
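As a rough sketch of how the five metrics listed on this slide could be calculated from problem and major incident records, the example below uses hypothetical record types and field names (`Problem`, `MajorIncident`, `proactive`, `has_workaround` and so on). None of these come from COBIT 5 or from any particular Service Desk tool, and the exact definitions of the metrics would need to be agreed within your own organisation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Problem:
    problem_id: str
    proactive: bool            # logged proactively rather than after an incident
    is_open: bool
    has_workaround: bool
    root_cause_resolved: bool  # satisfactory resolution addressing root cause


@dataclass
class MajorIncident:
    incident_id: str
    problem_id: Optional[str]  # linked problem record, if one was logged


def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place; 0.0 when there is no data."""
    return round(100.0 * part / whole, 1) if whole else 0.0


def recurring_incident_decrease(previous_count: int, current_count: int) -> int:
    """Metric 1: drop in recurring incidents between two reporting periods."""
    return previous_count - current_count


def problem_metrics(problems: List[Problem], majors: List[MajorIncident]) -> dict:
    """Metrics 2 to 5, computed from the records for one reporting period."""
    open_problems = [p for p in problems if p.is_open]
    return {
        "pct_major_incidents_with_problem": pct(
            sum(1 for m in majors if m.problem_id), len(majors)
        ),
        "pct_open_problems_with_workaround": pct(
            sum(1 for p in open_problems if p.has_workaround), len(open_problems)
        ),
        "pct_problems_proactive": pct(
            sum(1 for p in problems if p.proactive), len(problems)
        ),
        "problems_resolved_at_root_cause": sum(
            1 for p in problems if p.root_cause_resolved
        ),
    }
```

Note that the calculation only works if major incidents carry a link to their problem record, which is the same Incident to Problem linkage the earlier slides ask about.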

15 What’s the problem with Problem Management?

