
Introduction to the Problem Management Process

Updated: Apr 26



What is Problem Management?

Problem management focuses on identifying, analysing, and resolving the root causes of recurring IT issues.


A problem is a recurring or underlying cause of one or more incidents, often resulting in disruption or degradation of IT systems, applications, or infrastructure. For example, rebooting a laptop may resolve an incident, but if the root cause is not fixed, the issue could recur.




In contrast to incident management, which aims to restore normal service quickly, problem management seeks to prevent incidents from reoccurring or minimise their impact on IT services.


The problem management process involves identifying problems, prioritising them based on impact and urgency, conducting root cause analysis, implementing solutions to prevent a recurrence, and documenting problem resolution efforts.


Problem management aims to enhance overall IT service quality by reducing the frequency and impact of recurring incidents.


If your organisation doesn't use the term "problem management," think of it as "root cause analysis" or "issue resolution."


If the distinction isn't clear yet, that's OK. Although they are similar in process and approach, incident and problem management are quite different, and a later section explores the differences. First, here is how the problem management process normally works.


The Problem Management Process Explained



[Figure: The problem management process diagram]

The Process Steps Summarised

Step 1: Problem Identification

Problems are identified in two main ways:

  • Proactive problem identification: Analysing incident data to detect patterns, trends, and potential problems.

  • Reactive problem identification: Recognising recurring incidents or major incidents reported by users or technical staff.

Step 2: Problem Logging

The problem is logged with all relevant information, including links to any incidents.

Step 3: Categorisation & Prioritisation

The problem is assigned to a specific category based on its nature and type, facilitating better tracking, reporting, and analysis. The problem is then assessed in terms of impact and urgency to determine its priority, which guides the allocation of resources and timelines for resolution.

Step 4: Investigation & Diagnosis

Investigation and diagnosis by technical staff will result in two potential outputs:

  • Root cause analysis: Investigating the underlying causes.

  • Workarounds: Identifying temporary solutions or workarounds to minimise impact while a permanent fix is being developed.

Step 5: Resolution

Design, test, and implement a permanent solution or change to address the root cause and prevent the recurrence of the problem.

Step 6: Update Knowledge Base

Documenting the problem, its root cause, and the resolution as a “known error” to facilitate faster future resolution of similar incidents or problems.

Step 7: Close Problem & Incidents

Verifying that the problem is resolved, updating the problem record with the resolution details, and formally closing the problem and any linked incidents.
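
The steps above can be read as a simple record lifecycle. The following Python sketch is illustrative only: the `ProblemRecord` class, the status names, the three-point impact and urgency scale, and the `calculate_priority` formula are all assumptions made for this example, not part of any particular ITSM tool or standard.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import List, Optional


class Level(IntEnum):
    """Assumed three-point scale used for both impact and urgency."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical status names mirroring steps 2 to 7 of the process, in order.
STATUSES = ["logged", "prioritised", "investigating",
            "resolved", "documented", "closed"]


def calculate_priority(impact: Level, urgency: Level) -> int:
    """Combine impact and urgency into a priority from 1 (highest) to 5 (lowest)."""
    return 7 - (int(impact) + int(urgency))


@dataclass
class ProblemRecord:
    summary: str
    linked_incidents: List[str] = field(default_factory=list)
    status: str = STATUSES[0]
    priority: Optional[int] = None
    root_cause: Optional[str] = None
    workaround: Optional[str] = None

    def advance(self) -> str:
        """Move the record to the next status; a closed problem cannot advance."""
        position = STATUSES.index(self.status)
        if position == len(STATUSES) - 1:
            raise ValueError("problem is already closed")
        self.status = STATUSES[position + 1]
        return self.status


# Usage: log a problem linked to two incidents, then prioritise it.
problem = ProblemRecord("Laptops drop Wi-Fi after sleep", ["INC-101", "INC-117"])
problem.priority = calculate_priority(Level.HIGH, Level.MEDIUM)
problem.advance()
print(problem.priority, problem.status)  # 2 prioritised
```

Keeping the statuses in a single ordered list makes it hard to skip a step by accident, such as closing a problem before the knowledge base has been updated. A real tool would likely allow more transitions, for example reopening a problem.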

Problem Management Roles & Responsibilities

Role

Responsibilities

Help Desk Analysts

  • Identifying and logging problems as they are detected from recurring incidents or trends.

  • Categorising problems based on impact, urgency, and root cause.

  • Capturing information to aid in problem analysis and resolution.

  • Providing updates to customers as needed or when requested.

  • Escalating problems to appropriate technical teams or Problem Managers as required.

Help Desk Manager / Team Leader

  • Managing the problem management process.

  • Coordinating responses to identified problems.

  • Assigning resources and prioritising tasks related to problem resolution.

  • Monitoring problem progress and keeping stakeholders informed.

  • Ensuring problems are logged, categorised, and resolved according to established procedures.

  • Conducting post-problem reviews and implementing improvements.

  • Reporting on metrics and analysis of problem trends.

Technical Support Staff

  • Collaborating with other technical teams or third-party suppliers to diagnose and resolve problems.

  • Conducting root cause analysis, identifying workarounds or implementing permanent fixes.

  • Updating the problem management system with the problem resolution progress and status.

  • Providing input to the Help Desk Manager on problem status, impact, and expected resolution time.

  • Participating in post-problem reviews to identify areas for improvement and implementing corrective actions.

Problem Manager

  • Overseeing the entire problem management process and ensuring its effectiveness.

  • Coordinating with technical support staff and other stakeholders for problem diagnosis and resolution.

  • Ensuring root cause analysis is conducted and that permanent fixes are implemented.

  • Maintaining the Known Error Database (KEDB) and ensuring it is updated with relevant information.

  • Monitoring problem trends, analysing data, and recommending preventive measures.

  • Reviewing problem management metrics, reporting on process performance, and driving continuous improvement initiatives.



The Differences Between Incident & Problem Management


In IT service management (ITSM), incident management and problem management both play critical roles in keeping operations running smoothly and delivering high-quality IT services. They share some similarities and are often interrelated, but it is important to understand the key differences between them.


So, let's explore the main distinctions between incident management and problem management to help you optimise your ITSM strategy.


Definition and Purpose

Incident Management: Incident management focuses on the prompt resolution of incidents or disruptions affecting IT services. An incident can be any event that negatively impacts the normal operation of an IT system, application, or infrastructure. The primary goal of incident management is to minimise the impact of incidents on business operations and restore normal service as quickly as possible.


Problem Management: Problem management is a proactive process that aims to identify, analyse, and resolve the root causes of recurring incidents. A problem is the underlying cause of one or more incidents. The main objective of problem management is to prevent incidents from reoccurring or reduce their impact on IT services by addressing the root causes.


Scope and Time Frame

Incident Management: The scope of incident management is typically limited to addressing the symptoms of an issue, rather than identifying and resolving the underlying cause. Incident management emphasises short-term solutions that help restore normal service operation as quickly as possible.


Problem Management: Problem management focuses on long-term solutions by identifying and addressing the root causes of recurring incidents. This process can involve in-depth analysis, testing, and implementation of permanent fixes, which may take longer than incident resolution.


Approach and Techniques

Incident Management: Incident management often employs a reactive approach, as it responds to incidents as they occur. Key techniques used in incident management include incident prioritisation, categorisation, and the application of predefined resolution procedures or workarounds.


Problem Management: Problem management takes a more proactive approach, analysing trends and patterns to identify potential problems before they cause incidents. Root cause analysis techniques, such as the 5 Whys, Ishikawa diagrams, and fault tree analysis, are commonly used in problem management to identify the underlying causes of problems.
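
Proactive trend analysis can start very simply, by counting how often the same kind of incident recurs. The Python sketch below is a minimal illustration: the incident fields (`category`, `service`), the sample records, and the threshold of three are assumptions made for this example, not recommended values.

```python
from collections import Counter


def problem_candidates(incidents, threshold=3):
    """Return (category, service) pairs that occur at least `threshold` times."""
    counts = Counter((i["category"], i["service"]) for i in incidents)
    # most_common() sorts by frequency, so the worst offenders come first.
    return [key for key, n in counts.most_common() if n >= threshold]


# Hypothetical incident records exported from a service desk tool.
incidents = [
    {"id": "INC-1", "category": "network", "service": "vpn"},
    {"id": "INC-2", "category": "hardware", "service": "laptop"},
    {"id": "INC-3", "category": "network", "service": "vpn"},
    {"id": "INC-4", "category": "network", "service": "vpn"},
    {"id": "INC-5", "category": "software", "service": "email"},
]

print(problem_candidates(incidents))  # [('network', 'vpn')]
```

A count like this only flags candidates. Someone still has to decide whether the repeated incidents share a root cause before a problem is logged.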


Relationship with Other ITSM Processes

Incident Management: Incident management is closely related to service request management, as both processes involve the handling of user-reported issues. Additionally, incident management often interfaces with change management, as changes may be required to resolve incidents.


Problem Management: Problem management is tightly linked to knowledge management, as it relies on the documentation and sharing of known errors and their resolutions. Problem management also interacts with change management when implementing permanent fixes to address root causes.
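
To illustrate how documented known errors can speed up incident handling, here is a minimal Python sketch of a known error lookup. The `KnownError` structure, the keyword-matching rule, and the sample entry are assumptions made for this example; real knowledge bases typically offer richer search.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class KnownError:
    symptom_keywords: List[str]  # lower-case words expected in the incident text
    root_cause: str
    workaround: str


def find_known_error(description: str,
                     known_errors: List[KnownError]) -> Optional[KnownError]:
    """Return the first known error whose keywords all appear in the description."""
    text = description.lower()
    for error in known_errors:
        if all(keyword in text for keyword in error.symptom_keywords):
            return error
    return None


# Hypothetical known error database with a single entry.
kedb = [
    KnownError(
        symptom_keywords=["wi-fi", "sleep"],
        root_cause="Outdated wireless driver",
        workaround="Disable and re-enable the wireless adapter",
    )
]

match = find_known_error("Laptop loses Wi-Fi after waking from sleep", kedb)
print(match.workaround if match else "No known error found")
```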


Categorising Problems

Problem management and incident management are related processes with distinct objectives: incident management focuses on quickly restoring regular service operation after an interruption, whereas problem management aims to identify, analyse, and resolve the root causes of recurring incidents.


As a result, the categories used in problem management can be more granular and focused on root causes. In contrast, incident management categories are usually centred around the types of incidents or the affected services.


While problem management categories don't need to be identical to incident management categories, they should be related and complementary to ensure consistency and facilitate effective communication and collaboration between the two processes.


Here are some guidelines to consider when defining problem management categories:

  1. Align problem categories with incident categories wherever possible to maintain consistency and ease of correlation between incidents and problems.

  2. Focus on the root causes and underlying issues in problem management categories rather than the symptoms or manifestations of the incidents.

  3. Consider creating subcategories within problem management categories to provide additional granularity and aid in identifying trends or patterns in root causes.

By tailoring your problem management categories to reflect the root causes and underlying issues, you'll be better equipped to address these problems and improve your overall IT service quality.
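
One way to follow these guidelines is to keep an explicit scheme in which the top-level names match your incident categories and the subcategories describe root causes. The Python sketch below is only an illustration; every category and subcategory name in it is invented for this example.

```python
# Hypothetical scheme: top-level names align with incident categories,
# while subcategories describe root causes rather than symptoms.
PROBLEM_CATEGORIES = {
    "Network": ["Configuration error", "Capacity shortfall", "Hardware failure"],
    "Software": ["Code defect", "Failed change", "Licensing"],
    "Hardware": ["Component failure", "End of life"],
}


def classify(incident_category: str, root_cause: str) -> str:
    """Return 'Category / Subcategory' after checking it against the scheme."""
    subcategories = PROBLEM_CATEGORIES.get(incident_category)
    if subcategories is None:
        raise KeyError(f"no problem category aligned with {incident_category!r}")
    if root_cause not in subcategories:
        raise ValueError(f"{root_cause!r} is not a subcategory of {incident_category}")
    return f"{incident_category} / {root_cause}"


print(classify("Network", "Capacity shortfall"))  # Network / Capacity shortfall
```

Rejecting unknown values keeps the category list small and consistent, which makes trends in root causes easier to spot in reports.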


Problem Management Maturity Model



Level 1: Ad-hoc

  • No formal problem management process is in place.

  • Reactive response to problems.

  • Reliance on individual efforts and experience.

Level 2: Basic

  • Basic documentation of problem management procedures.

  • Limited problem analysis and prioritisation.

  • Inconsistent use of tools and processes.

  • Escalation paths are not clearly defined.

Level 3: Structured

  • Well-defined problem management procedures.

  • Clear roles and responsibilities.

  • Standardised problem analysis, prioritisation, and escalation.

  • Improved collaboration and communication.

Level 4: Managed

  • Proactive problem management approach.

  • Continuous improvement processes in place.

  • Regular reviews and audits of problem management.

  • Established performance metrics and KPIs.

  • Focus on root cause analysis and prevention.

Level 5: Optimised

  • Fully integrated and optimised problem management.

  • Advanced analytics and automation.

  • Problem anticipation and prevention.

  • Continuous improvement is a core value.

  • Alignment with IT and business goals.

