The Essential Major Incident Process Demystified

The Major Incident Process

Updated: 6 days ago

The following summarises the key components of a major incident process under ITIL.


The following video gives an overview of the process and highlights some key points about it.


Major Incident Management Process Overview

The major incident process diagram
Major Incident Process.pptx (download, 182KB)

The following steps summarise the major incident process. They are also available in the downloadable file above, which you can tailor to your own purposes.


1) Investigation


Objective

  • Swiftly identify the root cause of the incident and explore initial mitigation strategies.

Procedure

  • The designated receiving team has one hour for the initial investigation.

  • In many circumstances, it is more efficient to restart the affected component or service promptly than to carry out in-depth diagnostics.

  • Should the initial investigation require external expertise or additional support, the Major Incident Manager (MI Mgr) may be consulted.


2) Contact the Major Incident Manager

Objective

  • Ensure coordinated and effective incident handling.

Procedure

  • If the service disruption persists beyond one hour without a resolution, the incident owner must engage the MI Mgr.

  • The MI Mgr takes responsibility for overseeing the recovery process and facilitating communications, even if a resolution is expected imminently.
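
The one-hour trigger above can be expressed as a simple check. The following is a minimal sketch, not a definitive implementation: the function name, parameter names and constant are hypothetical, and only the one-hour window comes from the procedure itself.

```python
from datetime import datetime, timedelta

# Taken from the procedure above: one hour for the initial investigation.
# The constant and function names are hypothetical.
INITIAL_INVESTIGATION_WINDOW = timedelta(hours=1)


def must_engage_mi_manager(disruption_start: datetime,
                           now: datetime,
                           resolved: bool) -> bool:
    """Return True when the incident owner should engage the MI Mgr."""
    if resolved:
        # A resolved disruption no longer needs escalation under this rule.
        return False
    # "Beyond one hour" is read here as strictly more than 60 minutes.
    return now - disruption_start > INITIAL_INVESTIGATION_WINDOW
```

A check like this could sit behind an alert in the incident management tool so that the escalation does not rely on someone watching the clock.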


3) Assess Criteria for Major Incident


Objective

  • Determine the gravity of the situation and decide on the course of action.

Procedure

  • The MI Mgr evaluates whether the ongoing situation qualifies as a major incident based on predefined criteria.

  • This evaluation ensures that the MI process isn't initiated unnecessarily, preventing resource wastage.
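
Writing the predefined criteria down explicitly makes the assessment repeatable. The sketch below is purely illustrative: the article does not list the criteria, so every field, the threshold of 100 users and the rule that all conditions must hold are assumptions that each organisation would replace with its own.

```python
from dataclasses import dataclass


@dataclass
class IncidentSnapshot:
    # All fields are hypothetical examples of what might be assessed.
    users_affected: int
    business_critical_service: bool
    workaround_available: bool


def qualifies_as_major_incident(snapshot: IncidentSnapshot,
                                min_users_affected: int = 100) -> bool:
    """Apply example criteria; the rule itself is an assumption."""
    return (
        snapshot.business_critical_service
        and not snapshot.workaround_available
        and snapshot.users_affected >= min_users_affected
    )
```

Keeping the criteria in one place like this makes it easier to review them and to explain why the MI process was, or was not, initiated.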


4) Investigate & Escalate


Objective

  • Perform an in-depth analysis and involve higher tiers if needed.

Procedure

  • The investigating team is given a predetermined window to delve deeper into the incident.

  • The primary focus remains on service recovery, with root cause analysis being a subsequent priority.

  • All significant findings and updates are meticulously recorded.


5) Manage Recovery & Comms

Objective

  • Restore normalcy and keep stakeholders informed.

Procedure

  • The MI Mgr holds the reins, supervising all efforts aimed at service restoration.

  • While the MI Mgr might seek external assistance, they remain the central figure guiding the overall recovery process.




6) Investigation Review Meetings


Objective

  • Facilitate effective team communication during the crisis.

Procedure

  • If the situation demands, the MI Mgr assembles the concerned teams for urgent review meetings.

  • These meetings are focused on framing the problem, prioritising actions, and assigning ownership to ensure swift resolution.


7) Update Stakeholders

Objective

  • Keep major stakeholders in the loop.

Procedure

  • The MI Mgr leads the communication efforts, updating stakeholders about the ongoing progress.

  • Updates are structured and provided in a consistent, standard format.
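
One way to keep updates in a consistent, standard format is to generate them from a fixed template. The following is a minimal sketch; the field names and the layout of the rendered message are assumptions, not a format prescribed by the process.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class StakeholderUpdate:
    # Hypothetical fields; adapt them to your own update template.
    incident_id: str
    issued_at: datetime
    current_status: str
    business_impact: str
    next_steps: str
    next_update_due: datetime

    def render(self) -> str:
        """Render the update with the same headings every time."""
        return "\n".join([
            f"Major incident update: {self.incident_id}",
            f"Issued: {self.issued_at:%Y-%m-%d %H:%M}",
            f"Status: {self.current_status}",
            f"Impact: {self.business_impact}",
            f"Next steps: {self.next_steps}",
            f"Next update due: {self.next_update_due:%Y-%m-%d %H:%M}",
        ])
```

Because every update carries the same headings in the same order, stakeholders can find the status and the time of the next update at a glance.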


8) Communicate Resolution

Objective

  • Inform stakeholders once the service is restored.

Procedure

  • Once service is restored, the MI Mgr informs all concerned parties, possibly enlisting support from the Help Desk.


9) Produce an MI Report

Objective

  • Document the incident and its handling for future reference.

Procedure

  • The MI Mgr drafts a comprehensive report capturing the incident's impact, significant events, follow-up actions, and, if identified, the root cause.

  • This report is shared within 24 hours of incident closure. If the root cause has not been identified, the problem management process is triggered.
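
The report contents and the two rules above (the 24-hour deadline and the hand-off to problem management) can be captured in a small data structure. This is a sketch under stated assumptions: the class, field and function names are hypothetical, while the report sections and the 24-hour window come from the procedure.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

# From the procedure above: the report is shared within 24 hours of closure.
REPORT_DEADLINE = timedelta(hours=24)


@dataclass
class MajorIncidentReport:
    impact: str
    significant_events: List[str]
    follow_up_actions: List[str]
    root_cause: Optional[str] = None  # None when the cause was not identified

    def needs_problem_management(self) -> bool:
        """An unknown root cause triggers the problem management process."""
        return self.root_cause is None


def report_due_by(incident_closed_at: datetime) -> datetime:
    """Latest time by which the report should be shared."""
    return incident_closed_at + REPORT_DEADLINE
```

For example, a report created with no `root_cause` returns `True` from `needs_problem_management()`, which signals that a problem record should be raised.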



10) Close Incident


Objective

  • Close the incident record and complete the process.

Procedure

  • The MI record is formally closed, recording the location of the MI report and any follow-up activities.
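
The ten steps above can be modelled as an ordered sequence, for example to track where an incident currently sits in the process. The sketch below assumes a strictly linear flow, which is a simplification: in practice steps such as recovery, review meetings and stakeholder updates may repeat, and an incident that does not meet the major incident criteria would leave the process early. The enum and function names are hypothetical.

```python
from enum import IntEnum
from typing import Optional


class MIStep(IntEnum):
    # Step numbers follow the headings in this article.
    INVESTIGATION = 1
    CONTACT_MI_MANAGER = 2
    ASSESS_CRITERIA = 3
    INVESTIGATE_AND_ESCALATE = 4
    MANAGE_RECOVERY_AND_COMMS = 5
    INVESTIGATION_REVIEW_MEETINGS = 6
    UPDATE_STAKEHOLDERS = 7
    COMMUNICATE_RESOLUTION = 8
    PRODUCE_MI_REPORT = 9
    CLOSE_INCIDENT = 10


def next_step(current: MIStep) -> Optional[MIStep]:
    """Return the following step, or None once the incident is closed."""
    if current is MIStep.CLOSE_INCIDENT:
        return None
    return MIStep(current + 1)
```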


Major Incident Roles & Responsibilities

Role

Responsibilities

Help Desk Staff

• Identify and log incidents as they are reported by users.

• Capture information that will help in the analysis of the issue.

• Provide updates to customers where requested.

• Escalate incidents to the appropriate technical teams or the Major Incident Manager as needed.

Investigating Technical Teams

• Collaborate with other technical teams or 3rd party suppliers as necessary to resolve incidents.

• Implement fixes, workarounds, or recovery actions to restore services.

• Update the incident management system with the incident resolution progress and status.

• Provide input to the Major Incident Manager on the incident status, impact, and expected resolution time.

• Participate in post-incident reviews to identify areas for improvement and implement corrective actions.

Major Incident Manager

• Coordinate the overall major incident response and resolution process.

• Engage and mobilize necessary resources, including technical teams and 3rd party suppliers.

• Establish and maintain communication channels with stakeholders, including senior management and affected users.

• Ensure timely and accurate updates are provided to stakeholders.

• Monitor and track the progress of major incident resolution.

• Facilitate post-incident reviews to identify areas for improvement and implement corrective actions.

• Create the Major Incident Report.



About the author

Hi, I'm Alan, and I have been working within the IT sector for over 30 years.

For the last 15 years, I've focused on IT Governance, Information Security, Projects and Service Management across various styles of organisations and markets.

I hold a degree in Information Systems, an ITIL Expert certificate, and PRINCE2 Practitioner and CISMP (Information Security Management) certifications.
