What is Problem Management?
A problem is defined as the unknown cause of one or more incidents. The Problem Management process manages the lifecycle of all problems that users experience with a service. Your service might always run smoothly, but it is more likely that users will at least occasionally experience an outage or other issue when using it. Some of these issues are caused by user error, while others point to a genuine problem with the service.
The main objective of Problem Management is to prevent incidents from recurring or, when they cannot be prevented, to ensure they are resolved as quickly as possible.
Why do I need Problem Management?
Problem Management will help you:
- Identify underlying causes of incidents and seek to provide a permanent fix
- Minimize the impact of recurring incidents that cannot be prevented
How do I establish and support Problem Management for my services?
You do not need to do any additional work to establish Problem Management for your service. A Problem record is created in either of the following situations:
- ITOC creates a Problem record at the resolution of a Major Incident
- The UIT Service Desk creates a Problem record when it has documented a pattern of Incidents that warrants deeper investigation to find the root cause
When a Problem record is created, it is typically assigned to the Tier 3 or Tier 4 Incident response group lead, who then assigns it to the Subject Matter Expert (SME) with the most knowledge about the problem. The SME completes the Root Cause Analysis and then resolves the record. Once the record is Resolved, the Problem Management group is notified and marks the record “Closed” when it deems the Root Cause Analysis sufficient.
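For teams that want to mirror this workflow in their own tooling, the record lifecycle described above can be sketched as a small state machine. This is only an illustration and not how the actual ticketing system is implemented: the `ProblemRecord` class, its method names, the "Open" status, and the example addresses are all assumptions. Only the "Resolved" and "Closed" states come from the process description. The text does not say what happens when a Root Cause Analysis is judged insufficient, so the sketch simply leaves the record in "Resolved".

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Status(Enum):
    OPEN = "Open"          # assumed name; the text does not name the initial state
    RESOLVED = "Resolved"  # named in the process description
    CLOSED = "Closed"      # named in the process description


@dataclass
class ProblemRecord:
    """Hypothetical model of a Problem record, for illustration only."""
    summary: str
    group_lead: str  # Tier 3 or Tier 4 Incident response group lead
    sme: Optional[str] = None
    root_cause_analysis: Optional[str] = None
    status: Status = Status.OPEN
    notifications: List[str] = field(default_factory=list)

    def assign_sme(self, sme: str) -> None:
        """The group lead hands the record to the most knowledgeable SME."""
        if self.status is not Status.OPEN:
            raise ValueError("only an open record can be assigned")
        self.sme = sme

    def resolve(self, root_cause_analysis: str) -> None:
        """The SME records the Root Cause Analysis and resolves the record."""
        if self.status is not Status.OPEN or self.sme is None:
            raise ValueError("record must be open and assigned to an SME")
        if not root_cause_analysis.strip():
            raise ValueError("a Root Cause Analysis is required to resolve")
        self.root_cause_analysis = root_cause_analysis
        self.status = Status.RESOLVED
        # Resolving the record notifies the Problem Management group.
        self.notifications.append("Problem Management group notified")

    def close(self, rca_sufficient: bool) -> bool:
        """Problem Management closes the record only if the RCA is sufficient."""
        if self.status is not Status.RESOLVED:
            raise ValueError("only a resolved record can be closed")
        if rca_sufficient:
            self.status = Status.CLOSED
        # Returns False when the record stays in Resolved.
        return self.status is Status.CLOSED


record = ProblemRecord(summary="Recurring login failures",
                       group_lead="lead@example.org")
record.assign_sme("sme@example.org")
record.resolve("Expired certificate on the authentication server")
record.close(rca_sufficient=True)
print(record.status.value)  # prints: Closed
```

Guarding each transition keeps a record from being closed before a Root Cause Analysis exists, which matches the order of steps in the process above.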