The world of service management has changed as technology has shifted from providing tools for administrative support to being fully embedded in the delivery of the business’s core function. There’s a world of difference between using an accounts payable system to generate invoices and using a website or mobile application to engage customers and provide services to them. Getting a ride, booking a room, and streaming audio and video entertainment all require a different mindset, and that mindset sets the basis for what ITSM is in 2020 and beyond.
Preventing Service Unavailability
The keywords for ITSM in 2020 and beyond are prevention and innovation. Keeping public-facing services available means being able to prevent or address service-impacting issues instantly. The answer to this challenge is often found in automation. Service management platforms began embracing “big data” years ago, but organizations lacked the use cases that delivering publicly available services creates.
Automation based on artificial intelligence (AI), relying on data available in a centralized platform, is a crucial innovation that can prevent service outages and make those that do occur faster to diagnose and resolve. The key to success in this area is configuration management. A robust configuration management system or set of federated configuration management databases (CMDBs) is more critical than ever before, as it provides the data hub that brings these capabilities together.
The graphic above shows the relationship between the data needed to automate issue detection and resolution and the CMDB:
- Monitoring systems, incidents, known errors, and change records provide a robust set of historical data and trigger automated routines that look for new issues and concerns.
- Using machine learning algorithms, automation can use the historical data set and the CMDB to perform more investigations into these triggers than an IT staff could ever monitor and address manually.
- If investigation uncovers an incident that has not yet impacted service, preventative steps can be taken automatically based on previous occurrences, while investigations that reveal a service impact can automatically log an incident and alert appropriate support teams.
- The CMDB is core to success in this process in several ways:
- Relationships between configuration items (CIs) aid in correlation (a long-known fact).
- Identifying the affected service and potential impact is key to prioritizing the response: business-critical or public-facing services would engage major incident management processes, while lower-priority services would generate an incident handled through standard incident management processes.
- The CMDB also stores information about the current use of a configuration item, indicating whether it’s a production CI or not. In this way, automated remediation is not attempted in a pre-production environment where the incident was caused intentionally by development or testing activities.
- Service Mapping allows you to visualize the relationships within the CMDB. Leveraging various mapping views allows IT staff to quickly identify the infrastructure that is supporting enterprise services and focus support and maintenance efforts on the critical infrastructure.
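The triage rules in the list above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular service management platform implements it: the `ConfigurationItem` fields, the `SERVICE_CRITICALITY` table, the sample CMDB, and the returned action names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ConfigurationItem:
    name: str
    environment: str   # "production", "test", ... (illustrative values)
    service: str       # business service this CI supports


# Hypothetical criticality rating for each business service.
SERVICE_CRITICALITY: Dict[str, str] = {
    "online-booking": "business-critical",
    "internal-wiki": "low",
}


def triage_alert(ci_name: str, service_impacted: bool,
                 cmdb: Dict[str, ConfigurationItem]) -> str:
    """Choose a response for an alert raised against a CI."""
    ci = cmdb[ci_name]

    # Skip non-production CIs: the issue may have been caused on
    # purpose by development or testing activities.
    if ci.environment != "production":
        return "no_action"

    # No service impact yet: take preventative steps automatically.
    if not service_impacted:
        return "auto_remediate"

    # Service is impacted: route by the criticality of the service.
    if SERVICE_CRITICALITY.get(ci.service, "low") == "business-critical":
        return "major_incident"
    return "standard_incident"


# Tiny sample CMDB, made up for this example.
EXAMPLE_CMDB = {
    "web-01": ConfigurationItem("web-01", "production", "online-booking"),
    "web-test-01": ConfigurationItem("web-test-01", "test", "online-booking"),
    "wiki-01": ConfigurationItem("wiki-01", "production", "internal-wiki"),
}
```

With this sample data, `triage_alert("web-01", True, EXAMPLE_CMDB)` returns `"major_incident"`, while the same alert against `"web-test-01"` returns `"no_action"`. Every branch depends on a CMDB attribute, which is why incomplete CMDB data undermines the automation.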
The key takeaway in this example is to look at the current state of the CMDB before considering an automation project. Building or enhancing the CMDB using discovery and service mapping tools is a far better use of resources than implementing automation without the data that is key to driving success.
Similarly, a robust CMDB also lowers the risk of making changes, but only if there’s a way to utilize it. Here too, use of the same datastore and a different set of machine learning algorithms can predict the success of a change while it’s being recorded. Using automation, the change type can be automatically set when the change is entered and saved. Changes that historically offer little to no risk can be automatically classified as standard with peer or line manager approval only, while high-risk changes can drive a review and approval workflow that is more rigorous.
This is a different approach than many organizations take. Consider the amount of effort that goes into proposing standard changes and reviewing whether they can be treated as standard changes going forward. Letting historical risk drive an instant read on the change type eliminates the need for this effort. It also reduces the likelihood that human error in the decision leads to more rigor than is needed or exposes the production environment to unintended consequences of a change.
The CMDB is a key component of this as it ties the data together:
- Enables algorithms to properly identify the business risk based on the criticality of service(s) that could be affected by the change, helping to drive the change type decision and rigor needed.
- Allows the use of historical data regarding changes made to the affected configuration items to be weighed along with the business criticality: even if a critical system is being affected, if the type of change being made has never impacted service due to use of automated deployment, it may still be considered a low-risk change.
- The CMDB can also be used to store risk indicators for proposed changes to the configuration item in the CI record. For example, applications that are managed through a DevOps process or continuous deployment can carry an indicator, causing any application change to be ruled a standard change.
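As a rough sketch of how these three inputs could combine, the function below sets a change type from the CI's change history, the criticality of the affected service, and a continuous-deployment indicator on the CI record. It is a simple rule-based stand-in for the machine learning described above, not a trained model. The names `ChangeHistory` and `classify_change`, the thresholds, and the two result labels are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ChangeHistory:
    total: int    # past changes of this type made to the CI
    failed: int   # how many of those impacted service


def classify_change(history: ChangeHistory,
                    service_critical: bool,
                    continuous_deployment: bool,
                    min_samples: int = 20,
                    max_failure_rate: float = 0.02) -> str:
    """Return "standard" or "rigorous_review" for a proposed change.

    The thresholds are illustrative defaults, not recommendations.
    """
    # Risk indicator stored on the CI record: applications managed
    # through continuous deployment are ruled standard changes.
    if continuous_deployment:
        return "standard"

    # Too little history to trust the numbers: stay cautious.
    if history.total < min_samples:
        return "rigorous_review"

    failure_rate = history.failed / history.total

    # A change type that has never impacted service may be low risk
    # even when the affected service is critical.
    if failure_rate == 0:
        return "standard"

    # A small historical failure rate is tolerated only when the
    # affected service is not business-critical.
    if failure_rate <= max_failure_rate and not service_critical:
        return "standard"

    return "rigorous_review"
```

For example, `classify_change(ChangeHistory(total=50, failed=0), service_critical=True, continuous_deployment=False)` returns `"standard"`, matching the case of a critical system whose changes of this type have never impacted service.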
Speeding Up Response
Another area worth considering is the use of machine learning to speed up incident management and support processes managed by IT staff. Typically, operations staff continually monitor alerts and react to them. Automating the run-book operations that are typically performed manually lets staff pay more attention to decisions that cannot be made through automation. Automation will identify far more potential issues than an operations analyst can address, resolving many of them through pre-programmed operations or using resolution sets seen in event management tickets. Over time, this will surface more new, previously undetected issues that, if managed promptly, can be resolved before they degrade service. Thus, instead of managing routine issues, operations will be able to handle issues that fall into the category of “early warning” and are not yet repetitive, driving far more proactive incident management.
There are two benefits to the CMDB fueling these activities: first, more routine issues are identified, and they are identified more quickly; second, it enables a faster, more educated response to the new issues detected. The CMDB is the first tool operators will use when investigating a new issue, as it will tell them whether the issue might have been caused by a recent change and show the history of issues experienced by the CI. Additionally, as mentioned previously, it will enable automated prioritization of alerts as they are generated, ensuring operators deal with the most critical issues first.
The second area of support that is improved through automation is service desk support. Today’s service management tools can automatically search knowledge and prior incidents, alerts, and changes, offering potential solutions to the analyst as they are logging the incident. This speeds up their ability to help resolve the issue and can result in an increase in first-tier resolution. For issues related to changes made recently, it can also speed up escalation, by letting the analyst know immediately that the issue could be related to last night’s change.
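A minimal sketch of the "last night’s change" lookup described above: given the CI an incident is logged against, list the changes made to that CI within a lookback window. The `ChangeRecord` fields and the 24-hour default are assumptions for illustration; a real tool would query its own change records.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, NamedTuple


class ChangeRecord(NamedTuple):
    change_id: str
    ci: str
    implemented_at: datetime


def recent_changes(ci: str,
                   incident_logged_at: datetime,
                   changes: Iterable[ChangeRecord],
                   lookback: timedelta = timedelta(hours=24)) -> List[ChangeRecord]:
    """Return changes made to `ci` within `lookback` before the incident,
    most recent first, so the analyst sees the likeliest cause at the top."""
    earliest = incident_logged_at - lookback
    hits = [
        c for c in changes
        if c.ci == ci and earliest <= c.implemented_at <= incident_logged_at
    ]
    return sorted(hits, key=lambda c: c.implemented_at, reverse=True)
```

An empty result does not rule out a change as the cause. It only means that no change was recorded against that CI within the window, which again depends on how complete the CMDB and change records are.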
Finally, this level of automation can also make people aware of major incidents more quickly. Algorithms can help identify CIs or business services that have had a number of incidents logged against them in a short period of time, or those that already have an alert logged against them. This speeds up the escalation of the issue as a major incident in the event it slipped through the cracks of the automated processes used for identification and, as already mentioned, speeds up the investigation of the cause to enable faster restoration of service.
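The detection rule in this paragraph can be approximated with a simple count over a time window. The sketch below flags a CI when several incidents have been logged against it recently, or when a recent incident coincides with an alert already open on the same CI. The function name, the 30-minute window, and the threshold of three are illustrative assumptions.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Dict, Iterable, List, Set, Tuple


def major_incident_candidates(incidents: Iterable[Tuple[str, datetime]],
                              cis_with_open_alerts: Set[str],
                              now: datetime,
                              window: timedelta = timedelta(minutes=30),
                              threshold: int = 3) -> List[str]:
    """Return CI names worth reviewing as possible major incidents.

    `incidents` is a sequence of (ci_name, logged_at) pairs.
    """
    # Count incidents per CI that fall inside the window.
    counts: Dict[str, int] = defaultdict(int)
    for ci, logged_at in incidents:
        if now - window <= logged_at <= now:
            counts[ci] += 1

    # Rule 1: many incidents in a short period of time.
    flagged = {ci for ci, n in counts.items() if n >= threshold}

    # Rule 2: any recent incident on a CI that already has an alert.
    flagged |= {ci for ci in counts if ci in cis_with_open_alerts}

    return sorted(flagged)
```

The same counting could be done per business service instead of per CI by first mapping each CI to the service it supports through the CMDB.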
These examples are just the tip of the iceberg of how automation and artificial intelligence will change the definition of ITSM in 2020, but these and other innovative uses for the data we’ve collected for years can only be effective with a robust CMDB. So, this is the year to dust off the CMDB project plan that’s been sitting on the shelf. IT will be business as usual until your CMDB is ready to support innovation.