Incident management is crucial when dealing with major incidents. These are the crises that affect employees across the organization, disrupt your operations, and limit your ability to deliver on customer expectations.
While you may assume your company is prepared for a major incident, assumption is not a strategy when the stakes are this high. You need knowledgeable, trained staff members who know what must be done and who are equipped with the tools and resources to do their jobs effectively and efficiently.
Every company has a plan for major incident management. It may be as simple as “bring together a bunch of very smart people who can investigate and determine what is occurring” or it may be a sophisticated set of processes, decision structures, and protocols. If you don’t know your plan, now is a good time to review it.
Once a major incident starts, you must focus on acting, not planning, or you risk losing control of events. Here are five ways to make your major incident management plan better and easier to execute:
- Understand your data and its accuracy.
Having data available to aid in diagnostics and troubleshooting is critical in a major incident situation. IT configuration data, support contacts, and up-to-date dependency data are essential.
You must understand what you have, not just what you think you have. The quality of your available data determines how quickly you can analyze the symptoms of the incident, identify the underlying cause, and decide which actions will restore service.
- Pre-assemble the infrastructure data picture.
IT ecosystems are complex. If your company uses many third-party and cloud services that involve outside suppliers, putting the picture together can be even more challenging.
Before an incident starts, assemble your infrastructure data so you can find where it is confusing, where the blind spots are, and where records may be out of date. When the time comes to put this data to use, you must be able to trust it.
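As a rough illustration of that kind of pre-incident check, the sketch below flags two common problems: records that have not been verified recently, and components that appear in dependency data but are missing from the inventory. The record layout, the `last_verified` field, the 90-day threshold, and the asset names are all assumptions made for this example, not the format of any particular tool.

```python
from datetime import date, timedelta


def audit_inventory(assets, dependencies, today, max_age_days=90):
    """Return (stale_ids, unknown_ids) for a simple inventory audit.

    assets:       dict of asset id -> record with a 'last_verified' date (hypothetical layout)
    dependencies: list of (source_id, target_id) pairs
    """
    cutoff = today - timedelta(days=max_age_days)

    # Stale: records nobody has confirmed since the cutoff date.
    stale = sorted(
        asset_id for asset_id, record in assets.items()
        if record["last_verified"] < cutoff
    )

    # Blind spots: ids used in dependency data that the inventory does not know about.
    referenced = {asset_id for pair in dependencies for asset_id in pair}
    unknown = sorted(referenced - set(assets))

    return stale, unknown


# Made-up sample data for illustration only.
assets = {
    "web-01": {"last_verified": date(2024, 5, 20)},
    "db-01": {"last_verified": date(2023, 11, 2)},
}
dependencies = [("web-01", "db-01"), ("web-01", "cache-01")]

stale, unknown = audit_inventory(assets, dependencies, today=date(2024, 6, 1))
print("Stale records:", stale)
print("Missing from inventory:", unknown)
```

Running a check like this on a schedule, rather than during an outage, gives you time to correct the records before you need to rely on them.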
- Capture periodic last-known-good configurations.
Your IT environment is constantly evolving with every new user, every device acquired, and every piece of software updated or deployed. Change is one of the biggest causes of outages and major incidents.
Unfortunately, once one or more changes have been made and failures start to occur, it can be difficult to know what the environment looked like before, which leaves you without a target state for restoration.
Periodically capturing last-known-good snapshots of your IT configuration data lets you compare conditions during an incident against a trusted baseline, so you can assess impact and identify root causes introduced by change.
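To make the comparison concrete, here is a minimal sketch that diffs a last-known-good snapshot against the current state. It assumes each snapshot is a plain dictionary mapping a configuration item id to its attributes; the item names and version strings are invented for the example, and a real snapshot would likely come from a discovery tool or CMDB export.

```python
def diff_snapshots(last_known_good, current):
    """Compare two configuration snapshots.

    Each snapshot is a dict of item id -> dict of attributes (assumed layout).
    Returns (added_ids, removed_ids, changed), where changed maps an item id
    to {attribute: (old_value, new_value)}.
    """
    old_ids, new_ids = set(last_known_good), set(current)
    added = sorted(new_ids - old_ids)
    removed = sorted(old_ids - new_ids)

    changed = {}
    for item_id in sorted(old_ids & new_ids):
        old, new = last_known_good[item_id], current[item_id]
        # Record every attribute whose value differs, including ones
        # present in only one of the two snapshots (reported as None).
        fields = {
            key: (old.get(key), new.get(key))
            for key in sorted(set(old) | set(new))
            if old.get(key) != new.get(key)
        }
        if fields:
            changed[item_id] = fields

    return added, removed, changed


# Made-up snapshots for illustration only.
last_known_good = {
    "web-01": {"os": "Ubuntu 22.04", "nginx": "1.24.0"},
    "db-01": {"postgres": "15.4"},
}
current = {
    "web-01": {"os": "Ubuntu 22.04", "nginx": "1.25.3"},
    "cache-01": {"redis": "7.2"},
}

added, removed, changed = diff_snapshots(last_known_good, current)
print("Added:", added)
print("Removed:", removed)
print("Changed:", changed)
```

The output of a diff like this is a short list of candidate causes, and the baseline itself doubles as the target state when you roll back.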
- Determine your communication plan.
Major incident management involves managing perceptions. Users, management, and external stakeholders need confidence that the incident team is in control and is taking all necessary actions to restore service quickly.
Identifying impacted user groups, defining target audiences for incident communications, and preparing templates before an incident can significantly reduce the effort required to manage communications during the incident, and it reinforces the perception of control and organization.
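One lightweight way to prepare templates ahead of time is sketched below using Python's standard `string.Template`. The audience names, placeholder fields, and message wording are assumptions chosen for illustration; your own communication plan would define them.

```python
from string import Template

# Hypothetical pre-approved templates, one per target audience.
TEMPLATES = {
    "users": Template(
        "We are aware of an issue affecting $service. "
        "Impact: $impact. Next update by $next_update."
    ),
    "management": Template(
        "[$severity] $service incident. Impact: $impact. "
        "Current status: $status. Next update by $next_update."
    ),
}


def render_update(audience, **details):
    """Fill in the template for one audience.

    Template.substitute raises KeyError when a placeholder has no value,
    so an incomplete message fails here instead of being sent.
    """
    return TEMPLATES[audience].substitute(**details)


message = render_update(
    "users",
    service="email",
    impact="messages may be delayed",
    next_update="14:30",
)
print(message)
```

Because the wording is agreed in advance, the incident team only has to supply the facts, which keeps updates consistent across audiences.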
- Update your IT asset and support contact information.
Maintaining accurate, up-to-date IT asset and support information ensures that when a failure occurs, you know who to contact to help fix it.
People are hired or change positions, support vendors change, and assets are added to or replaced in your IT environment. Monitoring these changes and keeping records current helps avoid confusion in the middle of a major incident.
Major incidents will happen. You won’t know when they will occur. You won’t know what will cause them. You can take steps today, however, to improve your major incident management plan.
For more details, view our webinar titled “Improve Incident Management: Be Ready for the 3am Call”.
Virima is here to help. Virima features can automatically discover and map your critical IT resources and the interconnections that link them to one another, your applications and services, and your users.
To get started, contact us today to schedule a demo and explore the possibilities!