BT CSO lifts lid on incident response planning at telecoms giant

CSO Steven Benton explains how BT’s incident response planning approach empowers teams to make the right decisions.

The widely shared understanding that it is a matter of when, not if, an organisation will suffer a data breach underlines the importance of effective incident response planning. Having a plan in place and ready to execute in the event of an incident has become a key pillar of effective organisational security strategy.

At British multinational telecommunications giant BT, CSO Steven Benton leads the team responsible for security operations across the organisation and for protecting 30 million UK consumers, government, and critical national infrastructure from cyberthreats. Passionate about using BT’s unique view of global networks to push the boundaries of proactive and predictive security, he is instrumental in defining and delivering a multi-layered and multi-year security investment strategy designed to keep pace with the rapidly changing threat landscape. Integral to this is the orchestration of sound response, and he recently spoke to CSO to shine a light on various aspects of BT’s incident response planning.

Why is it important for BT to have an incident response plan?

The saying, “If you fail to plan, you are planning to fail,” rings especially true for businesses today. Cyberattacks are unfortunately part and parcel of operating in today’s digital world. This means contingency planning must be the focal point in any CSO’s cybersecurity strategy.

Preparing in times of peace for the times of war is fundamental. The more organised you and your security team are in advance of these high-pressure situations, the better the outcome will be when the inevitable comes. As in military training, exposing your team to a scenario when the stakes are lower lets you challenge individuals to manage their emotions and make informed decisions, so that during a real-life situation they can spot, assess, and control an incident more confidently.

What are the stages of incident response for BT and what does each entail?

There are three elements or stages in our incident response at BT: react, enrich, and protect. The react phase is the first stage of incident response. When an event triggers an alert in our systems, our team starts off by triaging the situation; we need to get as much information as possible about what it is, how it has happened, and how dangerous it might be in terms of service, customer, and reputational ramifications. Any one of these can be damaging for an organisation.

Depending on the severity, we will assemble team leaders from a number of functions within the business including technical, legal, data privacy, and press relations to form a security threat assessment group (STAG). The STAG will further investigate the impact of the incident on our customers, any data privacy implications and regulatory requirements, or any preparation to update the media. A significant part of this stage is figuring out the consequences of the required response to the event. For example, if we turn off a system to control an attack, how will this impact other services?

Simultaneously, as we gather artefacts, we enter the enrich stage where the threat intelligence unit conducts extensive technical research on the cause of the incident. Importantly, they will also look at the bigger picture of what has occurred, assessing whether the same event has occurred, or could occur, elsewhere in the network and whether it is part of a wider attack.

All this information will underpin our protect stage, where the security team does two things: puts a containment wrap around the epicentre of the incident to minimise its blast radius and uplifts protection and detection across the estate based on the attack indicators coming from the react and enrich teams.

These three stages happen concurrently as pace is crucial in incident response. It’s about taking a multi-dimensional approach that looks beyond the threat response room and at the wider impact on the business and our customers.
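As a rough illustration only, the idea of three stages running concurrently against one shared incident record could be sketched in Python as below. Every name here (`Incident`, `react`, `enrich`, `protect`, `respond`, `run_response`) is hypothetical and invented for this sketch; it is not BT's tooling, and the `asyncio.sleep(0)` calls merely stand in for real triage, research, and containment work.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Incident:
    """Hypothetical shared record that all three stages write to."""
    alert: str
    findings: list[str] = field(default_factory=list)


async def react(incident: Incident) -> None:
    # Triage: what is it, how did it happen, how dangerous might it be?
    await asyncio.sleep(0)  # placeholder for real triage work
    incident.findings.append(f"react: triaged alert '{incident.alert}'")


async def enrich(incident: Incident) -> None:
    # Threat intelligence: research the cause and look for the same
    # event elsewhere in the network.
    await asyncio.sleep(0)  # placeholder for real research work
    incident.findings.append("enrich: checked for the same event elsewhere")


async def protect(incident: Incident) -> None:
    # Contain the epicentre and uplift protection and detection.
    await asyncio.sleep(0)  # placeholder for real containment work
    incident.findings.append("protect: contained epicentre, uplifted detection")


async def respond(alert: str) -> Incident:
    incident = Incident(alert)
    # gather() schedules all three stages at once rather than in sequence.
    await asyncio.gather(react(incident), enrich(incident), protect(incident))
    return incident


def run_response(alert: str) -> Incident:
    """Synchronous entry point for the sketch."""
    return asyncio.run(respond(alert))
```

Calling `run_response("suspicious login")` returns an `Incident` whose `findings` list holds one entry from each stage. A real response would have the stages feed each other continuously rather than finish in a single pass; the sketch only shows that none of them waits for another to complete before starting.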

However, our response plan doesn’t stop there. It’s important to take time to conduct a post-mortem following the event. Here, we can contemplate the decisions made, the steps taken, and the outcome. We can think about what could have been done differently and, most importantly, what the key learnings are for next time. We are also interested to find out if there were alternative choices that we could have made for a better outcome in the situation. If there were, why didn’t we make them? Was it because we were missing an important piece of information? Was it because we had the right information, but we were not prepared emotionally to make the decision? Cybersecurity doesn’t end once the incident happens – it should be a continuous cycle of learnings and improvements.

How do emotional reactions contrast with rational responses in incident response, and what impact can they have?

Naturally, in a stressful environment like a cybersecurity incident, there is always going to be some degree of emotional reaction, especially if the event is far reaching and severely impacts a business’s reputation or bottom line. We are only human after all.

In the past, there was a perception of security incident response as finger-pointing, war room-like situations. Team members would come with preconceptions that if a decision was wrong, they would be blamed, so they would be hesitant to make decisions on their own and would instead be preparing their defence. This could delay the organisation’s response to incidents. Nowadays, our team recognises it isn’t a blame game.

Emotional responses at any of these stages remove the clarity and direction from your incident response. Having a plan, as in any critical situation in life, is a solution to this. It’s also about creating an environment where people feel supported and can put their emotions aside.

How do you ensure rational responses in each phase of incident response?

To begin with, you need to accept that you are operating in the unknown. As we establish new workplace norms, it’s important to adopt security strategies and responses based on uncertainty. Perfect decisions in incident response situations are almost impossible, so don’t wait until you have everything, or it’ll more than likely be too late. Make the best decision you can based on the information you have.

Second, security is a team effort. It’s really important to set your experts up for success, especially when incident response can be such a stress-inducing environment. For each of BT’s incident response phases, we deliberately divide up our teams into groups responsible for those phases to ensure everyone is comfortable, empowered, and can be laser-focused on the task at hand.

We’ve also prepared playbooks for different types of incidents that security teams can continually come back to. We use these to onboard new members too, so that they know best practices and are comfortable with their responsibilities from the outset. This is a really important part of ensuring your team operates with rational, rather than emotional, responses.

In addition, we have a strict protocol on communication that needs to be observed at all times during the response. A single source of truth managed by an incident manager creates a calm and ordered environment, encouraging new and existing members to put their best effort in.

New technologies such as machine learning and artificial intelligence can also transform incident response through not just real-time detection of issues, but also intelligent, automated responses. Automation can assist in understanding data, implementing updates, and patching to protect networks faster than an attack spreads, as well as predicting behaviours which can help security teams react to new threats faster. Automation, as a consequence, can reduce security teams’ stress.

Copyright © 2021 IDG Communications, Inc.
