Rebuilding after NotPetya: How Maersk moved forward

Oct 09, 2019 | 15 mins
Cyberattacks, IT Governance, Ransomware

In the wake of NotPetya attacks, Maersk’s IT and security teams embraced transparency, greater collaboration with business, and a risk-based approach.

Maersk container ship / shipping containers / abstract data
Credit: Jorgen Norgaard / WhatAWin / Getty Images

Few cyber incidents are as well-known as the NotPetya attack in 2017. The attack crippled a number of companies, none more publicly than shipping giant Maersk, which temporarily lost its entire global operations.

Speaking at the 2019 Gartner Risk Summit in London, Adam Banks, Maersk’s chief technology and information officer, and Maersk’s CISO, Andy Powell, gave an account of the day NotPetya struck the company. They detailed how they responded to and recovered from the incident, and what the company learned.

How NotPetya affected Maersk

Maersk, a major global shipping and logistics company, has 76 ports, around 900 ships, approximately 4 million containers, and around 1,000 warehouses. A large vessel carries 22,000 containers, each equivalent to an articulated lorry. “This is a data-centric business,” said Banks. “If you think about the way data is used in this sort of business, unlike financial services, you can’t lock it up, you can’t create a centralized data pool and put every form of defence around it.”

“This industry did a fantastic job in the 60s and 70s of standardizing containers. In my opinion, that made one major mistake: Every single one of those containers looks the same,” said Banks. “There is no way of telling what’s inside each box. It’s been sealed by customs, so someone’s got to work out which ship it goes on and what’s in it, and all of that is done with data.” Each container, he added, requires roughly 300 pages of information for customs and support and import/export documentation.

Because Maersk is such a data-centric business, the company was hit hard when NotPetya struck. The attack itself has been well-documented: In an attempt to disrupt Ukraine, state-sponsored actors corrupted an update of a Ukrainian tax preparation application called M.E.Doc with NotPetya, a wiper malware that masquerades as ransomware. That corrupted update infected companies globally, including FedEx, pharma company Merck & Co, food giant Mondelez, and construction firm Saint-Gobain.

“Maersk was not unusually weak, there was no flaw in what we were doing when that happened,” said Banks.

“Like most tax software, [M.E.Doc] is frequently updated: tax rules, tax changes. So, the majority are set up to auto-update, and it auto-updates roughly every eight to ten weeks. From a security perspective, something normal is happening: A server on your estate is calling out to a supplier, checking if it’s got the latest version of the software, downloading it. No alarms are going off in your antivirus software; there’s nothing you’re going to pick up around that activity. That’s basically how it got onto our network,” he added.

While NotPetya used the NSA EternalBlue exploit, Banks said the company had been 100% patched against it for about three months when this hit. It was undone by the malware’s ability to steal credentials. “Unfortunately for us that server was going to be migrated to the cloud the following weekend. The Sunday before the Tuesday this hit us, one of our domain admins had logged onto that box, done an inventory of what was on that box, and logged off. So, one of the first things stolen was a valid domain admin credential. That allowed the malware to move horizontally and vertically in the system, infecting approximately 55,000 devices within seven minutes.”

At around 10 a.m. on the day of the attack, Banks was at a corporate photo shoot when he received repeated calls from the team monitoring the company’s global network. “What they could see was parts of the network were starting to get quiet, illogically quiet,” he said. “Whilst they were on the phone, the monitoring center based in the UK lost visibility to anything. That’s the point we decided we were just going to switch everything off.”

Any devices not shut down at the time of the infection were lost; Maersk lost some 49,000 laptops and computer devices. All 1,200 of the critical business applications the company uses were rendered inaccessible. While some of those were mainframes or other pieces of hardware running Linux that were technically still running, they had no network to connect to.

Maersk’s network was completely wiped out. The company believed every single one of its 147 Active Directory instances had ceased to exist. Any phone numbers not written down were lost because phone contact lists were synced with Microsoft Outlook, and those lists were wiped once the phones could no longer reach the network. The company had lost the ability to communicate globally.

At this point, Maersk had no information about what was going on. Was this a targeted or widespread attack, an accident, a failure, or a mixture? The shut-down processes took around seven hours. During this time, Banks ensured he had all the phone numbers of the executive team written down on paper. He then phoned the CEO, who was just about to get on a plane, and told him that something major was happening.

By around 5 p.m. that day, the company was still unsure what had happened, but through contact with suppliers and partners realized this was a major global issue and not simply a targeted attack against Maersk.

What Maersk did next

By the morning of the next day Microsoft had come back to Banks with some good news and bad news. “It was their CSO who said, ‘Well, the good news is we’ve been able to decrypt one [device]. It took 22,000 computing hours to crack that. The bad news is that code only unlocks one device,’” he said. “In one office, there were about 1,000 machines, of which three were working. So, on a network of 60,000 devices roughly, multiply that number by 22,000 hours. Clearly, decrypting was not going to be plan A.”
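The scale Banks describes can be checked with quick arithmetic. The sketch below uses only the two figures he quoted, and it assumes that every device would take as long to crack as the first one and that the work is simply added up rather than run in parallel. Both are simplifications for illustration, not details from the talk.

```python
# Figures quoted by Banks; the 60,000-device count is his rough estimate.
HOURS_PER_DEVICE = 22_000
DEVICES = 60_000

# Assumption: each device costs the same effort as the first one cracked.
total_hours = HOURS_PER_DEVICE * DEVICES

# Convert to years of continuous computing (365-day years).
HOURS_PER_YEAR = 24 * 365
total_years = total_hours / HOURS_PER_YEAR

print(f"{total_hours:,} computing hours")
print(f"about {total_years:,.0f} computing years")
```

That works out to 1.32 billion computing hours, or roughly 150,000 years of continuous computing, which shows why decryption was ruled out almost immediately.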

As decryption wasn’t a realistic route, the next option was to try to recover lost data. Banks, however, decided against that too and chose instead to start again from scratch. “If you’ve been hit to the extent that we had, which was 99% infection, why would you restore the thing that had just been destroyed? The threat you’re facing is so robust it will destroy the same thing again and again and again,” he said. “So, we took the decision to rebuild from scratch rather than restore anything.”

“The CEO phoned me and said ‘Are you mad? Do you seriously think you can rebuild an entire company that’s got 30 years of technology history and 114 years of corporate history in a matter of weeks?’ To which my reply was, ‘Well, what else we’re going to do? Either we sit here and wait eight to ten days for all the vendors to produce some form of patch or piece of antivirus software that can track this, or we start trying to do something,” said Banks. “If we get eight to ten days in and we’re making no progress and they’ve got protection on the market, we’ll switch back.’”

“But it was literally that choice; we sit still, or we do something. So, we chose to do something.”

Some good fortune came Maersk’s way in the form of a power cut in Lagos, Nigeria, which meant the local office was offline at the time of the attack. A full, unencrypted copy of Active Directory was available. “The 23-year-old local IT support guy got a free trip on a Gulfstream G450, physically carrying the hard drive that we used as the yeast that built the rest of the network,” said Banks.

By around day 14, Maersk had its basic business technology back. While it was operating at reduced volumes, the company was up and running. To get back to full scale took another four weeks due to the difficulty of acquiring 17,000 new end-user devices.

“The question I get asked most is what did you do? What I wanted to do was go home.” Instead, Banks stayed at the company’s offices for the next 70 days straight, directing the recovery.

Be open and lean on partners, customers and suppliers

To supplement its small forensics team, the company quickly selected Deloitte as its partner to help it through the recovery. On day one Maersk had five members of Deloitte’s forensic team on site. By day four it had 180. “I would love to say [we chose them] after an extensive and deep piece of analysis, but we chose them because they were the nearest physically to the location I wanted to run recovery from,” said Banks. “I chose the UK, and Deloitte have a significant cyber center in the UK.”

The company also relied on its technology partners to help distribute data to local offices. “On days four to nine, we created a new build for clients and servers, and I wanted to get that new build for clients out there as quick as possible. [But] I’ve got to get it to 599 sites, and I’m currently propagating an Active Directory across the global network,” said Banks. “Even having the bandwidth, it was going to take nine days to rebuild the Active Directory server.”

“So, we asked our partners, Deloitte, IBM and Microsoft and said, ‘We’ve got some files that we would like to get globally distributed. Despite being the most infected company in the world, please will you transmit them over your network for us?’” he said. “Those companies agreed to transmit the build data to the local offices nearest Maersk sites, which were then picked up by local employees.” To let the technologists focus on rebuilding servers, sales and marketing teams were tasked with rebuilding the laptops.

Transparency was also a key part of Maersk’s response, and something it has been praised for in the aftermath. That also helped generate enough goodwill to get others to aid with its recovery efforts. “In other companies where I’ve been through similar type of events, all the salespeople get on the phone to the technical people to find out what they can tell their clients and customers,” said Banks. “That becomes distracting.” Instead, Banks distributed an update video every 12 hours.

The CEO, upon hearing of Banks’ desire for openness internally, decided to do the same externally. While Banks admits that he didn’t agree with that decision at the time, he has since changed his stance and acknowledges that outward transparency helped the company. “The world ran out of certain skills,” said Banks. “If you wanted Azure Cloud engineers, you couldn’t get them. We used the fact that we’ve been transparent to talk to our customers and say, ‘If you haven’t been hit by this and you have some Azure cloud engineers, can we borrow them for a week?’ Sixty-five or so people flew in and actually supported our team and recovery.”

NotPetya aftermath: Focus on recovery, prevention

Overall, Banks said the total cost of the outage was $350 million, including recovery costs of around $30 million. In the wake of the attack, the company learned some lessons and changed how it approaches security.

Powell said that while the company has taken proactive steps to reduce the likelihood of future attacks, Maersk assumes incidents will occur again and so has made changes in how it reacts. “Unless you are a government organization or a very, very highly invested-in bank, you are not going to stop a state-sponsored cyber weapon if it’s targeted at you,” he said. “We were the collateral victim of a state-sponsored attack and look what it did. If you adopt a strategy around that you will fail.”

“What we needed to do is ensure that if the business was hit again, we can recover quicker. You’re not sailing anywhere with a broken engine,” said Powell. “Active Directory, DHCP, DNS run your network. You’ve got to be able to recover that capability. We assume it will be taken down and build everything around recovering and operating it.”

Whereas the company needed nine days to recover some capabilities after NotPetya, Powell said everything is now built with the aim of recovering service within 24 hours.

As well as a focus on recovery, the company has adopted the NIST framework and implemented a risk-based approach around cybersecurity. The company created a triangle of risk with what it calls “extinction events” at the top flowing down into “brand disruption” events to help it prioritize where to focus its efforts.

Security learns how the business actually works

One major action this event led Maersk to take was to hit the reset button around monitoring and gain a clear vision of how the company really operates. “Visibility is king,” said Powell. “When the business changes, it will generate a vulnerability regardless what that change is. You need that visibility and you need that insight.”

“The main hurdle isn’t technology. It is business and the integration between the technology and business. There is no such thing as a divide between technology and business in any company anymore,” said Powell. “There should not be, particularly when it comes to cyber. You cannot build a cyber posture through technology. You can only build a cyber posture by integrating with the business and getting the business to understand their risk tolerance and what they’re prepared to accept.”

The security team sat down with business stakeholders and modeled the key business processes of the company and identified the ‘pinch points’ around cyber security. It then built its defenses and recovery plans around those business processes. “It means the business has to be yanked out of their comfort zone, sat in an office with a bunch of cyber folks and asked to talk through their business, which is quite hard. We’ve sat down with the business and we’ve given them accountability and ownership for cyber mitigating actions. They’re quite upset. We’ve had to educate, but it’s really working.”

As part of this goal of visibility, the company rolled out a “bring out your debt” initiative in which different parts of the business revealed exactly what systems, software and processes they had in place, even if they weren’t part of standard company operating policy. “The attack enabled Maersk to uncover how things actually operated, as opposed to how things were supposed to operate, and see the processes and the data supporting those processes that run beneath the radar,” said Powell.

“Suddenly, we understand how the whole company actually operates, where the data actually flows, and [we can] actually map and architect that. We realize what we’ve now got to do in terms of mitigations. We’ve not shot anyone, I would have liked to, but you smile and go, ‘fantastic! Let’s work together on not doing that again.’ Adopt that open mentality, find out how things actually work, map those processes, and then you know where to protect.”

Identity, vulnerabilities and hybrid SOCs

Identity and access was an area Maersk decided it needed to address. After working out who had access to what systems, the company reduced that number and introduced controls to manage and monitor the usage of those privileged accounts. “It is amazing how many people have got privileged access to everything in your company,” said Powell. “It’s like giving them all the keys to every single book. We immediately went after that and reduced that massively.”

As with many companies, Maersk has to tackle vulnerability management across a large and complex estate on an ongoing basis. “You can’t fix all your applications. On average, a Microsoft-based application has about 120 vulnerabilities that are killers,” Powell continued. “You need to be constantly having a vulnerability system to assess that and fix that on a rolling basis. Otherwise, you’re exposed.”

Thanks to the mapping of business flows, the company was able to focus its priorities, and from those 1,200 applications Banks said were critical, Maersk identified the 50 that it classes as business killers and fixed those first.

On the security operations side, Powell said a good hybrid SOC capability that combines on-premises and cloud-based resources is important. “Trust me, you need people who understand the business on prem.”

Powell is keen on validation of these changes. The company’s security posture has been audited 13 times in the last 17 months, with eight of those audits being external.

Post-cyberattack openness is important

Powell was at Capgemini at the time of the NotPetya attack and saw other organizations suffer damage. “Anybody who thinks that Maersk is the singular biggest example of what was going on is wrong. There are a lot of companies that were bigger than Maersk that were suffering, probably even worse but were not as transparent.”

Not only did being more transparent about its response and recovery progress allow Maersk to lean on partners and customers to provide the company with help it might not have otherwise received, but the company’s share price actually went up in the aftermath of the attack. Powell said that open attitude around its cybersecurity continues to pay dividends.

“Transparency is everything. Our clients at Maersk loved the fact that we told them on day one what was going on, and we included them throughout in what we were doing,” said Powell. “I’ll tell you now that we’ve retained contracts with our customers by proving that we can look after their data better than others. It’s a business winner.”