Arm’s vulnerability management program has legs

CISO Tim Fitzgerald’s vulnerability management program has delivered measured improvements and earned the security team recognition as a business-enablement function.


The WannaCry attack of 2017 was a wakeup call for Arm, a semiconductor IP company.

It took Arm more than two weeks to inoculate against the ransomware cryptoworm that was crippling systems worldwide, a response record that highlighted longstanding inadequacies in Arm’s ability to deal with vulnerabilities.

Tim Fitzgerald’s take was blunt: “The company’s ability to respond was compromised. The challenge was that Arm, an IP business, was running its core systems hard all the time so any downtime required for patching presented an issue for the IT department. It was my job to fix it.”

Fitzgerald joined Arm as its first CISO and senior vice president in September 2017, a position he says was created to recognize that the company needed both a more rigorous response to security events and a more mature security operation overall.

Fitzgerald has spent his tenure addressing both needs. He and his team are delivering measurable successes, and the showcase of that work is a vulnerability management program built for operational excellence.

Putting attention on patching

Fitzgerald started with a focus on security fundamentals; the way he sees it, getting security basics right – although more challenging than it may seem – “takes out a big chunk of the risk.” He says that then enables the security function to focus on the remaining risk, which typically requires more specialized skills and dedicated strategies.

With that in mind, he first sought to bring better visibility and process improvement to the patching program.

Such tasks can be a challenge in many organizations, but Fitzgerald says they were particularly difficult to execute at Arm because of the company’s business objectives and the technical environment required to support them.

Arm relies on a highly complex high-performance compute infrastructure comprising several hundred thousand CPUs and running at 95%-plus capacity at all times.

“There are a lot of systems there that need to be maintained, and that’s on top of our corporate infrastructure. And those are paramount to how we test and validate our products, so there’s no tolerance for downtime,” Fitzgerald explains.

As a result, engineering continually sought to delay patching because the work would take engineering compute and infrastructure environments offline; the company’s business case put product development’s demanding availability needs ahead of patching schedules.

“These systems could not come down. But we had these huge number of systems that had to be kept up to date, and that created a real challenge for us,” Fitzgerald says.

Taking on too much risk

Yet that business case came at a cost. The need for high availability had contributed to the company’s inability to quickly and effectively manage risk and vulnerabilities, an inability that Fitzgerald inherited when he arrived at Arm. It also created growing technical debt and a lack of organizational resiliency.

The company’s response to the WannaCry threat illustrated those points, Fitzgerald says. It also showed the corresponding inefficiency of its processes, as patching systems in response to the WannaCry threat caused unplanned diversions of significant engineering and security resources from other priorities.

“We were taking on more risk than we wanted to and burning a lot of resources by not having the right processes in place,” he says.

Meanwhile, Fitzgerald was identifying other issues that impacted the company’s threat and vulnerability management capabilities. The security team had only about 50% visibility into the infrastructure, and there was tension between the groups that needed to work together to address the problem.

At the same time, the company, like all others, was experiencing more attempted attacks.

But Fitzgerald had a vision of how to address all those points: “The goal was to make sure we knew all the systems that exist in our environment, that we could patch them and keep them live at the same time, which is pretty complex in Arm’s environment.”

Getting buy-in

Fitzgerald started to take steps toward that vision in early 2018.

He says he first considered the people element, including getting buy-in for his initiative, encouraging teams to work together, and giving workers the tools they needed to be successful.

He notes that even though company leaders recognized the need for improvement, he still had to sell his plans for achieving threat and vulnerability operational excellence.

“There was an awful lot of engagement and influence being applied,” he says, noting he had to educate some of his executive colleagues on the business imperatives for improvement and the challenges in achieving it. He also had to persuade some that he could create that operational excellence without impacting the business need for high availability.

Fitzgerald says he leaned into the corporate core value of “we, not I” to get teams to own the problem and the solution together.

“They’re genuinely interested in what’s good for our partners and the people who they work beside,” he adds, saying that that perspective helped him persuade teams to adopt new processes.

Fitzgerald started with the engineering environment, which supports the company’s core mission and its product development work.

Patch while flying

His team collaborated with the engineering group to size up the problem and build a business case, which engendered both shared interest in working on solving the problem and shared ownership in achieving success.

Then they re-imagined processes so that security could patch without taking the environment offline. That change was the keystone that made the initiative succeed.

“Historically they’d take down a data center’s assets for a weekend or whatever amount of time and patch as much as they could and bring it back up. We were significantly reducing capacity for a significant amount of time, and the business would say, ‘You can’t take it down because we have these deliveries,’” Fitzgerald says.

“So the solution couldn’t be about patching faster. We had to think about how we can do the work without taking down the environment at all. We had to think about re-engineering so we can use that 5% and patch while flying,” he says, explaining that the redesigned structure allows for iterative patching. “We found a way to keep up with this all the time. That’s where the real innovation came from.”
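The article does not describe how Arm implemented this, so the following is only a minimal sketch of the general idea of iterative patching within spare capacity. It assumes a hypothetical flat list of host names and treats the roughly 5% headroom Fitzgerald mentions as a cap on how many hosts may be offline at once; the function name `rolling_patch_batches` and the host naming are illustrative, not Arm’s tooling.

```python
import math


def rolling_patch_batches(hosts, max_offline_fraction=0.05):
    """Split hosts into consecutive batches for rolling patching.

    Assumption (not from the article): each batch is taken offline,
    patched, and returned to service before the next batch starts, so
    the share of hosts offline at any moment stays near the cap.
    """
    if not 0 < max_offline_fraction <= 1:
        raise ValueError("max_offline_fraction must be in (0, 1]")
    # Round down so a batch never exceeds the cap; always allow at
    # least one host, which can exceed the cap on very small fleets.
    batch_size = max(1, math.floor(len(hosts) * max_offline_fraction))
    return [hosts[i:i + batch_size] for i in range(0, len(hosts), batch_size)]


# Hypothetical fleet of 200 nodes patched in small waves.
hosts = [f"node-{n:03d}" for n in range(200)]
batches = rolling_patch_batches(hosts)
for wave, batch in enumerate(batches, start=1):
    # A real workflow would drain, patch, verify, and restore here.
    print(f"wave {wave}: {len(batch)} hosts, first={batch[0]}")
```

The design choice in this sketch is to trade patch speed for availability: the full fleet takes many waves to cover, but capacity never drops by more than the configured share, which matches the point that the fix was not about patching faster.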

Racking up wins

Fitzgerald also added workers. His own security team went from three employees to 35, with the threat and vulnerability management group going from one to five. He pulled together the IT engineering group, the product engineering group, and enterprise security to tackle the work. And he invested in a new vulnerability management platform from software maker Kenna, which gave the security team visibility across the enterprise in a single pane of glass—insight that allowed the security function to prioritize needed work.

The new platform also gave workers proof of success, Fitzgerald says, as vulnerability scores dropped from above 900 to below 300, a range that’s well within the acceptable risk for Arm.

That early win got others on board as security worked down its prioritized list of functional areas to tackle.

“Then it took on a life of its own, with executives championing the successes in their own sections,” Fitzgerald adds. “We had people seeking us out, rather than security finding them and saying, ‘You have a problem.’”

Fitzgerald’s approach delivered measured improvements: engineering downtime required for patching dropped from 100% to zero, the time to remediate critical vulnerabilities was cut from 14 days to 72 hours, and infrastructure asset visibility jumped from 50% to 100%.

At the same time, the security team managed to resolve a staggering volume of vulnerabilities, with 748,097 vulnerabilities fixed across more than 26,000 assets.

Such improvements have also made Arm more resilient, with the amount of unplanned downtime almost halved as a result.

Moreover, the success of the vulnerability management initiative has given Arm a model for other security initiatives, Fitzgerald adds.

“It set the tone for how we’re going to work with the organization, it made us a business-enablement function, and that has had tons of legs to it in terms of benefits,” he says. “This set security up to be seen as a great partner.”

Copyright © 2022 IDG Communications, Inc.
