Why code reuse is still a security nightmare

Despite best efforts to track software dependencies, blind spots remain, leading to silent vulnerabilities in software.

Modern software applications are stitched together from thousands of third-party components fetched from public repositories. This reuse of code has major benefits for the software industry, reducing development time and costs and allowing developers to add functionality faster, but it also generates major vulnerability management problems due to the complex system of dependencies that are often hard to track.

Vulnerabilities inherited from third-party code have plagued applications for years, but in the age of government-sponsored software supply chain attacks, the problem is more relevant than ever. Software composition analysis tools can help uncover some of these risks, but subtle dependency blind spots still make it hard for even security-conscious developers to catch all inherited flaws.

A recent scan of the NuGet repository by security researchers from ReversingLabs uncovered 50,000 packages that were using an outdated and vulnerable version of a popular library called zlib. Many of them did not explicitly list it as a dependency.

Dependency tracking is hit and miss

To discover all vulnerabilities, developers need to track not only which components they use in their own applications, but also the third-party libraries and packages those components are based on. The dependency chains can go many layers deep. An analysis of the npm repository performed in 2019 by researchers at Darmstadt University found that importing a JavaScript package introduced, on average, implicit trust in 79 other packages from 39 different maintainers. At the time, the researchers also found that almost 40 percent of packages relied on code with at least one publicly known vulnerability.
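To illustrate how quickly a dependency chain widens, here is a minimal Python sketch that walks a dependency graph and collects every package reachable from a starting point. The graph and the package names in it are hypothetical, made up for illustration; a real tool would read this data from a package manager's metadata.

```python
from collections import deque

def transitive_dependencies(graph, root):
    """Return every package reachable from root, excluding root itself."""
    seen = set()
    queue = deque(graph.get(root, ()))
    while queue:
        pkg = queue.popleft()
        if pkg in seen or pkg == root:
            continue  # already visited, or a cycle back to the root
        seen.add(pkg)
        queue.extend(graph.get(pkg, ()))
    return seen

# Hypothetical graph: package name -> packages it directly depends on.
DEPENDENCIES = {
    "my-app": ["web-framework", "logger"],
    "web-framework": ["http-parser", "logger"],
    "http-parser": ["compression-lib"],
    "logger": [],
    "compression-lib": [],
}

if __name__ == "__main__":
    direct = DEPENDENCIES["my-app"]
    everything = transitive_dependencies(DEPENDENCIES, "my-app")
    print(f"direct: {len(direct)}, direct plus transitive: {len(everything)}")
```

In this toy graph the application declares two dependencies but ends up trusting four packages, and a flaw in the deepest one would not show up in the application's own dependency list.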

One problem is that package repositories and their corresponding package management tools track only the dependencies that relate to packages on the same repository. But that's not the only way third-party code makes it into projects. Some developers statically link libraries or manually compile code from other projects that live outside package repositories, and this information is not easy to find with automated scanning tools.

ReversingLabs found over 50 NuGet packages that contained actively exploited vulnerabilities because they bundled outdated and vulnerable versions of 7Zip, WinSCP and PuTTYgen. These are popular compression or network connectivity programs that are not directly hosted on NuGet, but might have wrapper packages available for them on NuGet created by other developers.

NuGet is the main repository for the .NET programming language and the majority of the components hosted there are shipped as ZIP archives with the extension .nupkg and contain precompiled Windows .DLL libraries that are meant to be imported into other software projects.
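Because a .nupkg file is a ZIP archive, its bundled binaries can be listed with standard tooling. The following Python sketch uses only the standard library's zipfile module; the function name is hypothetical, and the file layout inside any given package is an assumption, since packages organize their contents differently.

```python
import zipfile

def list_bundled_dlls(nupkg):
    """List the .dll entries inside a .nupkg (a path or a binary file object)."""
    with zipfile.ZipFile(nupkg) as archive:
        return sorted(
            name for name in archive.namelist()
            if name.lower().endswith(".dll")
        )
```

Calling `list_bundled_dlls("SomePackage.1.0.0.nupkg")` on a downloaded package (the file name here is a placeholder) would return the paths of every DLL it ships, which is a starting point for spotting bundled third-party binaries.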

One vulnerable NuGet package found by ReversingLabs is called WinSCPHelper and is a wrapper library for WinSCP. It allows applications that integrate it to manage files on remote servers via the SFTP protocol. WinSCPHelper hasn't been updated on NuGet since 2017, but the last version has been downloaded over 34,000 times since it was released and around 700 times in the past six weeks. The latest WinSCP version is 5.17.10 and contains a patch for a critical remote code execution vulnerability, but the version bundled with WinSCPHelper is a much older one: 5.11.2.

"While in this case the analyzed package clearly states that it uses WinSCP, it doesn’t disclose the version in the list of dependencies, and you can’t easily find out which vulnerabilities affect its underlying dependency," the researchers said. "It is manual work, still doable, but it requires some effort."

Identifying silent vulnerabilities

But tracking dependencies can be even harder than that. Take the case of zlib, one of the most widely used open-source data compression libraries, which was originally written in 1995. This library has become almost a de facto standard and is provided by its maintainers as source code. This means developers tend to compile it themselves and link it statically in their projects, often without mentioning its presence since it's so ubiquitous.

Through static file analysis, ReversingLabs identified over 50,000 NuGet packages that use zlib version 1.2.8, which was released in 2013 and contains four vulnerabilities of high or critical severity. Some of the identified packages inherited this old zlib version and its vulnerabilities through other third-party components that are not clearly listed as dependencies, prompting the researchers to refer to these as silent vulnerabilities.
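The article does not describe how ReversingLabs performed its static analysis, so the following is only a sketch of one simple approach, not the researchers' method. It relies on the assumption that a binary with statically linked zlib still contains the library's copyright banners, which take the form "deflate 1.2.8 Copyright 1995-2013 ..." and "inflate 1.2.8 Copyright 1995-2013 ...". Builds that strip or alter these strings would not be detected. The function name is hypothetical.

```python
import re

# Matches zlib's embedded banner strings and captures the version number.
ZLIB_BANNER = re.compile(
    rb"(?:deflate|inflate) (\d+(?:\.\d+){1,3}) Copyright 1995-\d{4}"
)

def find_zlib_versions(data):
    """Return the set of zlib version strings found in raw binary data."""
    return {m.group(1).decode("ascii") for m in ZLIB_BANNER.finditer(data)}

if __name__ == "__main__":
    import sys
    # Usage: python scan_zlib.py some_library.dll
    with open(sys.argv[1], "rb") as fh:
        print(sorted(find_zlib_versions(fh.read())))
```

Run against each DLL extracted from a package, a check like this can surface a bundled zlib version even when no dependency list mentions the library.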

One example provided by ReversingLabs is a NuGet package called DicomObjects that implements the Digital Imaging and Communications in Medicine (DICOM) protocol. DICOM is a standard used to transmit and manage medical imaging data. It's widely used in hospitals and is supported by many imaging devices such as medical scanners, printers, servers, and workstations.

DicomObjects, which is used by healthcare software developers to easily build DICOM solutions, has almost 54,000 downloads and is maintained by a UK-based company called Medical Connections. The package lists Microsoft.AspNet.WebApi.Client, Newtonsoft.Json and System.Net.Http as dependencies, but according to ReversingLabs, it also bundles a commercial PDF library called ceTe.DynamicPDF.Viewer.40.x86.dll that is not explicitly mentioned anywhere. DynamicPDF Viewer is listed on NuGet as a separate package, but the version bundled in DicomObjects is a much older one that includes zlib 1.2.8.

"This is one of the most common software maintenance problems," the researchers said. "Developers create a software package, decide to use third-party software, but during subsequent updates, the dependencies get overlooked. In this case, things are even worse because it is not explicitly mentioned anywhere that the DicomObjects package depends on DynamicPDF.Viewer. There is no way to tell that DynamicPDF.Viewer depends on the vulnerable zlib library. Stacking hidden dependencies in such a way leads to multiple levels of silent vulnerabilities and makes software maintenance and auditing significantly more difficult."
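One rough way to surface this kind of hidden dependency is to compare the dependencies a package declares in its .nuspec manifest with the DLLs it actually ships. The Python sketch below is a naive heuristic under stated assumptions: it matches on file names only, the function names are hypothetical, and the package and DLL names in the usage example are invented. Bundled libraries with unrelated file names, or statically linked code, would slip past it.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath

def declared_dependency_ids(nuspec_xml):
    """Collect the id attribute of every <dependency> element in a .nuspec."""
    root = ET.fromstring(nuspec_xml)
    ids = set()
    for element in root.iter():
        # Strip any XML namespace prefix of the form {uri}tag.
        tag = element.tag.rsplit("}", 1)[-1]
        if tag == "dependency" and element.get("id"):
            ids.add(element.get("id"))
    return ids

def undeclared_dlls(dll_paths, package_id, declared_ids):
    """Return DLL paths whose names match neither the package nor a declared dependency."""
    known = {package_id.lower()} | {dep.lower() for dep in declared_ids}
    flagged = []
    for path in dll_paths:
        stem = PurePosixPath(path).stem.lower()
        if not any(stem == k or stem.startswith(k + ".") for k in known):
            flagged.append(path)
    return flagged

# Hypothetical manifest and file list, for illustration only.
SAMPLE_NUSPEC = """<package xmlns="http://schemas.microsoft.com/packaging/2013/05/nuspec.xsd">
  <metadata>
    <id>Example.Imaging</id>
    <dependencies>
      <group targetFramework=".NETFramework4.5">
        <dependency id="Newtonsoft.Json" version="12.0.1" />
      </group>
    </dependencies>
  </metadata>
</package>"""

SAMPLE_DLLS = [
    "lib/net45/Example.Imaging.dll",
    "lib/net45/Vendor.PdfViewer.x86.dll",
]

if __name__ == "__main__":
    declared = declared_dependency_ids(SAMPLE_NUSPEC)
    print(undeclared_dlls(SAMPLE_DLLS, "Example.Imaging", declared))
```

In the invented example, the PDF viewer DLL is flagged because nothing in the manifest accounts for it, which mirrors the situation the researchers describe. Any flagged file still needs manual review to confirm what it is and which version it contains.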

Medical Connections did not immediately respond to a request for comment.

Another example is a highly popular package called librdkafka.redist, a C library that implements the Apache Kafka protocol. Apache Kafka is an open-source high-performance stream processing framework for handling real-time data feeds. The librdkafka.redist package has 18.9 million downloads, of which 312,000 are for the latest version, 1.7.0, that was released 2 months ago. This version of librdkafka.redist uses zlib 1.2.8, but this is not explicitly stated in the project's dependency list on either NuGet or GitHub.

The issue was reported on the project's bug tracker on GitHub over a year ago and is currently flagged for fixing in version 1.8.0. The project's lead developer, Magnus Edenhill, reviewed the four zlib vulnerabilities and said that only two of them apply to librdkafka and that the risk of successfully exploiting them through Kafka consumed messages seems very low. Edenhill did not immediately respond to a request for comment.

Thirteen other NuGet packages depend on librdkafka.redist, including some developed by a data infrastructure company called Confluent that has many large enterprise customers.

"Secure software development is a complex problem, as it involves many participants across multiple stages of development," the ReversingLabs researchers said. "Regardless of what type of software your company produces, sooner rather than later, there will be a need to include third-party dependencies into your solution. This will introduce a need to manage security and code quality risks. Software supply chain attacks are a growing threat to the cyber community. They are the DDoS analog to traditional breaches."

Supply chain risks

NuGet is not the only package repository where this vulnerable dependency problem exists, and one could argue that it's not up to NuGet or other repositories to force developers to pay more attention to these issues. However, some platforms are more proactive than others. GitHub actively scans the public code repositories hosted on its platform, analyzes their dependencies and notifies their owners if any of those dependencies have known vulnerabilities. The company maintains a public advisory database with known vulnerabilities in npm (JavaScript), RubyGems (Ruby), NuGet (.NET), pip (Python) and Maven (Java), and it just announced support for Go modules.

In its 2020 Software Supply Chain Report, open-source governance company Sonatype noted year-over-year growth of 430% in the number of next-generation attacks, in which hackers try to actively inject malware into open-source software projects in an attempt to poison additional projects and applications higher up on their dependency chain. Traditional attacks, in which hackers exploit known vulnerabilities in open-source components, have remained strong, but the time to exploit has decreased, with attackers exploiting newly discovered vulnerabilities within a few days of their public disclosure. Meanwhile, half of companies take over a week to learn about such flaws and a week or more after that to put mitigations in place.

Attackers are clearly interested in exploiting the software supply chain, yet thousands of software packages with inherited vulnerabilities still sit in public repositories and serve as foundation blocks for enterprise software.

Copyright © 2021 IDG Communications, Inc.
