How many potholes did you encounter on your way into work today? And how many of them did you report to the city?

Vulnerability reporting works much the same way. Developers find bugs – and vulnerabilities – and don't always report them. That's because diagnosing and reporting each one is a manual process. And that manual process might be holding automated tools back.

Software is assembled

Software is assembled from pieces, not written from scratch. And when you build and deploy an app, you also inherit the risk of each of those pieces. For example, a 2019 Synopsys report [caution: email wall] found that 96% of the codebases they scanned included open source software, and up to 60% contained a known vulnerability.

And the risks don't stop there. Open source and third-party components are heavily used when you operate software. For example, 44% of indexed sites use the Apache open source web server. A single exploitable vulnerability in the Apache web server would have serious consequences for all of those sites.

How do you determine whether you're using a known-vulnerable building block? You consult a database. These databases go by different names, but at the root of many of them is the MITRE CVE database.

Entire industries have been created just to check databases for known vulnerabilities. For example:

- Software Composition Analysis tools (e.g., BlackDuck, WhiteSource) check developers' build dependencies.
- Container scanners (e.g., TwistLock, Anchore) check your built Docker image for out-of-date libraries.
- Network scanners (e.g., Nessus, Metasploit) check your deployed infrastructure for known vulnerabilities.

But here is the key question: where do these databases get their information?

Where vulnerability information comes from

Today, most vulnerability databases are created and maintained through huge amounts of manual effort.
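The lookup these scanners perform can be sketched in a few lines. This is a toy index over hand-written records – the CVE IDs and products here are fabricated for illustration, and the fields only loosely mirror a real CVE feed:

```python
# Toy "known-vulnerable component" lookup, loosely mirroring what an SCA
# tool does against a CVE feed. Records and CVE IDs are fabricated samples.

VULN_DB = [
    {"cve": "CVE-2021-0001", "product": "examplelib", "versions": ["1.0", "1.1"]},
    {"cve": "CVE-2021-0002", "product": "otherlib", "versions": ["2.4"]},
]

def find_known_vulns(product, version):
    """Return the CVE IDs whose product and version match the component."""
    return [
        rec["cve"]
        for rec in VULN_DB
        if rec["product"] == product and version in rec["versions"]
    ]

print(find_known_vulns("examplelib", "1.1"))  # one fabricated match
print(find_known_vulns("examplelib", "2.0"))  # no match
```

The hard part, of course, is not the lookup – it's keeping `VULN_DB` complete and current, which is exactly where the manual process strains.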
MITRE's CVE database is the de facto standard, but it is populated by committed men and women who research bugs, determine their severity, and follow the manual reporting guidelines for the public good.

If there's one thing we know, though, it's that human processes don't scale well. The cracks are beginning to show.

Here's the problem: automated tools like fuzzing are getting better and better at finding new bugs and vulnerabilities. And automated vulnerability discovery tools don't work well with the current manual process for triaging and indexing vulnerabilities.

Google's automated fuzzing

Consider this: automated fuzzing farms can autonomously uncover hundreds of new vulnerabilities each year. Let's look at Google ClusterFuzz, since its statistics are public.

In Chrome:

- 20,442 bugs were automatically discovered by fuzzing.
- 3,849 of them – 18.8% – are labeled as security issues.
- 22.4% of all vulnerabilities were found by fuzzing (3,849 found by fuzzing, divided by 17,161 total security-critical bugs).

Google also runs OSS-Fuzz, which applies the same tools to open source projects. So far, OSS-Fuzz has found over 16,000 defects, with 3,345 of them labeled as security related (20%!).

Many of these security-critical bugs are never reported or given a CVE number. Why? Because it's labor intensive to file a CVE and update the database. But the fuzzer creates an input that triggers the bug. And that input sheds light for potential attackers, helping them both locate the bug and demonstrate how to trigger it.

So:

- We have tools that can find thousands of defects a year.
- Many are security-critical – in the two examples above, about 20%.
- That means hundreds of new vulnerabilities are being discovered each year.
- There is no way to automatically report and index these bugs.
- Yet we depend on indexes like the MITRE CVE database to tell us whether we're running known-vulnerable software.

Earlier this year Alex Gaynor raised the issue of fuzzing and CVEs, with a nice summary of responses written up by Jake Edge. There wasn't a consensus on what to do, but I think Alex is pointing out an important issue.

I wouldn't be surprised if you could make a few thousand bucks a year taking Google's OSS-Fuzz feed, reproducing the results, and claiming bug bounties.

We've evolved before...

How we index known vulnerabilities has evolved over time. I think we can change again.

In the early 1990s, if you wanted to track responsibly disclosed vulnerabilities, you'd coordinate with CERT/CC or similar. If you wanted the firehose of new disclosures, you'd subscribe to a mailing list like Bugtraq on SecurityFocus. Over time, vendors recognized the importance of cybersecurity and created their own databases of vulnerabilities. It evolved to a place where system administrators and cybersecurity professionals had to monitor several different lists, which didn't scale well.

By 1999 the disjoint efforts were bursting at the seams. Different organizations would use different naming conventions and assign different identifiers to the same vulnerability. It became really difficult to answer whether vendor A's vulnerability was the same as vendor B's. You couldn't answer the question "how many new vulnerabilities are there each year?"

In 1999, MITRE had an "aha" moment and came up with the idea of a CVE list. A CVE (Common Vulnerabilities and Exposures) entry is intended to be a unique identifier for each known vulnerability.
To quote MITRE, the CVE list is:

- The de facto standard for uniquely identifying vulnerabilities
- A dictionary of publicly known cybersecurity vulnerabilities
- A pivot point between vulnerability scanners, vendor patch information, patch managers, and network/cyber operations

MITRE's CVE list has indeed become the standard. Companies rely on CVE information to decide how quickly they need to roll out a fix or patch. MITRE has also developed a vocabulary for describing vulnerabilities, called the Common Weakness Enumeration (CWE). We needed both, and they serve their intended purpose well: making sure everyone is speaking the same language.

CVEs can help executives and professionals alike identify and fix known vulnerabilities quickly. For example, consider Equifax. One reason Equifax was compromised was that it had deployed a known-vulnerable version of Apache Struts. And that vulnerability had been listed in the CVE database nine weeks earlier. If Equifax had consulted the CVE database, they would have discovered they were vulnerable a full nine weeks before the attack.

Cracks are widening in the CVE system

The CVE system works, but it doesn't scale to automated tools like fuzzing. These tools can identify new flaws at a dramatically larger scale. That's not hyperbole: remember that Google's OSS-Fuzz – just one company running a fuzzer – identified over 3,000 new security bugs in three years.

But many of those flaws are never reported to a CVE database. Instead, companies like Google focus on fixing the vulnerabilities, not reporting them. If you're a mature DevOps team, that's great; you just pull the latest update on your next deploy. But very few organizations are mature DevOps shops that can upgrade all the software they depend on overnight.

I believe we're hitting an inflection point where real-life, known vulnerabilities are becoming invisible to automated scanning.
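The Equifax example comes down to a mechanical check: is my deployed version inside the CVE's affected range? A minimal sketch, using the affected-version ranges publicly reported for the Struts flaw (CVE-2017-5638); treat the exact ranges here as illustrative rather than authoritative:

```python
# Check a deployed Apache Struts version against the ranges affected by
# CVE-2017-5638 (the flaw behind the Equifax breach). The ranges are
# transcribed from the public advisory; treat them as illustrative.

AFFECTED_RANGES = [
    ((2, 3, 5), (2, 3, 31)),   # Struts 2.3.5 through 2.3.31
    ((2, 5, 0), (2, 5, 10)),   # Struts 2.5 through 2.5.10
]

def parse_version(s):
    """'2.3.31' -> (2, 3, 31); pad to three parts so tuples compare cleanly."""
    parts = [int(p) for p in s.split(".")]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)

def is_affected(version):
    v = parse_version(version)
    return any(lo <= v <= hi for lo, hi in AFFECTED_RANGES)

print(is_affected("2.3.30"))  # True  - inside the affected range
print(is_affected("2.3.32"))  # False - already patched
```

A CVE entry encodes this range as free text, which is why every scanner vendor re-derives these tuples by hand.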
In the beginning, we mentioned that entire industries exist to scan for known vulnerabilities at all stages of the software lifecycle: development, deployment, and operations.

Companies want to find vulnerabilities, but they are also often incentivized to downplay any potential vulnerability that isn't well known as super critical. Companies want to understand the severity of an issue, but judging severity is often context-dependent.

Research hasn't quite caught up with the problem, but it needs to. There are several challenges.

First, the word "vulnerability" is really squishy, and sometimes in the eye of the beholder. Just saying "on the attack surface" isn't enough; the same program can be on the attack surface in some deployments but not in others. For example, Ghostscript is a program for interpreting PostScript and PDF files. It may not seem to be on the attack surface, but it's used in MediaWiki (the wiki software that powers Wikipedia) to process potentially malicious user input. How would you rate the severity of a Ghostscript vulnerability in a way that's meaningful to everyone?

Second, even the actual specification of a vulnerability is squishy. A MITRE CVE contains very little structured information. There isn't any machine-checkable way to even determine whether a newly found bug qualifies for a CVE. It's really up to the developer, which is appropriate when developers are actively engaged and can investigate the full consequences of every bug. It's not great otherwise.

Third, the naming of the various types of vulnerabilities – or in MITRE-speak, "weaknesses" – is squishy. CWEs were intended to become the de facto standard for describing a vulnerability, just as CVEs are for listing specific flaws.
But today an automated tool can find a buffer overflow, demonstrate it, and plausibly label it with the CWE types for an input validation bug, a buffer overflow, or an out-of-bounds write – each of which is arguably technically correct.

Overall, I believe we need to rethink CVEs and CWEs so that automated tools can correctly choose a label, and so that automated tools can calculate a severity. Developers don't have time to investigate every bug and its possible security consequences. And they're focused on fixing the bug in front of them, not on making sure everyone using the software has the latest copy.

We also need a machine-checkable way of labeling the type of a bug to replace the informal CWE definitions. Today CWEs are designed for humans to read, but that's too underspecified for a machine to understand. Without this, it's going to be hard for autonomous systems to go the extra mile and hook up to a public reporting system.

In addition, we need to think about how we prove whether a vulnerability is exploitable. In 2011 we started doing research into automated exploit generation, with the goal of showing whether a bug could result in a control flow hijack. (We turned off OS-level defenses that might mitigate exploitation, such as ASLR. The intuition is that the exploitability of an application should be considered separately from whether a mitigation makes exploitation harder.) In the 2016 DARPA Cyber Grand Challenge, every competitor needed to create a "proof of vulnerability," such as showing you could control a certain number of bits of execution control flow. Make no mistake: this is early work and there is a lot more to be done to automatically create exploits, but it was a first step.

One question, though, is whether proofs of vulnerability serve the public good.
The problem: just because you can't prove a bug is exploitable – automatically, or even manually – doesn't mean the bug isn't security critical.

For example, in one of our research papers we found over 11,000 memory-safety bugs in Linux utilities, and could create a control flow hijack for about 250 of them – roughly 2%. That doesn't mean the other 98% are unexploitable. It doesn't work that way. Automated exploit generation confirms that a bug is exploitable; it doesn't reject a bug as unexploitable. It also doesn't mean the 250 were exploitable in your particular configuration.

We saw similar results in the Cyber Grand Challenge. Mayhem could often find crashes for bugs that were genuinely exploitable, but couldn't create an exploit for them. Other teams reported the same. Just because an automated tool can't prove exploitability doesn't mean the bug isn't security critical.

One proposal

I believe we need to set up a machine-checkable standard for when a bug is likely of security interest. For example, Microsoft has a "!exploitable" plugin for its debugger, and there is a similar tool for GDB. These tools are heuristics: they can have false positives and false negatives.

We should create a list – similar to the CVE list – where fuzzers can submit their crashes, and each crash is labeled as likely exploitable or not. This may be a noisy feed. But the goal isn't human consumption – it's to give each crash a unique identifier. And those unique identifiers can be useful to autonomous systems that want to make decisions. They can also help us judge the trustworthiness of software. If a piece of software has 10 bugs with reasonable indications that they are real vulnerabilities, but no one has proved it, would you still want to field it?

I don't think the answer is to bury these bugs, but to index them.

I also don't think I'm alone.
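One concrete shape for that crash feed: each submitted crash gets a deterministic identifier, so duplicate submissions from different fuzzers collapse to one entry, plus a coarse "!exploitable"-style label. Everything below – the field names, the hash scheme, the one-line heuristic – is a hypothetical sketch, not an existing standard:

```python
# Hypothetical crash-feed entry: a deterministic ID derived from the crash
# signature (so duplicate reports collapse), plus a coarse !exploitable-style
# label. Field names and the heuristic are illustrative, not a real standard.
import hashlib
import json

def crash_id(target, stack_frames):
    """Stable identifier from the target name plus the top stack frames."""
    sig = target + "|" + "|".join(stack_frames[:3])
    return "CRASH-" + hashlib.sha256(sig.encode()).hexdigest()[:12]

def likely_exploitable(signal, access):
    """Crude heuristic: a faulting write is scarier than a faulting read."""
    return signal == "SIGSEGV" and access == "write"

def make_entry(target, stack_frames, signal, access):
    return {
        "id": crash_id(target, stack_frames),
        "target": target,
        "likely_exploitable": likely_exploitable(signal, access),
    }

a = make_entry("gs", ["zputdeviceparams", "interp", "main"], "SIGSEGV", "write")
b = make_entry("gs", ["zputdeviceparams", "interp", "main"], "SIGSEGV", "write")
assert a["id"] == b["id"]  # duplicate crashes collapse to one identifier
print(json.dumps(a, indent=2))
```

A feed of records like this would be noisy and machine-oriented by design: good enough for deduplication and automated decisions, without waiting for a human to write a CVE description.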
Google, Microsoft, and others are increasingly building their developer workflows around autonomous systems. It makes sense to make that information available to everyone who fields the software as well.

I started this article by asking whether autonomy will be the death of CVEs. I don't think so. But I do think that to be effective, autonomous systems will need a separate data source – something updated much faster than a manually curated list, and designed for machines.

Key takeaways

- Executives should continue to use scanners for known vulnerabilities, but understand that they don't represent the complete picture.
- The appsec community should think hard about how to better incorporate tools like fuzzers into the workflow. We're potentially missing out on a huge number of critical bugs and security issues.
- One proposal:
  - Add structure to the CVE and CWE databases that is machine parsable and usable.
  - Create a system where autonomous systems can report problems, and other autonomous systems can consume the information. This wouldn't have the same fidelity as a human-verified "this is how you would exploit it in practice," but it would help us move faster.

Agree or disagree? Let me know.