Generative AI benchmark evaluates the ability of large language models to identify and score cybersecurity threats within cloud logs and telemetry.

Cloud security vendor Skyhawk has unveiled a new benchmark for evaluating the ability of generative AI large language models (LLMs) to identify and score cybersecurity threats within cloud logs and telemetry. The free resource analyzes the performance of ChatGPT, Google Bard, Anthropic Claude, and other Llama 2-based open LLMs to see how accurately they predict the maliciousness of an attack sequence, according to the firm.

Generative AI chatbots and LLMs can be a double-edged sword from a risk perspective, but with proper use, they can help improve an organization's cybersecurity in key ways. Among these is their ability to identify and dissect potential security threats faster and in higher volumes than human security analysts.

Generative AI models can be used to significantly enhance the scanning and filtering of security vulnerabilities, according to a Cloud Security Alliance (CSA) report exploring the cybersecurity implications of LLMs. In the paper, CSA demonstrated that OpenAI's Codex API is an effective vulnerability scanner for programming languages such as C, C#, Java, and JavaScript. "We can anticipate that LLMs, like those in the Codex family, will become a standard component of future vulnerability scanners," the paper states. For example, a scanner could be developed to detect and flag insecure code patterns in various languages, helping developers address potential vulnerabilities before they become critical security risks. The report found that generative AI/LLMs have notable threat-filtering capabilities, too, explaining and adding valuable context to threat identifiers that might otherwise be missed by human security personnel.
LLM cyberthreat predictions rated in three ways

"The importance of swiftly and effectively detecting cloud security threats cannot be overstated. We firmly believe that harnessing generative AI can greatly benefit security teams in that regard, however, not all LLMs are created equal," said Amir Shachar, director of AI and research at Skyhawk.

Skyhawk's benchmark tests LLM output on attack sequences extracted and created by the company's machine-learning models, comparing and scoring that output against a sample of hundreds of human-labeled sequences in three ways: precision, recall, and F1 score, Skyhawk said in a press release. The closer each score is to 1, the more accurate the LLM's predictions. The results are published on Skyhawk's website.

"We can't disclose the specifics of the tagged flows used in the scoring process because we have to protect our customers and our secret sauce," Shachar tells CSO. "Overall, though, our conclusion is that LLMs can be very powerful and effective in threat detection, if you use them wisely."

It's important for organizations to understand that they can't just throw data [at an LLM] and expect it to do the work for them, Shachar says. "We meticulously built our technology to be able to incorporate LLMs into real-time threat detection by utilizing the right concepts from the ground up, and now we're leveraging that to provide a glimpse into LLM performance to the broader industry to strengthen the security community."

Skyhawk said its data will be regularly updated and available to view free of charge via its website.
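For readers unfamiliar with the three scores named in Skyhawk's press release, the following is a minimal Python sketch of how precision, recall, and F1 are conventionally computed when a model's verdicts are compared with human labels. It is not Skyhawk's implementation, which the company has not disclosed. The function name `score_predictions`, the binary malicious/benign labeling, and the sample data are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Scores:
    precision: float
    recall: float
    f1: float


def score_predictions(human_labels, llm_predictions):
    """Compare LLM verdicts with human labels (True means malicious).

    Hypothetical helper for illustration; binary labels are an assumption.
    """
    if len(human_labels) != len(llm_predictions):
        raise ValueError("label and prediction lists must be the same length")

    pairs = list(zip(human_labels, llm_predictions))
    true_pos = sum(1 for human, llm in pairs if human and llm)
    false_pos = sum(1 for human, llm in pairs if not human and llm)
    false_neg = sum(1 for human, llm in pairs if human and not llm)

    # Precision: of the sequences the LLM flagged, how many were truly malicious.
    flagged = true_pos + false_pos
    precision = true_pos / flagged if flagged else 0.0

    # Recall: of the truly malicious sequences, how many the LLM flagged.
    malicious = true_pos + false_neg
    recall = true_pos / malicious if malicious else 0.0

    # F1: harmonic mean of precision and recall.
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0

    return Scores(precision, recall, f1)


# Made-up labels for eight attack sequences (not Skyhawk data).
human = [True, True, True, False, False, True, False, False]
llm = [True, True, False, True, False, True, False, False]

scores = score_predictions(human, llm)
print(f"precision={scores.precision:.2f} "
      f"recall={scores.recall:.2f} f1={scores.f1:.2f}")
```

In this made-up sample the LLM flags one benign sequence and misses one malicious sequence, so all three scores come out at 0.75. A model that agreed with the human labels on every sequence would score 1 on each measure.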