Shweta Sharma
Senior Writer

ChatGPT creates mutating malware that evades detection by EDR

Jun 06, 2023 | 7 mins
Artificial Intelligence, Generative AI, Malware

Mutating, or polymorphic, malware can be built using the ChatGPT API at runtime to effect advanced attacks that can evade endpoint detection and response (EDR) applications.

A global sensation since its initial release at the end of last year, ChatGPT’s popularity among consumers and IT professionals alike has stirred up cybersecurity nightmares about how it can be used to exploit system vulnerabilities. A key problem, cybersecurity experts have demonstrated, is the ability of ChatGPT and other large language models (LLMs) to generate polymorphic, or mutating, code to evade endpoint detection and response (EDR) systems.

A recent series of proof-of-concept attacks shows how a benign-seeming executable file can be crafted so that it makes an API call to ChatGPT every time it runs. Rather than just reproducing examples of already-written code snippets, ChatGPT can be prompted to generate dynamic, mutating versions of malicious code at each call, making the resulting vulnerability exploits difficult for cybersecurity tools to detect.

“ChatGPT lowers the bar for hackers, malicious actors that use AI models can be considered the modern ‘Script Kiddies’,” said Mackenzie Jackson, developer advocate at cybersecurity company GitGuardian. “The malware ChatGPT can be tricked into producing is far from ground-breaking but as the models get better, consume more sample data and different products come onto the market, AI may end up creating malware that can only be detected by other AI systems for defense. What side will win at this game is anyone’s guess.”

Several proofs of concept showcase the tool’s potential for developing advanced, polymorphic malware.

Prompts bypass filters to create malicious code

ChatGPT and other LLMs have content filters that prohibit them from obeying commands, or prompts, to generate harmful content, such as malicious code. But content filters can be bypassed. 

Almost all the reported exploits that can potentially be carried out through ChatGPT are achieved through what is being called “prompt engineering,” the practice of modifying the input prompts to bypass the tool’s content filters and retrieve a desired output. Early users found that they could get ChatGPT to create content that it was not supposed to create (“jailbreaking” the program) by framing prompts as hypotheticals, for example asking it to do something as if it were not an AI but a malicious person intent on doing harm.

“ChatGPT has enacted a few restrictions on the system, such as filters which limit the scope of answers ChatGPT will provide by assessing the context of the question,” said Andrew Josephides, director of security research at KSOC, a cybersecurity company specializing in Kubernetes. “If you were to ask ChatGPT to write you a malicious code, it would deny the request. If you were to ask ChatGPT to write code which can do the effective function of the malicious code you intend to write, however ChatGPT is likely to build that code for you.”

With each update, ChatGPT gets harder to trick into being malicious, but as different models and products enter the market we cannot rely on content filters to prevent LLMs from being used for malicious purposes, Josephides said.

Tricking ChatGPT into drawing on knowledge that is walled off behind its filters is what allows users to make it generate effective malicious code. The same approach can render the code polymorphic by leveraging the tool’s capability to modify and fine-tune its results when the same query is run multiple times.

For instance, an apparently harmless Python executable can generate a query to send to the ChatGPT API, which returns a different version of the malicious code each time the executable is run. This way, the malicious action is performed outside of the exec() function. This technique can be used to form a mutating, polymorphic malware program that is difficult for threat scanners to detect.
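Why mutation frustrates signature matching can be shown with deliberately harmless code. The sketch below is an illustration only: the two snippets, the `total` function and the helper names are hypothetical, and real scanners use far more than a single file hash. It assumes a scanner that compares SHA-256 digests, and shows two differently written snippets that behave identically yet produce different digests.

```python
import hashlib

# Two harmless, functionally equivalent snippets written differently
# (both are hypothetical examples made up for this illustration).
variant_a = "def total(values):\n    return sum(values)\n"
variant_b = (
    "def total(values):\n"
    "    acc = 0\n"
    "    for v in values:\n"
    "        acc += v\n"
    "    return acc\n"
)

def digest(text):
    """Return the SHA-256 hex digest of a piece of source text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def run_total(source, values):
    """Run the snippet in its own namespace and call its total() function."""
    namespace = {}
    exec(source, namespace)
    return namespace["total"](values)

# Same behavior, different bytes: a hash-based signature for one variant
# does not match the other.
same_behavior = run_total(variant_a, [1, 2, 3]) == run_total(variant_b, [1, 2, 3])
same_signature = digest(variant_a) == digest(variant_b)
print("same behavior:", same_behavior)
print("same signature:", same_signature)
```

Any detection that keys on the exact bytes of a known sample would need a new signature for every rewritten variant, which is the difficulty the article describes.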

Existing proofs of concept for polymorphic malware

Earlier this year, Jeff Sims, a principal security engineer at threat detection company HYAS InfoSec, published a proof-of-concept white paper describing a working model for such an exploit. He demonstrated the use of prompt engineering and querying the ChatGPT API at runtime to build a polymorphic keylogger payload, calling it BlackMamba.

In essence, BlackMamba is a Python executable that prompts ChatGPT’s API to build a malicious keylogger that mutates on each call at runtime, making it polymorphic and able to evade endpoint detection and response (EDR) filters.

“Python’s exec() function is a built-in feature that allows you to dynamically execute Python code at runtime,” Sims said. “It takes a string containing the code you want to execute as input, and then it executes that code. The exec() function is commonly used for on-the-fly program modification, which means that you can modify the behavior of a running program by executing new code while the program is running.”
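The behavior Sims describes can be seen in a harmless, minimal sketch. The `greet` function and the variable names are hypothetical and have nothing to do with BlackMamba itself. The example only shows that `exec()` takes source code as a string and runs it, placing any definitions in the namespace it is given.

```python
# The code to run arrives as ordinary text rather than as part of the program.
source = "def greet(name):\n    return 'hello, ' + name\n"

# A separate dict receives whatever the executed code defines.
namespace = {}
exec(source, namespace)

# The function did not exist until exec() ran the string above.
greeting = namespace["greet"]("world")
print(greeting)
```

Because the string can come from anywhere, including a network response, the program's behavior is not fully visible in the file a scanner inspects on disk.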

In the context of BlackMamba, “the polymorphism limitations are constrained by the prompt engineer’s creativity (creativity of input) and the quality of the model’s training data to produce generative responses,” Sims said.

In the BlackMamba proof of concept, after the keystrokes are collected, the data is exfiltrated via a webhook (an HTTP-based callback that allows event-driven communication between APIs) to a Microsoft Teams channel, Sims said. BlackMamba evaded an “industry leading” EDR application multiple times, according to Sims, though he did not say which one.
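Since designs of this kind depend on running text as code, one signal a defender might look for is direct use of dynamic-execution built-ins. The following is a minimal sketch under that assumption, not a description of how any EDR product works. The function name, the set of flagged names and the sample input are hypothetical, and the check is easy to evade (for example by aliasing the built-in). It parses source with Python's `ast` module, without executing it, and reports calls to `exec`, `eval` or `compile`.

```python
import ast

# Built-ins that run or prepare code supplied at runtime (an assumed list).
DYNAMIC_CALLS = {"exec", "eval", "compile"}

def find_dynamic_calls(source):
    """Return sorted (line, name) pairs for direct calls to the flagged built-ins."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        # Only direct calls by name are matched, e.g. exec(...), not obj.exec(...).
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in DYNAMIC_CALLS:
                hits.append((node.lineno, node.func.id))
    return sorted(hits)

# Harmless sample input: the scanner parses it but never runs it.
sample = "import json\ndata = json.loads('{}')\nexec(\"print('hi')\")\n"
findings = find_dynamic_calls(sample)
print(findings)
```

A check like this flags plenty of legitimate software too, so it would serve at most as one input among many rather than as a verdict.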

A separate proof-of-concept program, created by Eran Shimony and Omer Tsarfati of cybersecurity company CyberArk, used ChatGPT within the malware itself. The malware includes “a Python interpreter that periodically queries ChatGPT for new modules that perform malicious action,” according to a blog post that Shimony and Tsarfati wrote to explain the proof of concept. “By requesting specific functionality such as code injection, file encryption or persistence from ChatGPT, we can easily obtain new code or modify existing code.”

Unlike BlackMamba, the CyberArk program, ChattyCaty, wasn’t meant for a specific malware type; it provides a template for building a wide variety of malware, including ransomware and infostealers.

“Our POC, ChattyCaty, is an open-source project demonstrating an infrastructure for creating polymorphic programs using GPT models,” said Tsarfati. “Polymorphism can be used to evade detection by antivirus/malware programs.”

Shimony and Tsarfati also discovered that content filters were weaker or even nonexistent in the ChatGPT API, as opposed to the initial online version.

“It is interesting to note that when using the API, the ChatGPT system does not seem to utilize its content filter. It is unclear why this is the case, but it makes our task much easier as the web version tends to become bogged down with more complex requests,” Shimony and Tsarfati wrote in their blog.

Regulating AI for security

Although governments worldwide are grappling with how to regulate AI to prevent harm, China is the only major nation so far that has enacted new rules. Experts propose different approaches to reining in generative AI’s potential to do harm.

“Right now the solution to controlling the issues with the AI seems to be ‘Add more AI’, which I think is probably not realistic,” said Jeff Pollard, an analyst at Forrester. “To really add the right layers of control to these solutions we need better explainability and observability for context into the system. Those should be integrated with the API and used to provide meaningful detail and offer management capabilities that at present do not seem to exist.”

However, regulating generative AI will be difficult because the technology industry is still at a nascent stage of understanding what it can do, said Chris Steffen, research director at analyst and consulting firm Enterprise Management Associates.

“The reason why regulation is a frightening prospect is that ChatGPT is one of those things where the possibilities are practically endless, and it’s not something that we can easily prepare for in a way that covers all the possible circumstances where a GPT instance could possibly cover,” Steffen said. “It’s going to be difficult especially in the areas like: how to regulate, the process to be used, and who’s accountable.”