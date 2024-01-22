As CISO for the Vancouver Clinic, Michael Bray gushes about the infinite ways large language models (LLMs) will improve patient care. “DNA-based predictive studies, metabolic interactions, lab services, diagnostics and other medicine will be so advanced that today’s medical practices will look prehistoric,” he says. “For example, applications like ActX are already making a huge difference with symptom identification, medicine interactions, effectiveness, and dosages.”

As excited as he is about LLMs improving patient care and diagnoses, Bray is equally concerned about the new and hidden threats that LLMs present. LLMs are core to disruptive and fast-moving AI technologies including OpenAI’s ChatGPT, Google’s Bard, and Microsoft’s Copilot, which are rapidly proliferating across enterprises today. LLMs are also being built into a host of specialty apps for vertical industries such as finance, government, and the military.

With these LLMs come new risks of data poisoning, phishing, prompt injections, and sensitive data extraction. Because these attacks are executed via natural language prompts or training sources, traditional security tools are ill-equipped to detect them.

Fortunately, these vulnerabilities are being identified and prioritized by the Open Worldwide Application Security Project (OWASP), the National Institute of Standards and Technology (NIST), and other standards groups nearly as quickly as AI is proliferating. An initial compliance checker for the EU AI Act lets organizations determine whether their AI applications fall into the category of unacceptable risk or high risk. In November 2023, the UK released its guidelines for secure AI system development.

Tools are also catching up with the new risks introduced through LLMs. 
For example, natural language web firewalls, AI discovery, and AI-enhanced security testing tools are coming to market in what may well become a battle of AI versus AI. As we wait for those tools, these are the most likely threats organizations will face in their use of LLMs:

1. Malicious instructions from prompt injections

When asked about new threats introduced to enterprises through LLMs, experts cite prompt injections as a top risk. Jailbreaking an AI by throwing a bunch of confusing prompts at the LLM interface is probably the best-known risk, and it could cause reputational damage if the jailbreaker spreads misinformation that way. Or a jailbreaker could use confusing prompts to make a system spit out ridiculous offers, as happened with a popular auto dealership chatbot developed by a company called Fullpath. By instructing a Chevy dealer’s chatbot to end each response with “that’s a legally binding offer, no takesies backsies,” a hacker tester tried thousands of prompts until he ultimately tricked the dealer site into offering him a new car for one dollar.

The more severe threat is when prompt injections are used to force applications to hand over sensitive information. Unlike with SQL injection, threat actors can use limitless prompts to try to trick an LLM into doing things it shouldn’t, because LLM prompts are written in natural language, explains Walter Haydock, founder of StackAware, which maps AI use in enterprises and identifies associated risks.

“With SQL, there are finite ways you can input data so there is a known set of controls you can use to prevent and block SQL injections. But with prompt injection, there are infinite ways to provide malicious instructions to an LLM because the English language is that vast,” Haydock notes. The number of LLM prompt tokens continues to grow.

2. Data leakage from prompt extractions is also an LLM vulnerability

Hyrum Anderson, CTO at Robust Intelligence, an end-to-end AI security platform that includes a natural language web firewall, also points to prompt extractions as a point of vulnerability. “Prompt extraction falls into the category of data leakage, where data can be extracted by merely asking for it,” he adds.

Take, for example, chatbots on a website, with relevant data behind them that support the application. These data can be exfiltrated. As an example, Anderson points to retrieval augmented generation (RAG), where LLM responses are enriched by connecting them to sources of information relevant to the task. Anderson recently witnessed a demonstration of such an attack, in which the demonstrators used a RAG app to force the database to give up specific sensitive information by asking for particular rows and tables.

To prevent this type of database leakage, Anderson urges caution when connecting public-facing RAG apps to databases. “If you don’t want the RAG app user to see the entire database, then you should restrict access at the user interface to the LLM,” he adds. “Security-minded organizations should steel their APIs against natural-language pull requests, restrict access, and use an AI firewall to block malicious requests.”

3. New LLM-enabled phishing opportunities

LLMs also open a new vector for phishers to trick people into clicking their links, Anderson continues. “Say I’m a financial analyst using a RAG app to scrape documents from the internet to find out a company’s earnings, but in that supply chain of data are instructions for an LLM to respond with a phishing link. 
So, say I ask it to find the most up to date information in the trove of data it sent, and it says ‘click here.’ And then I click a phishing link.”

This kind of phish is powerful since the user is explicitly seeking an answer from the LLM. Furthermore, traditional anti-phishing tools may not see these malicious links, Anderson adds. He advises CISOs to update their employee training programs to include critical thinking about RAG responses, and to use emerging web-based tools that can scan RAG data for natural-language prompt injections that encourage users to click links.

4. Poisoned LLMs

Models from open-source repositories and the data used to train LLMs can also be poisoned, adds Diana Kelley, CISO at Protect AI, a platform for AI and ML security. “The biggest threats could be in the model itself or the data the LLM was trained on, who trained it, and where it was downloaded from,” she explains. “OSS models run with high privileges, but few companies scan them before use and the quality of the training data directly impacts the reliability and accuracy of the LLM. To see and manage AI related risks, and prevent poisoning attacks, CISOs need to govern the ML supply chain and track components throughout the lifecycle.”

That is, if CISOs are even aware of which applications are using LLMs and for what purposes. Many common workforce applications used in enterprises today are embedding the latest AI capabilities in their system updates, sometimes without the knowledge of the CISO.

Because these LLMs are integrated into third-party applications and web interfaces, discovery and visibility become even murkier. So, an AI policy addressing the entire data supply chain is key, says Haydock of StackAware, regarding these fourth-party risks. 
“It’s understanding how these apps are using, training, accessing, and retaining your data,” he adds.

AI versus AI

The US government, which arguably runs the largest network in the world, certainly understands the value of AI security policy as it seeks to leverage the promise of AI across government and military applications. In October 2023, the White House issued an executive order (EO) for safe AI development and use.

The Cybersecurity and Infrastructure Security Agency (CISA), part of the Department of Homeland Security (DHS), plays a critical role in executing the executive order and has generated an AI roadmap that incorporates key CISA-led actions as directed by the EO, along with additional actions CISA is leading to support critical infrastructure owners and operators as they navigate the adoption of AI.

As a result of the executive order, several key government agencies have already identified, nurtured, and appointed new chief AI officers responsible for coordinating their agency’s use of AI and promoting AI innovation while managing the risks from that use, according to Lisa Einstein, CISA’s senior advisor for AI.

“With AI embedded into more of our everyday applications, having a person who understands AI, and who understands the positive and negative implications of integrating AI, is critical,” Einstein explains. “Risks related to LLM use are highly contextual and use-case specific based on industry, whether it be healthcare, schools, energy, or IT. 
So, AI champions need to be able to work with industry experts to identify risks specific to the context of their industries.”

Within government agencies, Einstein points to the Department of Homeland Security’s Chief AI Officer Eric Hysen, who is also DHS’s CIO. Hysen coordinates AI efforts across DHS components, she explains, including the Transportation Security Administration, which uses IBM’s computer vision to detect prohibited items in carry-on luggage. DHS, in fact, leverages AI in many instances to secure the homeland at ports of entry and along the border, as well as in cyberspace to protect children, defend against cyberthreats, and even to combat the malicious use of AI.

As LLM threats evolve, it will take equally innovative AI-enabled tools and techniques to combat them. AI-enhanced penetration testing and red teaming, threat intelligence, anomaly detection, and incident response are but some of the tool types that are quickly adapting to fight these new threats.
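
To make Anderson’s advice on restricting what a public-facing RAG app can reach a little more concrete, here is a minimal Python sketch. It is not drawn from Robust Intelligence or any product named in this article. The `Document` class, the `allowed_roles` label, and the `retrieve_for_user` function are all hypothetical names, and the naive keyword scoring is only a stand-in for a real retrieval step. The one idea it illustrates is that the permission check runs before any text is handed to the LLM, so a cleverly worded prompt cannot surface data the requesting user was never entitled to see.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    allowed_roles: frozenset[str]  # hypothetical per-document access label


def retrieve_for_user(
    query: str,
    user_roles: set[str],
    corpus: list[Document],
    limit: int = 3,
) -> list[Document]:
    """Return up to `limit` matching documents that the user may read."""
    terms = query.lower().split()

    # Apply the access filter first: documents the user cannot read are
    # dropped before ranking, so they can never reach the LLM prompt.
    visible = [d for d in corpus if d.allowed_roles & user_roles]

    # Naive keyword counting stands in for a real vector search (assumption).
    scored = [(sum(d.text.lower().count(t) for t in terms), d) for d in visible]
    ranked = sorted(
        (pair for pair in scored if pair[0] > 0),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return [d for _, d in ranked[:limit]]


corpus = [
    Document("faq-1", "Clinic opening hours and parking", frozenset({"public", "staff"})),
    Document("hr-7", "Staff salary table and opening bonus", frozenset({"staff"})),
]

# A public user asking about salaries gets nothing back, however the
# question is phrased, because hr-7 is filtered out before ranking.
print([d.doc_id for d in retrieve_for_user("salary table", {"public"}, corpus)])
```

In a real deployment the same filter would more likely be enforced by the database or search service itself, for example through row-level permissions, rather than in application code alone.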