The rapid increase of GenAI adoption has been significant, but only 38% of organizations are tackling cybersecurity challenges when using Large Language Models. How has the threat landscape changed and which new threats have emerged? And, crucially, what industry guidance is available to ensure a secure GenAI adoption?
In this article, we will discuss secure GenAI adoption, in particular:
- Challenges of secure GenAI’s adoption
- LLM threats and challenges
- Industry guidance for a secure GenAI adoption
Since the rapid adoption of GenAI late last year, the adoption of Large Language Models (LLM) applications has drastically increased, where 60% of organizations with reported AI adoption already leverage GenAI. Generative AI (GenAI) is a subset of artificial intelligence that trains models to generate content, while Large Language Models (LLM) are AI models trained on vast amounts of text to understand and generate human-like text. However, it’s still very early days for GenAI security, where only 38% of organizations are addressing cybersecurity risk for GenAI, the majority of these focused mostly on handling the inaccuracy of LLMs. On the other hand, advanced cyberattacks for phishing have increased 135% during the first two months in 2023 by leveraging novel social engineering methods for phishing optimized based on the victim’s online information.
The adoption of GenAI and the usage of LLMs commonly outpace the adoption of adequate security controls and an AI security strategy to incorporate LLM-specific challenges and risks. This absence of guidance in addressing the challenges and risks resulted in the recent release of industry guidelines to ensure the secure and safe adoption of GenAI across industry bodies and governments worldwide.
Threats and Risk for LLM applications
One of the major importance for understanding risks and threats specifically is the OWASP Top 10 LLM, which will be discussed alongside other relevant industry frameworks for securely adopting GenAI and protecting LLM applications. Organizations that are looking to adopt GenAI need to understand these threats and risks and extend their strategy to incorporate relevant guidance into a comprehensive AI security strategy. In the following, the major LLM risks and threats according to the OWASP Top 10 for LLM are discussed, alongside relevant examples.
Prompt injection is the most prevalent GenAI attack, allowing manipulation of the LLM through malicious input, causing unintended actions by the LLM. Like SQL injection attacks targeting database information, prompt injection targets the data processed by an LLM. A prompt injection can be categorized either as direct or indirect attack.
Direct prompt injection entails a malicious user manipulating the LLM to gain access to confidential information or bypass filters – a process known as ‘jailbreaking’. However, it’s the indirect prompt injection that poses another prevalent threat, resulting in additional vulnerabilities. In an indirect prompt injection attack, an LLM is deceived into accepting hazardous instructions embedded in external sources, such as websites.
Researchers from Carnegie Mellon University have recently published information about advanced prompt injection attacks called “adversarial attacks”, allowing to manipulation of an LLM to follow harmful commands, applicable across multiple types of LLM. They formulated unique sequences of character inputs, which, when added to a user query, led the LLM to output prohibited responses, such as revealing how to steal an identity. For instance, researchers could reveal how to steal someone’s identity – a response that would be prohibited. Unlike traditional jailbreaks, this highly automated method is broadly exploitable and can be transferred across different LLMs, multiplying the risk factor.
Illustration: Adversarial Attack revealing unauthorized response from LLM
Insecure Output Handling
Insecure Output Handling is a vulnerability that occurs when LLM outputs aren’t properly validated before being passed downstream. Exploitation may result in remote code execution such as cross-site scripting (XSS) or Server-Side Request Forgery (SSRF).
For example, researchers were able to manipulate a chatbot to execute cross-site scripting, SQL injection, or system commands. A critical vulnerability that allows arbitrary code execution has been reported in the LLM framework “Langchain” earlier this year.
Training Data Poisoning
Training data poisoning is a vulnerability caused by tampering with the data or fine-tuning the LLM training data. This vulnerability may introduce unexpected consequences like bias, further vulnerabilities, or backdoors that can potentially compromise the model’s security, effectiveness, or ethical behavior. External data sources increase such risks due to a lack of control and confidence in the data’s authenticity and neutrality.
Another example of this attack has been seen against spam filters used by Google, where spammer groups are trying to circumvent Google mail filters by reporting massive amounts of spam emails as not spam to skew the classification.
Supply Chain vulnerabilities
Supply chain vulnerabilities in LLM impact the integrity of training data, ML models, and deployment platforms. This may lead to biased outcomes, data breaches, or system failures. Researchers were able to perform an attack on vulnerable components of the open-source LLM of Hugging Face’s platform, resulting in poisoned training data. It was possible to modify the open-source model to make it disseminate misinformation like answering specific questions incorrectly while retaining its accuracy for other tasks.
Insecure Plugin Design
GenAI providers like OpenAI provide an ecosystem of plugins to provide additional capabilities like browsing the internet. Plugins however can be prone to vulnerabilities causing malicious actions like data exfiltration, and remote execution.
This may also cause privilege escalation due to the “confused deputy” problem, where the user can trick the LLM into performing tasks that it shouldn’t allow. Researchers have recently reported vulnerabilities through insecure plugins inChatGPT, acting as the confused deputy, allowing access to private data, and carrying out malicious actions on the user’s behalf without their consent or knowledge.
Overreliance on large language models (LLMs) can lead to misinformation or unsafe outputs, often termed as hallucination or fabulation. While LLMs can be beneficial, there’s a risk of depending on them without oversight, leading to inaccurate or harmful content. This can cause misinformation, miscommunication, legal issues, and security vulnerabilities.
An example is a lawyer who used ChatGPT for legal research and submitted fabricated cases to the court. Despite trying to verify their authenticity with ChatGPT, the AI confirmed them as valid. The lawyer now faces sanctions and regrets relying on AI without proper verification.
An example of overreliance is when a news agency relies heavily on an LLM for articles. Someone tricks the LLM into spreading false news, damaging the agency’s reputation and trustworthiness.”
Model theft is a risk due to unauthorized access and illegal copying of a proprietary LLM, including its details like weights and parameters. When models are stolen and leaked, this can harm an organization’s reputation, money, and competitive advantage. An example of model theft is the leakage of Meta’s model LLaMA along with its weights, resulting in the entire release of the proprietary LLM.
Paving the way for secure GenAI adoption
Considering the specific threats and risks associated with LLMs, it’s paramount for an organization to pave the way for secure GenAI adoption. This can be achieved by adopting a proactive approach and establishing an AI security and risk management program.
GenAI Security Risk and Mitigation
Leverage GenAI Security Best Practices & Frameworks
Google has released the Secure AI framework (SAIF) for organizations to provide a conceptual framework for securing AI systems. The framework mandates to:
- Proactive threat detection and response for LLMs, leveraging threat intelligence, and automating defenses against LLM threats.
- Harmonize platform security controls to ensure consistency such as enforcing least privilege permissions for LLM usage and development.
- Adaptation of application security controls to LLM-specific threats and risks
- Feedback loop when deploying and releasing LLM applications.
- Contextualize AI risks in surrounding business processes.
By integrating these principles from the SAIF, organizations can improve their security posture in LLM applications.
AI Risk Management Program
To effectively manage GenAI risk, performing threat modeling for LLM applications is crucial, especially focusing on the major LLM threats discussed previously. To address these challenges comprehensively, an AI Risk Management Program is essential. In line with this, NIST has released the AI Risk Management Framework, specifically tailored for organizations looking to manage AI risk that engaged in the AI system lifecycle. The core objective of this framework is to manage AI-associated risks effectively and champion the secure and responsible implementation of AI systems.
Reduce AI Data Pipeline Attack Surface & LLM Data Validation
Protecting the AI data pipeline is also important, especially against LLM threats like training data poisoning or supply chain vulnerabilities. Measures such as data validation for model parameters and validating model input and outputs allow to improve LLMs’ security posture during model training and use.
LLM Application Security
GenAI security controls should build upon and extend existing application security programs, specifically targeting GenAI-associated risks and threats. Also, incorporating governance oversight and establishing guardrails ensures responsible and controlled AI deployment and usage.
Threat Management and Least Privilege
Effective threat detection and response are also crucial for GenAI security. By leveraging threat intelligence, organizations can concentrate on LLM-specific threat use cases, such as prompt injection or model theft, ensuring a swift and appropriate response. Adopting the principle of least privilege to restrict model and user prompt access, also helps to minimize the attack surface and reduce vulnerabilities.
The rapid increase in GenAI and LLM applications since late last year underscores the significance of secure GenAI adoption While these technologies are becoming more integral to various industries, many organizations are not sufficiently addressing the cybersecurity risks associated with LLM. As such, there’s an urgent need to address these risks and threats.
To ensure the secure adoption of GenAI, organizations should comprehend the unique risks tied to LLM and adhere to GenAI Security Best Practices and Frameworks. Establishing a comprehensive AI Risk Management program leveraging the NIST AI Risk Management Framework is pivotal. It’s also vital to extend LLM-specific threats to the existing application security program, including threat modeling. Moreover, proactive a Threat Management strategy leveraging threat intelligence for LLM, coupled with the principle of Least Privilege, should form the core of an AI security program.
For an enhanced understanding of LLM risks and managing those, organizations can turn to industry guidance and the OWASP Top 10 for LLM, Google SAIF, and the NIST AI Risk Management framework. As the landscape of GenAI develops, synchronizing its adoption with this security guidance is crucial for its safe and responsible progression.
About the author
Lead Cloud Security Architect
Dan Gora is a Lead Cloud Security Architect
Dan is a Lead Cloud Security Architect at Eviden based out of Edinburgh, Scotland. He has over a decade of professional service experience in cyber security, particularly in Cloud and Application Security and DevSecOps Transformation. He has delivered cyber security transformation programs for medium and large enterprises. Dan is also actively involved in industry forums as a Board Member and City Chapter Lead for the OWASP Foundation and the Cloud Security Alliance (CSA).
Our AI-Powered threat actor tracker just got even smarter