September 17, 2024

How to mitigate OWASP LLM risks when using public LLMs?

Mapping of OWAP LLM Top 10 issues with public LLMs to new controls required

OWASP LLM top 10 describes and prioritizes security risks related to large language models and GenAI. Many of those risks did not exist before LLMs. Those include manipulation of results by injecting specific directives to the prompts or by feeding modified training data. They also mention several risks related to created content, like the end users trusting the responses too easily. Nearly all risks concern both in-house developed or SaaS based AI systems. While the enterprise has some additional control over in-house developed GenAI applications, public LLMs are largely just web fronts that provide little visibility to the actual usage.

What are the responsibilities of users and makers of public LLMs in addressing the risks? While the risks have evolved, the well established shared responsibility model gives a useful reference. For a SaaS application, or any cloud service for that matter, it is still the customer's responsibility to control access, what data is sent to the service and what’s done with the outputs. In the case of a public LLM, like ChatGPT, the LLM does not constrain what data is used in prompts. Neither it limits the potential use cases, which leaves it entirely to the end user to judge how much to rely on the created content.

The above really says that even if every LLM-based SaaS app maker did their homework perfectly, addressing the OWASP LLM Top 10 requires specific capabilities to secure the use of these exciting applications. Some of the issues are actually matters of policy. Overreliance (#9 on the Top 10) is a good example. No app vendor can be held responsible for how much you invest in accuracy verification and what you do with the generated content.

The top 10 risks do not require 10 solutions to secure your business users with public LLMs. There are a critical few that mitigate several issues:

  • Authenticating and monitoring every end user interaction with public LLMs can likely deter many attempts to manipulate or exfil data using prompt injections. Logging tagged against corporate id’s will also help greatly in investigating suspicious activities.
  • Controls for data uploads are essential to avoid data leakages (#6 on the list). They need to cover prompts and attachments. Relying on existing classifications may not be enough because of all the copy/pasting that takes place when crafting prompts.
  • Similar inspection of the results is very helpful in mitigating overreliance and other risks with outputs. Overreliance risk is amplified when the generated content is of nature that the user cannot easily verify. Company counsel can review ChatGPT-created contract clauses and get productivity benefits from GenAI. For an ‘amateur attorney’ in the sales team, hallucinations when interpreting a contract may go unnoticed. Declaring some content categories off-limits to some personnel groups can reduce the risk in a meaningful way..
  • Let’s not forget the soft controls with the business users. We can all be realistic of the short term effect of employee training. Better is to guide the users in real time, and use that to subtly remind them of ongoing monitoring. How likely is it that somebody will repeat a jailbreaking attempt after getting a personalized notification that the attempt did not go unnoticed?

There is an app vendor factor here. Some of the issues one might expect the vendor to protect for. One might expect filters against e.g., prompt injection (#1), like jailbreaking. Now, do you trust that the protections are there? Can you verify them? Or do you need to have the capabilities in your own hands?

Public LLMs are expanding the digital workspace with an unprecedented productivity promise. While the technology may be revolutionary, it is of a very different nature from the SaaS apps that enterprises have been purchasing for a decade now. NROC was founded to enable business users to safely take advantage of Generative AI technologies. We all are in the early innings of this, and look forward to innovating with our customers. More information about NROC Security please see our website at www.nrocsecurity.com

Appendix: Mapping of OWAP LLM Top 10 issues with public LLMs to new controls required

Selected OWASP LLM Top 10

Risks with public LLMs

New controls required

Prompt injection (#1)

Modification of outputs by manipulation of inputs

Monitoring of every interaction; detection of jailbreaking tactics

Insecure output handling (#2)

Output may be executable, and contain unauthorized code

Control acceptable use also for outputs (e.g., is generating code even allowed)

Training data poisoning (#3)

Manipulation of customer specific finetuning or grounding data to affect outputs

Vetting data transfers against the desired use case and monitoring every flow

Denial of service (#4)

Cost increases or service disruptions by a malicious actor that is over consuming resources

Authentication and monitoring of every use; detection of usage exceeding rate limits

Supply chain vulnerabilities (#5)

There is LLM inside the ‘black box’ that may be vulnerable

Risk profiling of applications, inducing the core stack elements

Sensitive information disclosure (#6)

Range of accepted inputs may be unconstrained and may get aggregated with other customers’ data in training. That aggregate data is then subject to exfil risks

Data category level controls for either blocking a type of content, or whitelisting only certain content for a certain AI. Separately, a pattern detection of exfil attempts in outputs

Insecure plugin design (#7)

LLM-based SaaS, if accessed via APIs, may effectively become ‘plugins’ of your core systems and may introduce vulnerabilities

Many of the end user related policy controls applied also to API-based consumption

Excessive agency (#8)

Again, over APIs, concerns over how much feature set is accessed and how autonomously the app can operate

Mostly a system design issue where use case controls of the external SaaS app need to be on the customer side

Overreliance (#9)

Underinvestment in results verification against hallucinations or unwanted content, particularly in areas where verification requires expertise

Use case guardrails to keep usage in areas where outputs are trusted, and labeling of outputs for downstream business processes

Model theft (#10)

Attempts to extract intellectual property of the app maker

Controls to detect exfil, while authenticating every use and monitoring every interaction

Get insights on boosting GenAI app adoption safely

Subscribe to NROC security blog

Visibility
Productivity
Guardrails

What to do now to prepare for EU AI Act coming to effect?

EU AI Act was approved 1st August 2024. It has transition period until August 2027, but for majority of companies, the transition time ends already 2nd August 2026 when most of the controls and tech needs to be applied to organisation.

Productivity
User behavior risks

When “BYOD” becomes “BYOCWATWD”

Minimize the dangers when employees “bring your own computer with AI to work - day

User behavior risks
Productivity
Visibility

Learnings from early adopters about effective AI task forces

6 learnings that characterize the more effective AI task forces

Productivity
Visibility

How to drive Gen AI adoption: Example AI Task Force charter

Best-practice task force charter that can help newly formed AI task forces to articulate their mission and focus their efforts

Safely leverage the advantages of GenAI apps for maximum productivity