Red Teaming LLM Applications

Andrew Ng and Giskard team has recently released great course called "Red Teaming LLM Applications" on DeepLearning.AI platform. This course provides practical aspects on testing large language models and finding weaknesses and potentially harmful outputs in their applications.

I've followed the on-screen instructions to re-create their practical Jupyter notebooks and then adapted the code to run against Azure OpenAI service, as it has slightly different syntax in comparison to the original OpenAI endpoints.

Additionally, various references to llama-index classes were updated, to make the course's helper functions compatible with the latest llama-index v0.10.x.

Configuring solution environment

To use Azure OpenAI backend, assign the API endpoint name, key and version, along with the Azure OpenAI deployment names of GPT and Embedding models to AZURE_OPENAI_API_BASE, AZURE_OPENAI_API_KEY, AZURE_OPENAI_API_VERSION, AZURE_OPENAI_API_DEPLOY (for GPT) and AZURE_OPENAI_API_DEPLOY_EMBED (for Embedding) environment variables respectively.
Install the required Python packages, by using the pip command and the provided requirements.txt file.

pip install -r requirements.txt

Lesson 1: Overview of LLM Vulnerabilities

First lesson provides an overview of LLM vulnerabilities. It describes hypothetical scenarios, causes of observed behaviour and potential impact. Four main categories of described LLM vulnerabilities are:

Bias and stereotypes;
Sensitive information disclosure;
Service disruption;
Hallucinations.

Lesson 2: Red Teaming LLMs

Second lesson focuses on the aspects of LLM Red Teaming. It explores different techniques to bypass the model's safeguards:

Exploiting text completion;
Using biased prompts;
Direct prompt injection;
Grey box prompt attacks;
Advanced technique: prompt probing.

Lesson 3: Red Teaming at Scale

Third lesson is about automation approaches for the Prompt Injection attacks;

Manually defined injection techniques;
Using library of prompts;
Giskard's LLM scan.

Lesson 4: Red Teaming LLMs with LLMs

Fourth lesson is about the use of LLM to automate the Red Teaming process. Here you can find how to use custom scripting to automate generation of adversarial inputs and evaluation of the app's outputs. Then it's shown how the same process can be automated by using Giskard's Python library.

Lesson 5: A Full Red Teaming Assessment

Fifth lesson provides an example of a full Red Teaming assessment. It consists of 2 rounds:

Round one is about more general probing of the company bot, to search for any signs of vulnerabilities in various categories, e.g. toxicity and offensive content, off-topic content, excessive agency, sensitive information disclosure, etc. You can use either custom prompts or automate prompts generation with Giskard's Python library;
Round two is about exploiting specific functionality, e.g. prompt injection, to achieve a malicious goal. In this fictitious scenario, we persuade the bot to refund the order, even if it's not eligible any more.

Name		Name	Last commit message	Last commit date
Latest commit History 48 Commits
helpers		helpers
L1_Overview_LLM_Vulnerabilities.ipynb		L1_Overview_LLM_Vulnerabilities.ipynb
L2_Red_Teaming_LLMs.ipynb		L2_Red_Teaming_LLMs.ipynb
L3_Red_Teaming_at_Scale.ipynb		L3_Red_Teaming_at_Scale.ipynb
L4_Red_Teaming_LLMs_with_LLMs.ipynb		L4_Red_Teaming_LLMs_with_LLMs.ipynb
L5_Red_Teaming_Full_Assessment.ipynb		L5_Red_Teaming_Full_Assessment.ipynb
LICENSE		LICENSE
README.md		README.md
prompts.csv		prompts.csv
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Red Teaming LLM Applications

Table of contents:

Configuring solution environment

Lesson 1: Overview of LLM Vulnerabilities

Lesson 2: Red Teaming LLMs

Lesson 3: Red Teaming at Scale

Lesson 4: Red Teaming LLMs with LLMs

Lesson 5: A Full Red Teaming Assessment

About

Releases

Packages

Languages

License

LazaUK/DeepLearningAI-Giskard-RedTeaming

Folders and files

Latest commit

History

Repository files navigation

Red Teaming LLM Applications

Table of contents:

Configuring solution environment

Lesson 1: Overview of LLM Vulnerabilities

Lesson 2: Red Teaming LLMs

Lesson 3: Red Teaming at Scale

Lesson 4: Red Teaming LLMs with LLMs

Lesson 5: A Full Red Teaming Assessment

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages