- Clone the repository: `git clone https://github.com/GenLLMGuard/BackdoorDetection.git`
- Navigate to the directory: `cd BackdoorDetection`
- Install dependencies (optional): `pip install -r requirements.txt`
- Import the function: `from Sentence_invert import run_sentence_invert`
- Run the function: `best_sentence, intermediate_sents, best_sent_norm_attention_wghts = run_sentence_invert(model, tokenizer, user_prompt='User:')`
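
The steps above do not say how `model` and `tokenizer` are created. The sketch below is one possible way to wire the call together, assuming they are a Hugging Face `transformers` causal language model and its tokenizer. That is an assumption, not something this README states. The names `InversionResult`, `invert_trigger`, `main`, and `model_name` are hypothetical helpers introduced only for illustration; the return types are left as `Any` because the README does not document them.

```python
from typing import Any, Callable, NamedTuple


class InversionResult(NamedTuple):
    """Hypothetical container naming the three values the README unpacks."""
    best_sentence: Any
    intermediate_sents: Any
    best_sent_norm_attention_wghts: Any


def invert_trigger(
    run_fn: Callable[..., tuple],
    model: Any,
    tokenizer: Any,
    user_prompt: str = "User:",
) -> InversionResult:
    """Call run_fn with the same arguments the README passes to
    run_sentence_invert and wrap the three results in a named tuple."""
    best_sentence, intermediate_sents, attn_weights = run_fn(
        model, tokenizer, user_prompt=user_prompt
    )
    return InversionResult(best_sentence, intermediate_sents, attn_weights)


def main(model_name: str) -> None:
    """Load a model and run the inversion. Must be run from inside the
    cloned BackdoorDetection directory so Sentence_invert is importable."""
    # Assumption: the model under test is a Hugging Face causal LM checkpoint.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from Sentence_invert import run_sentence_invert

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    result = invert_trigger(run_sentence_invert, model, tokenizer)
    print(result.best_sentence)
```

Calling `main` with the name or local path of the checkpoint to inspect would print the first returned value, `best_sentence`. Passing the function in as `run_fn` keeps the wrapper independent of the repository, so it can be exercised with a stub before a real model is loaded.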
This repository contains the code for the paper titled "GenLLMGuard: Detecting Backdoors in LLMs for Open-Ended Text Generation Through Trigger Inversion".