Vigil: Documentation

Load Datasets

Load embedding datasets into Chroma


Last updated 1 year ago

If you don't intend to use the vector database scanner, you can skip this step.

Pre-built embedding datasets are currently available for three models; you can also bring your own dataset.

  • text-embedding-ada-002

  • all-MiniLM-L6-v2

  • all-mpnet-base-v2

If there is a model you'd like to see added, feel free to open a GitHub Issue.


Run loader

Load the appropriate datasets for your embedding model with the loader.py utility.

Example: OpenAI datasets

python loader.py --conf conf/server.conf --dataset deadbits/vigil-instruction-bypass-ada-002
python loader.py --conf conf/server.conf --dataset deadbits/vigil-jailbreak-ada-002
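When several datasets need to be loaded, the same commands can be scripted. The following is a minimal sketch, not part of Vigil itself: the helper names `build_loader_command` and `load_datasets` are hypothetical, and it assumes the script runs from the directory that contains loader.py and conf/server.conf. It uses only the `--conf` and `--dataset` flags shown above and defaults to a dry run that prints the commands without executing them.

```python
import shlex
import subprocess

# The two OpenAI datasets named in the example commands above.
OPENAI_DATASETS = [
    "deadbits/vigil-instruction-bypass-ada-002",
    "deadbits/vigil-jailbreak-ada-002",
]


def build_loader_command(dataset, conf="conf/server.conf"):
    """Return the loader.py invocation for one dataset as an argument list."""
    return ["python", "loader.py", "--conf", conf, "--dataset", dataset]


def load_datasets(datasets, conf="conf/server.conf", dry_run=True):
    """Run loader.py once per dataset; with dry_run=True only print the commands."""
    commands = [build_loader_command(name, conf) for name in datasets]
    for command in commands:
        if dry_run:
            print(shlex.join(command))
        else:
            # check=True raises CalledProcessError if loader.py exits non-zero.
            subprocess.run(command, check=True)
    return commands


# Dry run: prints the two commands and executes nothing.
commands = load_datasets(OPENAI_DATASETS)
```

Calling `load_datasets(OPENAI_DATASETS, dry_run=False)` would execute the loader for each dataset in turn and stop at the first failure.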

You can also load your own datasets from the Hugging Face Hub, as long as they use the following columns:

Column       Type
text         string
embeddings   list[float]
model        string
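Before publishing a custom dataset, it can help to check that every row matches the three required columns. The sketch below is illustrative only: `validate_row` is a hypothetical helper, not a Vigil function, and the embedding values are short placeholders rather than real model output.

```python
# Required columns and the Python type each one should hold.
REQUIRED_COLUMNS = {"text": str, "embeddings": list, "model": str}


def validate_row(row):
    """Return a list of problems found in one dataset row (empty if valid)."""
    problems = []
    for column, expected in REQUIRED_COLUMNS.items():
        if column not in row:
            problems.append(f"missing column: {column}")
        elif not isinstance(row[column], expected):
            problems.append(f"{column} should be {expected.__name__}")
    embeddings = row.get("embeddings")
    if isinstance(embeddings, list):
        # The column type is list[float]: reject empty lists and non-float items.
        if not embeddings or not all(isinstance(v, float) for v in embeddings):
            problems.append("embeddings should be a non-empty list of floats")
    return problems


# Placeholder rows; real embeddings are much longer vectors.
rows = [
    {"text": "Ignore previous instructions",
     "embeddings": [0.12, -0.03, 0.88],
     "model": "all-MiniLM-L6-v2"},
    {"text": "missing vector", "model": "all-MiniLM-L6-v2"},
]

for index, row in enumerate(rows):
    print(index, validate_row(row) or "ok")
```

The `model` value should name the embedding model that produced the vectors, so the rows match the model the vector database scanner is configured to use.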

Dataset repos for each embedding model:

  • text-embedding-ada-002: vigil-instruction-bypass-ada-002, vigil-jailbreaks-ada-002

  • all-MiniLM-L6-v2: vigil-instruction-bypass-all-MiniLM-L6-v2, vigil-jailbreaks-all-MiniLM-L6-v2

  • all-mpnet-base-v2: vigil-instruction-bypass-all-mpnet-base-v2, vigil-jailbreak-all-mpnet-base-v2