๐Ÿ”ฌ
Vigil: Documentation
GitHub Repo
  • ๐Ÿ›ก๏ธVigil
  • Overview
    • ๐Ÿ—๏ธRelease Blog
    • ๐Ÿ› ๏ธInstall Vigil
      • ๐Ÿ”ฅInstall PyTorch (optional)
    • ๐ŸงชUse Vigil
      • โš™๏ธConfiguration
        • ๐Ÿ”„Auto-updating vector database
      • ๐Ÿ—„๏ธLoad Datasets
      • ๐ŸŒWeb server
        • ๐ŸคAPI Endpoints
        • ๐ŸชWeb UI playground
      • ๐ŸPython library
      • ๐ŸŽฏScanners
        • ๐Ÿค—Transformer
        • โ•YARA / Heuristics
        • ๐Ÿ“‘Prompt-response Similarity
        • ๐Ÿ’พVector database
        • ๐ŸคCanary Tokens
    • ๐Ÿ›ก๏ธCustomize Detections
      • ๐ŸŒŸAdd custom YARA signatures
      • ๐Ÿ”ขAdd embeddings
      • ๐ŸCustom scanners
    • ๐Ÿช„Sample scan results
Powered by GitBook
On this page
  • High-level concepts
  • ScanModel
  • BaseScanner
  • Registry
  1. Overview
  2. Customize Detections

Custom scanners

[DRAFT: WORK IN PROGRESS]

[DRAFT: WORK IN PROGRESS] This section is a work in progress and should not be considered complete or comprehensive.

You can extend the functionality of Vigil by creating and implementing your own scanner module.

High-level concepts

A scanner performs some type of analysis on text data (prompts and responses) and updates the results list for that data.

Scanners can also access the vector database and embedding functions, as well as be passed options from a Vigil configuration file.

ScanModel

Scanners are passed text data in the form of a ScanModel and can perform analysis on the prompt, prompt_response, or both.

Once the task is completed, the scanner should update the ScanModel.results list and return the updated ScanModel.

class ScanModel(BaseModel):
    prompt: str = ''
    prompt_response: Optional[str] = None
    results: List[Dict[str, Any]] = [

BaseScanner

Scanners must subclass the BaseScanner.

A scanner must implement an analyze() function that accepts a ScanModel and UUID.

The post_init function is also available, which is called as a post-initialization hook after a scanner is created. This can be used for any additional steps required to prep the environment for the scanner, such as loading signatures or updating a database.

class BaseScanner(ABC):
    def __init__(self, name: str = '') -> None:
        self.name = name

    @abstractmethod
    def analyze(self, scan_obj: ScanModel, scan_id: UUID = uuid4()) -> ScanModel:
        raise NotImplementedError('This method needs to be overridden in the subclass.')

    def post_init(self):
        """ Optional post-initialization method """
        pass

The UUID represents the scan action within Vigil dispatch and can be used in log messages or other tracking.

Registry

Vigil dynamically loads scanners that are properly registered using the Registration.scanner decorator.

Scanners must import the Registration class and decorate their classes as seen below.

from vigil.registry import Registration

@Registration.scanner(name='example', requires_config=False, requires_embedding=False, requires_vectordb=False)
class ExampleScanner(BaseScanner):
    def __init__(self):
        pass

requires_config

This argument specifies whether the scanner requires any configuration options from the Vigil config file (that you passed to Vigil.from_config.

If set to True, Vigil will look in that config file for a section named scanner:$name and pass any key:value options in that section to the registered scanner as keyword arguments.

Config example

[scanner:example]
threshold = 0.5
@Registration.scanner(name='example', requires_config=True)
class ExampleScanner(BaseScanner):
    """ Compare the cosine similarity of the prompt and response """
    def __init__(self, threshold: float):
        self.threshold = float(threshold)

requires_embedding

In the example below, the Embedder class is passed to the scanner as the embedder Callable.

from typing import Callable

@Registration.scanner(name='example', requires_embedding=True)
class ExampleScanner(BaseScanner):
    def __init__(self, embedder: Callable):
        self.embedder = embedder

    def analyze(self, scan_obj: ScanModel, scan_id: uuid.uuid4) -> ScanModel:
        prompt_embedding = self.embedder.generate(scan_obj.prompt)

requires_vectordb

This argument determines if the scanner has access to the VectorDB class and its functions:

  • add_texts(texts: List[str], metadatas: List[dict])

  • add_embeddings(texts: List[str], embeddings: List[List], metadatas: List[dict])

  • query(text: str)

PreviousAdd embeddingsNextSample scan results

Last updated 1 year ago

This argument determines if the scanner has access to the Embedder() class from . The Embedder class is initialized when Vigil.from_config() is called and provides the ability to generate text embeddings using the model specified in the config file.

๐Ÿ›ก๏ธ
๐Ÿ
vigil/core/embedding.py