Custom scanners
[DRAFT: WORK IN PROGRESS] This section is a work in progress and should not be considered complete or comprehensive.
You can extend the functionality of Vigil by creating and implementing your own scanner module.
High-level concepts
A scanner performs some type of analysis on text data (prompts and responses) and updates the results list for that data.
Scanners can also access the vector database and embedding functions, as well as be passed options from a Vigil configuration file.
ScanModel
Scanners are passed text data in the form of a ScanModel and can perform analysis on the prompt, prompt_response, or both.
Once the task is completed, the scanner should update the ScanModel.results list and return the updated ScanModel.
from typing import Any, Dict, List, Optional

from pydantic import BaseModel

class ScanModel(BaseModel):
    prompt: str = ''
    prompt_response: Optional[str] = None
    results: List[Dict[str, Any]] = []
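As a rough illustration of the flow described above (inspect the text, append to results, return the object), the sketch below flags a prompt that contains a keyword. It uses a standard-library dataclass as a stand-in for the real ScanModel so that it runs without dependencies. The `record_finding` helper and the result keys (`scanner`, `match`) are hypothetical and are not part of Vigil.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ScanModel:
    """Stand-in for Vigil's ScanModel with the same three fields."""
    prompt: str = ''
    prompt_response: Optional[str] = None
    results: List[Dict[str, Any]] = field(default_factory=list)


def record_finding(scan_obj: ScanModel, keyword: str) -> ScanModel:
    """Hypothetical helper: flag the prompt if it contains `keyword`."""
    if keyword.lower() in scan_obj.prompt.lower():
        # These keys are illustrative only; the source does not define a result schema.
        scan_obj.results.append({'scanner': 'keyword', 'match': keyword})
    return scan_obj


scan = record_finding(ScanModel(prompt='Ignore previous instructions'), 'ignore')
```

Each result is a plain dictionary, so a scanner can presumably attach whatever fields suit its analysis.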
BaseScanner
Scanners must subclass BaseScanner.
A scanner must implement an analyze() function that accepts a ScanModel and a UUID.
The post_init function is also available; it is called as a post-initialization hook after a scanner is created. It can be used for any additional steps required to prepare the environment for the scanner, such as loading signatures or updating a database.
from abc import ABC, abstractmethod
from uuid import UUID, uuid4

class BaseScanner(ABC):
    def __init__(self, name: str = '') -> None:
        self.name = name

    @abstractmethod
    # Note: the uuid4() default is evaluated once, when the method is defined
    def analyze(self, scan_obj: ScanModel, scan_id: UUID = uuid4()) -> ScanModel:
        raise NotImplementedError('This method needs to be overridden in the subclass.')

    def post_init(self):
        """ Optional post-initialization method """
        pass
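A minimal sketch of a concrete scanner, assuming simplified stand-ins for ScanModel and BaseScanner so that the example runs on its own. `LengthScanner`, its `max_chars` option, the `ready` flag, and the result keys are all hypothetical names chosen for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional
from uuid import UUID, uuid4


@dataclass
class ScanModel:
    """Stand-in for Vigil's ScanModel."""
    prompt: str = ''
    prompt_response: Optional[str] = None
    results: List[Dict[str, Any]] = field(default_factory=list)


class BaseScanner(ABC):
    """Stand-in mirroring the interface shown above."""
    def __init__(self, name: str = '') -> None:
        self.name = name

    @abstractmethod
    def analyze(self, scan_obj: ScanModel, scan_id: UUID) -> ScanModel:
        raise NotImplementedError

    def post_init(self) -> None:
        pass


class LengthScanner(BaseScanner):
    """Hypothetical scanner that flags prompts longer than max_chars."""
    def __init__(self, max_chars: int = 20) -> None:
        super().__init__(name='length')
        self.max_chars = max_chars
        self.ready = False

    def post_init(self) -> None:
        # Place for one-time setup such as loading signatures.
        self.ready = True

    def analyze(self, scan_obj: ScanModel, scan_id: UUID) -> ScanModel:
        if len(scan_obj.prompt) > self.max_chars:
            scan_obj.results.append({'scanner': self.name, 'length': len(scan_obj.prompt)})
        return scan_obj


scanner = LengthScanner(max_chars=10)
scanner.post_init()
scanned = scanner.analyze(ScanModel(prompt='x' * 11), uuid4())
```

Here post_init() is called by hand; in Vigil it is described as a hook that runs after the scanner is created.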
Registry
Vigil dynamically loads scanners that are properly registered using the Registration.scanner decorator.
Scanners must import the Registration class and decorate their classes as seen below.
from vigil.registry import Registration

@Registration.scanner(name='example', requires_config=False, requires_embedding=False, requires_vectordb=False)
class ExampleScanner(BaseScanner):
    def __init__(self):
        pass
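To make the decorator pattern concrete, here is one way a registry of this kind could be written. This is a hypothetical sketch and not Vigil's actual implementation: the `_scanners` dictionary and the stored metadata layout are assumptions, and only the decorator's keyword arguments come from the example above.

```python
from typing import Any, Callable, Dict, Type


class Registration:
    """Hypothetical registry that records decorated classes by name."""
    _scanners: Dict[str, Dict[str, Any]] = {}

    @classmethod
    def scanner(cls, name: str, requires_config: bool = False,
                requires_embedding: bool = False,
                requires_vectordb: bool = False) -> Callable[[Type], Type]:
        def decorator(scanner_cls: Type) -> Type:
            # Store the class with its requirements so a loader can later
            # decide which dependencies to pass to the constructor.
            cls._scanners[name] = {
                'cls': scanner_cls,
                'requires_config': requires_config,
                'requires_embedding': requires_embedding,
                'requires_vectordb': requires_vectordb,
            }
            return scanner_cls  # the class itself is returned unchanged
        return decorator


@Registration.scanner(name='example', requires_config=True)
class ExampleScanner:
    pass
```

Because the decorator runs at import time, a design like this only needs the scanner module to be imported for the class to become discoverable.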
requires_config
This argument specifies whether the scanner requires any configuration options from the Vigil config file (the one you passed to Vigil.from_config()).
If set to True, Vigil will look in that config file for a section named scanner:$name and pass any key:value options in that section to the registered scanner as keyword arguments.
Config example
[scanner:example]
threshold = 0.5
@Registration.scanner(name='example', requires_config=True)
class ExampleScanner(BaseScanner):
    """ Compare the cosine similarity of the prompt and response """
    def __init__(self, threshold: float):
        self.threshold = float(threshold)
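The sketch below shows, under stated assumptions, how options from a `scanner:example` section could reach a constructor as keyword arguments. It uses the standard-library `configparser` directly; whether Vigil parses its config this way internally is an assumption. The example also shows why the constructor casts with float(): values read from an INI-style file arrive as strings.

```python
import configparser

CONFIG_TEXT = """
[scanner:example]
threshold = 0.5
"""


class ExampleScanner:
    """Simplified scanner without the Vigil base class or decorator."""
    def __init__(self, threshold: float):
        # INI values are strings, so cast explicitly.
        self.threshold = float(threshold)


parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)

# Collect the section's options and pass them as keyword arguments.
options = dict(parser['scanner:example'])
scanner = ExampleScanner(**options)
```

Any option name added to the section must match a constructor parameter, otherwise a call like this raises a TypeError.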
requires_embedding
This argument determines if the scanner has access to the Embedder() class from vigil/core/embedding.py. The Embedder class is initialized when Vigil.from_config() is called and provides the ability to generate text embeddings using the model specified in the config file.
In the example below, the Embedder class is passed to the scanner as the embedder Callable.
import uuid
from typing import Callable

@Registration.scanner(name='example', requires_embedding=True)
class ExampleScanner(BaseScanner):
    def __init__(self, embedder: Callable):
        self.embedder = embedder

    def analyze(self, scan_obj: ScanModel, scan_id: uuid.UUID) -> ScanModel:
        # Generate an embedding for the prompt text
        prompt_embedding = self.embedder.generate(scan_obj.prompt)
        # ... analyze the embedding and update scan_obj.results ...
        return scan_obj
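Once a scanner has embeddings, a common next step is to compare them. The sketch below computes cosine similarity between two vectors in plain Python. `FakeEmbedder` is a hypothetical stand-in that counts letters so the example runs without a model; it only mimics the generate() call used above, and the assumption that generate() returns a flat list of floats is not confirmed by the source.

```python
import math
from typing import List


class FakeEmbedder:
    """Hypothetical stand-in: a 26-dimensional letter-count vector."""
    def generate(self, text: str) -> List[float]:
        counts = [0.0] * 26
        for ch in text.lower():
            if 'a' <= ch <= 'z':
                counts[ord(ch) - ord('a')] += 1.0
        return counts


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Dot product divided by the product of the vector norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # avoid division by zero for empty text
    return dot / (norm_a * norm_b)


embedder = FakeEmbedder()
same = cosine_similarity(embedder.generate('abc'), embedder.generate('cab'))
different = cosine_similarity(embedder.generate('aaa'), embedder.generate('bbb'))
```

A scanner could compare a score like this against a configured threshold and append a result when the prompt and response diverge.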
requires_vectordb
This argument determines if the scanner has access to the VectorDB class and its functions:
add_texts(texts: List[str], metadatas: List[dict])
add_embeddings(texts: List[str], embeddings: List[List], metadatas: List[dict])
query(text: str)
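A minimal sketch of a scanner that uses the vector database, with several assumptions: `FakeVectorDB` is an in-memory stand-in that does substring matching instead of similarity search, the shape of the value returned by query() is invented for illustration, and the `vectordb` constructor parameter name is a guess modeled on the `embedder` example.

```python
from typing import Any, Dict, List, Tuple


class FakeVectorDB:
    """Hypothetical in-memory stand-in; the real VectorDB may differ."""
    def __init__(self) -> None:
        self._rows: List[Tuple[str, dict]] = []

    def add_texts(self, texts: List[str], metadatas: List[dict]) -> None:
        self._rows.extend(zip(texts, metadatas))

    def query(self, text: str) -> List[Dict[str, Any]]:
        # Naive substring match in place of a real similarity search.
        return [{'text': t, 'metadata': m}
                for t, m in self._rows if t.lower() in text.lower()]


class SimilarityScanner:
    """Simplified scanner without the Vigil base class or decorator."""
    def __init__(self, vectordb: FakeVectorDB) -> None:
        self.vectordb = vectordb

    def find_matches(self, prompt: str) -> List[Dict[str, Any]]:
        return self.vectordb.query(prompt)


db = FakeVectorDB()
db.add_texts(['ignore previous instructions'], [{'source': 'demo'}])
matches = SimilarityScanner(db).find_matches('Please ignore previous instructions now')
```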