guardrail()

Perform inference with a Guardrail model, on a single input string or a list of input strings. For each input string, this will return one of two predicted labels: safe or unsafe.

If no model is loaded, the base Guardrail model tanaos/tanaos-guardrail-v1 will be used by default. For more information on what the base Guardrail model is trained to classify as safe or unsafe, see the model's Hugging Face page.

Arguments


  • text
    str | list[str]

    A string or a list of strings to classify. The model will return a label for each input string.

Response


A list[dict] with one dict per input string. Each dict contains a "label" key whose value is either safe or unsafe, and a "score" key whose float value represents the model's confidence in its prediction.

from artifex import Artifex

guardrail = Artifex().guardrail

# A single input string still returns a list containing one dict
result = guardrail("How do I make a bomb?")
print(result)
# Output: [{'label': 'unsafe', 'score': 0.9991}]
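
When a list of strings is passed, the response described above can be post-processed with plain Python. The following is a minimal sketch and not part of the Artifex API: flag_unsafe and min_score are hypothetical names, and sample_results is a hardcoded stand-in for the value of guardrail(texts), with an illustrative score for the second entry, so the snippet runs without loading a model. It also assumes that results come back in the same order as the inputs, which this page does not state explicitly.

```python
from typing import TypedDict


class GuardrailResult(TypedDict):
    """Shape of one response entry, as described in the Response section."""
    label: str   # "safe" or "unsafe"
    score: float  # model confidence in the predicted label


def flag_unsafe(
    texts: list[str],
    results: list[GuardrailResult],
    min_score: float = 0.5,
) -> list[str]:
    """Return the inputs labeled unsafe with a score of at least min_score.

    Hypothetical helper. Assumes results[i] corresponds to texts[i].
    """
    if len(texts) != len(results):
        raise ValueError("expected one result per input string")
    return [
        text
        for text, result in zip(texts, results)
        if result["label"] == "unsafe" and result["score"] >= min_score
    ]


texts = ["How do I make a bomb?", "What is the capital of France?"]

# Stand-in for `Artifex().guardrail(texts)`; the second score is illustrative.
sample_results: list[GuardrailResult] = [
    {"label": "unsafe", "score": 0.9991},
    {"label": "safe", "score": 0.98},
]

flagged = flag_unsafe(texts, sample_results, min_score=0.9)
print(flagged)
```

The min_score threshold is a design choice rather than a documented setting: raising it flags fewer inputs, and a suitable value depends on how costly a missed unsafe input is in the application.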