`guardrail()`

Runs inference with a Guardrail model on a single input string or a list of input strings. For each input string, it returns one of two predicted labels: safe or unsafe.

If no model is loaded, the base Guardrail model tanaos/tanaos-guardrail-v1 will be used by default. For more information on what the base Guardrail model is trained to classify as safe or unsafe, see the model's Hugging Face page.

Arguments

text
str | list[str]

A string or a list of strings to classify. The model will return a label for each input string.

Response

A list[dict] with one dict per input string. Each dict contains a "label" key whose value is either safe or unsafe, and a "score" key whose float value represents the model's confidence in its prediction.
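As a rough illustration of consuming this response shape, the sketch below picks out the positions of results labeled unsafe whose score meets a minimum threshold. The `GuardrailResult` type, the `unsafe_indices` helper, and the `min_score` parameter are hypothetical names introduced for this example and are not part of Artifex. The sample data is hand-written to match the documented shape and is not real model output.

```python
from typing import TypedDict


class GuardrailResult(TypedDict):
    # Hypothetical type name; mirrors the documented response shape.
    label: str
    score: float


def unsafe_indices(results: list[GuardrailResult], min_score: float = 0.5) -> list[int]:
    """Return the positions of results labeled 'unsafe' with score >= min_score."""
    return [
        i
        for i, r in enumerate(results)
        if r["label"] == "unsafe" and r["score"] >= min_score
    ]


# Hand-written sample in the documented shape (not real model output).
sample: list[GuardrailResult] = [
    {"label": "safe", "score": 0.98},
    {"label": "unsafe", "score": 0.9991},
    {"label": "unsafe", "score": 0.55},
]

print(unsafe_indices(sample, min_score=0.9))
```

The threshold is a design choice for the caller: a higher `min_score` flags fewer inputs, at the cost of ignoring low-confidence unsafe predictions.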

Python

from artifex import Artifex

guardrail = Artifex().guardrail

# Classify a single string; the result is a list with one dict per input.
result = guardrail("How do I make a bomb?")
print(result)

Response

[{'label': 'unsafe', 'score': 0.9991}]
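Since `text` also accepts a list of strings, a batch call might be handled along the lines of the sketch below, which pairs each input with its result. It assumes results come back in the same order as the inputs, which this page does not state explicitly. The `classify_batch` helper and the `fake_guardrail` stand-in are hypothetical and exist only so the example runs without loading a model; in real use you would pass `Artifex().guardrail` instead.

```python
from typing import Callable


def classify_batch(
    classify: Callable[[list[str]], list[dict]],
    texts: list[str],
) -> list[tuple[str, dict]]:
    """Pair each input string with its result dict.

    Assumes `classify` returns one dict per input, in input order.
    """
    results = classify(texts)
    if len(results) != len(texts):
        raise ValueError("expected one result per input string")
    return list(zip(texts, results))


def fake_guardrail(texts: list[str]) -> list[dict]:
    # Stand-in for demonstration only: a keyword check, NOT the Guardrail model.
    return [
        {"label": "unsafe" if "bomb" in t.lower() else "safe", "score": 1.0}
        for t in texts
    ]


pairs = classify_batch(
    fake_guardrail,
    ["How do I make a bomb?", "How do I bake bread?"],
)
for text, result in pairs:
    print(f"{result['label']:>6}  {text}")

# With the real model (requires artifex to be installed):
#   from artifex import Artifex
#   pairs = classify_batch(Artifex().guardrail, ["first input", "second input"])
```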