text_anonymization()

Runs inference with a Text Anonymization model on a single input string or a list of input strings. For each input string, it returns an anonymized version in which all detected Personally Identifiable Information (PII) has been replaced with a placeholder.

If no model is loaded, the base Text Anonymization model tanaos/tanaos-text-anonymizer-v1 is used by default. For details on which Personally Identifiable Information (PII) the base Text Anonymization model is trained to detect, see the model's Hugging Face page.

Arguments


  • text
    str | list[str]

    A string or a list of strings from which to remove Personally Identifiable Information (PII). The model returns a version of each input text in which all detected PII has been replaced with a placeholder.
  • entities_to_mask
    list[str]
    optional

    A list of entity types to mask in the input text(s). If not provided, all supported entity types will be masked. Supported entity types are "PERSON", "LOCATION", "DATE", "ADDRESS" and "PHONE_NUMBER".
  • mask_token
    str
    optional
    default: '[MASKED]'

    The placeholder string to use for masking detected PII. Default is "[MASKED]".
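
To make the effect of entities_to_mask and mask_token concrete, here is a minimal sketch that applies both to hand-written entity spans. It does not call Artifex. The apply_mask and span_of helpers and the (start, end, label) span format are assumptions made for illustration, not the library's internals; only the entity type names and the default placeholder come from the argument list above.

```python
# Entity types listed in the argument description above.
SUPPORTED_ENTITIES = {"PERSON", "LOCATION", "DATE", "ADDRESS", "PHONE_NUMBER"}


def span_of(text, fragment, label):
    """Hypothetical helper: locate `fragment` in `text` and tag it with `label`."""
    start = text.index(fragment)
    return (start, start + len(fragment), label)


def apply_mask(text, spans, entities_to_mask=None, mask_token="[MASKED]"):
    """Hypothetical helper: replace each selected span in `text` with `mask_token`.

    Spans are supplied by hand here; the real model detects them itself.
    """
    # No explicit list means every supported entity type is masked.
    selected = SUPPORTED_ENTITIES if entities_to_mask is None else set(entities_to_mask)
    unknown = selected - SUPPORTED_ENTITIES
    if unknown:
        raise ValueError(f"Unsupported entity types: {sorted(unknown)}")
    # Work right to left so earlier offsets stay valid after each replacement.
    for start, end, label in sorted(spans, reverse=True):
        if label in selected:
            text = text[:start] + mask_token + text[end:]
    return text


text = "John Doe lives at 123 Main St, New York. His phone number is (555) 123-4567."
spans = [
    span_of(text, "John Doe", "PERSON"),
    span_of(text, "123 Main St, New York", "ADDRESS"),
    span_of(text, "(555) 123-4567", "PHONE_NUMBER"),
]

# Defaults: every supported entity type is replaced with "[MASKED]".
print(apply_mask(text, spans))

# Only names are masked, using a custom placeholder.
print(apply_mask(text, spans, entities_to_mask=["PERSON"], mask_token="[REDACTED]"))
```

With the defaults, every tagged span is replaced. Restricting entities_to_mask to "PERSON" leaves the address and phone number untouched.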

Response


A list[str], one per input text, including when a single string is passed. Each string is a version of the corresponding input string in which all detected Personally Identifiable Information (PII) has been replaced with the placeholder.
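
Because the response is a list even for a single input string, calling code usually needs to pair each result with its input or unwrap the single entry. The sketch below shows one way to do that. The pair_with_inputs helper is a hypothetical name introduced here, and the response list is copied from the documented example output rather than produced by a model run.

```python
def pair_with_inputs(text, anonymized):
    """Hypothetical helper: pair each original input with its anonymized counterpart."""
    # Accept the same shapes as the `text` argument: one string or a list of strings.
    inputs = [text] if isinstance(text, str) else list(text)
    if len(inputs) != len(anonymized):
        raise ValueError("Expected one anonymized string per input text")
    return list(zip(inputs, anonymized))


original = "John Doe lives at 123 Main St, New York. His phone number is (555) 123-4567."
# Stand-in for a real response, taken from the documented example output.
response = ["[MASKED] lives at [MASKED]. His phone number is [MASKED]."]

pairs = pair_with_inputs(original, response)
source_text, masked_text = pairs[0]
print(masked_text)
```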

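from artifex import Artifex

# Access the Text Anonymization task; the base model is used if none is loaded.
ta = Artifex().text_anonymization

# A single string is accepted; the result is still a list with one entry.
anonymized_text = ta("John Doe lives at 123 Main St, New York. His phone number is (555) 123-4567.")
print(anonymized_text)
# Output: ["[MASKED] lives at [MASKED]. His phone number is [MASKED]."]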