🥸 Text Anonymization model
Use the default Text Anonymization model
Need a general-purpose Text Anonymization model? Artifex's default Text Anonymization model is trained to recognize and mask five types of Personally Identifiable Information (PII) out of the box:
PERSON, LOCATION, DATE, ADDRESS, PHONE_NUMBER
from artifex import Artifex
ta = Artifex().text_anonymization
print(ta("John Doe lives at 123 Main St, New York. His phone number is (555) 123-4567."))
# >>> ["[MASKED] lives at [MASKED]. His phone number is [MASKED]."]
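To make the output format concrete, here is a minimal sketch of what masking means at the string level: every detected PII span is replaced with the `[MASKED]` placeholder. This is not Artifex's internal implementation. The helpers `span_of` and `mask_spans` and the constant `MASK_TOKEN` are hypothetical names used only for illustration, and the spans are located by hand rather than by a model.

```python
from typing import List, Tuple

MASK_TOKEN = "[MASKED]"  # placeholder seen in the example output above


def span_of(text: str, substring: str) -> Tuple[int, int]:
    """Return the (start, end) character offsets of the first occurrence of substring."""
    start = text.index(substring)
    return start, start + len(substring)


def mask_spans(text: str, spans: List[Tuple[int, int]]) -> str:
    """Replace each (start, end) character span in text with MASK_TOKEN.

    Hypothetical helper: a real model would predict these spans itself.
    """
    # Merge overlapping or touching spans so each region is masked only once.
    merged: List[Tuple[int, int]] = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))

    # Stitch the unmasked text back together around the placeholders.
    pieces = []
    cursor = 0
    for start, end in merged:
        pieces.append(text[cursor:start])
        pieces.append(MASK_TOKEN)
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


text = "John Doe lives at 123 Main St, New York. His phone number is (555) 123-4567."
spans = [
    span_of(text, "John Doe"),               # PERSON
    span_of(text, "123 Main St, New York"),  # ADDRESS
    span_of(text, "(555) 123-4567"),         # PHONE_NUMBER
]
masked = mask_spans(text, spans)
print(masked)
```

With the hand-picked spans above, the sketch produces the same masked sentence as the default model's example output.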
Learn more about the default Text Anonymization model on its Hugging Face model page.
Create & use a custom Text Anonymization model
Want to tailor the model to your specific domain for better results? Fine-tune your own Text Anonymization model, run it locally on CPU, and keep it forever:
from artifex import Artifex
ta = Artifex().text_anonymization
model_output_path = "./output_model/"
ta.train(
    domain="medical documents",  # change to your desired domain
    output_path=model_output_path
)
ta.load(model_output_path)
print(ta("The patient John Doe visited New York on 12th March 2023 at 10:30 AM."))
# >>> ["The patient [MASKED] visited [MASKED] on [MASKED] at [MASKED]."]
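After fine-tuning, it can be useful to spot-check the anonymized output for PII that slipped through. The sketch below is one possible sanity check, not part of Artifex: `find_residual_phone_numbers` and `PHONE_PATTERN` are hypothetical names, and the regular expression only covers common US-style phone numbers. It works on plain strings, so it applies to the list returned by the model in the examples above.

```python
import re
from typing import List

# Assumption: US-style numbers such as "(555) 123-4567" or "555-123-4567".
PHONE_PATTERN = re.compile(r"\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}")


def find_residual_phone_numbers(outputs: List[str]) -> List[str]:
    """Return every phone-number-like string still present in the anonymized outputs."""
    leaks: List[str] = []
    for output in outputs:
        leaks.extend(PHONE_PATTERN.findall(output))
    return leaks


# Output copied from the example above: nothing should be flagged.
clean = ["The patient [MASKED] visited [MASKED] on [MASKED] at [MASKED]."]
# A deliberately leaky string to show what a miss looks like.
leaky = ["[MASKED] lives at [MASKED]. His phone number is (555) 123-4567."]

print(find_residual_phone_numbers(clean))
print(find_residual_phone_numbers(leaky))
```

A pattern-based check like this only catches well-formatted numbers, so treat an empty result as a quick smoke test rather than proof that the text is fully anonymized.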