About Artifex

Artifex is an open-source Python library developed by us at Tanaos for:

Using small, pre-trained task-specific LLMs locally on CPU
Fine-tuning them on CPU without any training data — just based on your instructions for the task at hand.

The problem

LLMs available on the market can be broadly classified into two categories:

General-purpose LLMs (GPT, Claude, Llama, etc.) have two main limitations:
1. They are designed for open-ended tasks, which makes them overkill and often suboptimal for simpler, specific use cases.
2. Open-source models require expensive GPUs for training and inference. Closed-source models incur high API usage costs and raise data privacy concerns, since your data is sent to third-party servers.
Smaller LLMs (DistilBERT, TinyBERT, etc.) can sometimes be trained and run locally on CPU, but they require large amounts of labeled training data to perform well on specific tasks — which is often not available.

Artifex overcomes these limitations by enabling you to:

Use small (capped at 500 MB in size), pre-trained task-specific LLMs locally on CPU, thereby eliminating costs and data privacy concerns.
Fine-tune these models from your instructions for the task at hand, without any training data, to improve accuracy on your specific use case.

How is it possible?
Artifex generates synthetic training data on the fly from your instructions and uses it to fine-tune small LLMs for your specific task. This lets you build effective models without large labeled datasets.
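
The idea of "generate synthetic labeled data from instructions, then train on it" can be pictured with a toy sketch. This is not Artifex's implementation: the source does not say how the data is generated or which models are tuned. The `INSTRUCTIONS` format, the template-based generator, and the naive Bayes classifier below are all stand-in assumptions, chosen only because they run on the standard library.

```python
import math
import random
from collections import Counter, defaultdict

# Hypothetical "instructions": a small vocabulary per label (an assumption,
# not Artifex's instruction format).
INSTRUCTIONS = {
    "spam": ["free", "prize", "winner", "click", "offer"],
    "ham": ["meeting", "report", "lunch", "schedule", "invoice"],
}
TEMPLATES = [
    "please see the {a} and the {b}",
    "{a} {b} now",
    "reminder about the {a} {b}",
]


def generate_synthetic(instructions, n_per_label, seed=0):
    """Create (text, label) pairs by filling templates with label vocabulary."""
    rng = random.Random(seed)
    data = []
    for label, vocab in instructions.items():
        for _ in range(n_per_label):
            a, b = rng.sample(vocab, 2)
            data.append((rng.choice(TEMPLATES).format(a=a, b=b), label))
    return data


class NaiveBayes:
    """Tiny bag-of-words classifier standing in for the fine-tuned model."""

    def fit(self, data):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter()
        for text, label in data:
            self.label_counts[label] += 1
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus Laplace-smoothed log likelihood of each word.
            score = math.log(n / total)
            for word in text.lower().split():
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# No hand-labeled data is used: the training set comes from the instructions.
data = generate_synthetic(INSTRUCTIONS, n_per_label=50)
model = NaiveBayes().fit(data)
print(model.predict("click for your free prize"))
```

In a realistic setup the generator would be far richer than fixed templates, but the shape stays the same: instructions in, labeled examples out, then ordinary supervised training.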

Install Artifex via pip:

pip install artifex
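
To confirm the install from Python, one option is to ask the standard library for the installed distribution's version. The helper name `installed_version` is made up for this sketch; the only thing taken from the text above is the distribution name `artifex`.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints a version string if `pip install artifex` succeeded, otherwise None.
print(installed_version("artifex"))
```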

We currently support the following tasks:

Text Classification: Performs general-purpose text classification based on your requirements.
Guardrail: Flags unsafe, harmful, or off-topic messages.
Intent Classification: Classifies user messages into predefined intent categories.
Reranker: Ranks a list of items or search results based on relevance to a query.
Sentiment Analysis: Determines the sentiment (positive, negative, neutral) of a given text.
Emotion Detection: Identifies the emotion expressed in a given text.
Named Entity Recognition (NER): Detects and classifies named entities in text (e.g., persons, organizations, locations).
Text Anonymization: Removes personally identifiable information (PII) from text.
Spam Detection: Identifies whether a message is spam or not.
Topic Classification: Classifies text into predefined topics.

We will be adding more tasks soon, based on user feedback. Want Artifex to perform a specific task? Suggest one or vote one up.

For each task, Artifex provides three easy-to-use APIs:

Inference API to use a default, pre-trained small LLM to perform that task out-of-the-box locally on CPU.
Fine-tune (or train) API to fine-tune the default model based on your requirements, without any training data and on CPU. The fine-tuned model is generated on your machine and is yours to keep.
Load API to load your fine-tuned model locally on CPU, and use it for inference or further fine-tuning.
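
The inference, fine-tune, and load lifecycle described above can be pictured with a stand-in class. Nothing here is Artifex's actual API: `ToyKeywordModel`, its method names, and the keyword-overlap "model" are hypothetical, and serve only to show how the three steps relate.

```python
import json
import tempfile
from pathlib import Path


class ToyKeywordModel:
    """Stand-in for a task model (NOT Artifex's API); labels text by keyword overlap."""

    DEFAULT_KEYWORDS = {
        "positive": ["good", "great", "love"],
        "negative": ["bad", "awful", "hate"],
    }

    def __init__(self, keywords=None):
        source = keywords or self.DEFAULT_KEYWORDS
        # Copy the lists so training never mutates the class-level defaults.
        self.keywords = {label: list(words) for label, words in source.items()}

    def __call__(self, text):
        """Inference: return the label whose keywords overlap the text most."""
        words = set(text.lower().split())
        return max(self.keywords, key=lambda label: len(words & set(self.keywords[label])))

    def train(self, instructions):
        """'Fine-tune': merge instruction-supplied keywords into the model."""
        for label, extra in instructions.items():
            self.keywords.setdefault(label, []).extend(extra)
        return self

    def save(self, path):
        Path(path).write_text(json.dumps(self.keywords))

    @classmethod
    def load(cls, path):
        """Load: rebuild a model from a file written by save()."""
        return cls(json.loads(Path(path).read_text()))


# 1. Inference with the default model.
model = ToyKeywordModel()
default_label = model("great product")  # "positive"

# 2. Fine-tune from instructions, then persist the result locally.
model.train({"negative": ["refund", "broken"]})
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "model.json"
    model.save(path)
    # 3. Load the tuned model back and use it for inference.
    reloaded = ToyKeywordModel.load(path)

tuned_label = reloaded("arrived broken want a refund")  # "negative"
print(default_label, tuned_label)
```

The point of the sketch is the separation of concerns: the default model works out of the box, tuning produces an artifact that lives on your machine, and loading restores that artifact for inference or further tuning.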