Dataset Labeler generates training-ready captions for every image in your dataset — optimized per model for LoRA, Dreambooth, and fine-tuning workflows. No cloud. No API keys. No cost.
Writing captions by hand takes days. Cloud tools cost money and send your images to external servers. Dataset Labeler solves both problems and auto-selects the right caption style for your target model.
Each image generation model was trained on a different caption format. Select your target model and Dataset Labeler automatically applies the optimal style, token limit and prompt structure.
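One way to picture per-model selection is a lookup table keyed by target model. The sketch below is illustrative only: the `CaptionPreset` type, the preset names, the style labels and the limits are hypothetical placeholders, not the values Dataset Labeler ships with, and the budget is counted in whitespace-separated words rather than real tokenizer tokens.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaptionPreset:
    style: str      # e.g. natural-language sentences vs. comma-separated tags
    max_words: int  # crude stand-in for a tokenizer-based limit


# Hypothetical presets for illustration; real values depend on the model.
PRESETS = {
    "flux": CaptionPreset(style="natural", max_words=200),
    "sdxl": CaptionPreset(style="tags", max_words=60),
}


def fit_caption(caption: str, model: str) -> str:
    """Trim a caption to the word budget of the chosen preset."""
    preset = PRESETS[model.lower()]
    words = caption.split()
    return " ".join(words[: preset.max_words])
```

Keeping the presets in one table means a new target model only needs a new entry, not new code.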
The launcher handles the entire setup for you. No Terminal, no manual configuration.
spidermanxyz_001.png) and perfectly paired with their matching .txt caption files, ready to drop straight into Kohya or SimpleTuner.
Every feature is designed around one goal: making your training captions as accurate, detailed and model-compatible as possible.
spidermanxyz) and every caption will start with it instead of generic dictionary words. Completely prevents model concept bleeding.
IMG_9482.png. The app automatically sequences your entire dataset (e.g., trigger_001.png) and packs it into a ready-to-train ZIP.
We’re constantly refining the captioning engine based on the latest AI training research.
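The trigger-word and sequential-renaming features described above can be sketched in a few lines of Python. This is a minimal illustration, not the app's actual code: the `package_dataset` function, its parameters and the comma separator after the trigger word are assumptions made for the example.

```python
import zipfile
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def package_dataset(src_dir, captions, trigger, zip_path):
    """Write <trigger>_NNN images and matching .txt captions into a ZIP.

    `captions` maps each original file name to its caption text.
    Returns the archive member names in the order they were written.
    """
    images = sorted(
        p for p in Path(src_dir).iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
    )
    members = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for index, image in enumerate(images, start=1):
            stem = f"{trigger}_{index:03d}"
            caption = captions.get(image.name, "").strip()
            # Lead with the trigger word unless the caption already does.
            if not caption.startswith(trigger):
                caption = f"{trigger}, {caption}" if caption else trigger
            image_name = stem + image.suffix.lower()
            archive.write(image, arcname=image_name)
            archive.writestr(stem + ".txt", caption)
            members += [image_name, stem + ".txt"]
    return members
```

Calling `package_dataset("raw", {"IMG_9482.png": "a man in a red suit"}, "spidermanxyz", "dataset.zip")` would store the image as spidermanxyz_001.png next to spidermanxyz_001.txt, leaving the original files untouched.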
Most AI captioning tools send your images to external servers. Dataset Labeler never does.
Free download. macOS only. No setup beyond clicking Launch.
Everything you need to get Dataset Labeler running smoothly on your Mac.
ollama run qwen2.5vl:7b. This will finish the download with a detailed progress bar. Once it's done, restart Dataset Labeler.
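To confirm the model finished downloading before restarting, you can ask the local Ollama server which models it has. The sketch below assumes Ollama is running on its default address (http://localhost:11434) and uses its `/api/tags` endpoint; the helper names are made up for this example, and whether Dataset Labeler performs a similar check internally is not stated here.

```python
import json
import urllib.request


def installed_models(tags_payload: dict) -> set:
    """Extract model names from the JSON returned by Ollama's /api/tags."""
    return {entry["name"] for entry in tags_payload.get("models", [])}


def is_model_installed(name: str, base_url: str = "http://localhost:11434") -> bool:
    """Ask the local Ollama server whether `name` has been pulled."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
            payload = json.load(resp)
    except OSError:
        # Server not running or unreachable; treat as "not installed".
        return False
    return name in installed_models(payload)


if __name__ == "__main__":
    print(is_model_installed("qwen2.5vl:7b"))
```

The request goes to localhost only, so it does not send anything to the internet.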
We believe in local AI. That means no cloud, no tracking, and no hidden fees. But it also means navigating Apple's "walled garden." Here is exactly what is happening under the hood.
Yes. Dataset Labeler is built with Electron and Python—standard tools used by millions of developers. You can even check the local network traffic; the app makes zero outgoing requests to the internet after the initial model download.
Everything you need to know to start creating your own custom AI models.
Model training is the process of teaching an AI to recognize specific concepts. By showing the AI a dataset of images paired with descriptive captions, the model learns the relationship between words and pixels. Dataset Labeler automates the hardest part: writing the captions.
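Because training depends on every image having a caption, it can be worth checking the pairing before you start. The following is a minimal sketch, assuming captions live next to the images as same-named .txt files; the `find_unpaired` helper is hypothetical and not part of Dataset Labeler.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def find_unpaired(dataset_dir):
    """Return names of images that lack a non-empty .txt caption."""
    missing = []
    for image in sorted(Path(dataset_dir).iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        caption = image.with_suffix(".txt")
        if not caption.is_file() or not caption.read_text(encoding="utf-8").strip():
            missing.append(image.name)
    return missing
```

An empty list means every image in the folder has a caption file with text in it.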
LoRA (Low-Rank Adaptation) is a lightweight way to "fine-tune" a giant model like FLUX or SDXL. Instead of retraining the whole model (which is expensive), you train a tiny "plug-in" file (usually 50MB–300MB) that contains a specific person, style, or object.
To create your own custom AI, you need four key ingredients:
If this tool saved you hours of tagging or helped you train the perfect LoRA, consider buying me a coffee to keep the updates coming.