Florence-2 Caption Dataset Creator

[Florence-2 Flux Large] [Florence-2 Flux Base] [Florence-2 More Detailed] [MiaoshouAI PromptGen v2.0]

Instructions

Preview Caption

Upload a single image and generate a detailed caption
Try different models to compare results
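As a rough illustration of what the single-image preview does, the sketch below follows the usage pattern commonly published for Florence-2 checkpoints on Hugging Face. The app's own code is not shown on this page, so the task prompt, generation settings, and the helper names `extract_caption` and `caption_image` are assumptions, and fine-tuned checkpoints may expect a different task prompt.

```python
# Assumed task prompt; check the chosen model's card for the prompts it supports.
TASK = "<MORE_DETAILED_CAPTION>"


def extract_caption(parsed, task=TASK):
    """Pull the caption string out of the dict returned by post-processing."""
    return str(parsed.get(task, "")).strip()


def caption_image(image_path, model_id="MiaoshouAI/Florence-2-large-PromptGen-v2.0"):
    """Hypothetical helper: load a Florence-2 style model and caption one image."""
    # Heavy imports are kept local so extract_caption can be used without them.
    import torch
    from PIL import Image
    from transformers import AutoModelForCausalLM, AutoProcessor

    device = "cuda" if torch.cuda.is_available() else "cpu"
    # Florence-2 checkpoints ship custom model code, hence trust_remote_code=True.
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
    model = model.to(device).eval()
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

    image = Image.open(image_path).convert("RGB")
    inputs = processor(text=TASK, images=image, return_tensors="pt").to(device)

    with torch.no_grad():
        generated_ids = model.generate(
            input_ids=inputs["input_ids"],
            pixel_values=inputs["pixel_values"],
            max_new_tokens=1024,  # assumed limit, not taken from the app
            num_beams=3,
            do_sample=False,
        )

    text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
    parsed = processor.post_process_generation(
        text, task=TASK, image_size=(image.width, image.height)
    )
    return extract_caption(parsed)
```

Calling `caption_image("photo.jpg")` with different `model_id` values is one way to compare models outside the UI; the first call downloads the model weights.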

Create Dataset

Upload multiple images to process them all at once
Each image is captioned and saved alongside a .txt file with the same base name
By default, each caption begins with the placeholder [trigger]; you can replace it with your own trigger word
Click "Process Images" to generate captions and create a downloadable dataset
Use the download button to get a ZIP file containing all images and caption files
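The dataset layout described above can be sketched with the standard library alone. This is a minimal illustration, not the app's actual implementation: the function name `build_dataset_zip`, the single space between the trigger and the caption, and the flat archive layout are all assumptions.

```python
import zipfile
from pathlib import Path


def build_dataset_zip(captions, zip_path, trigger="[trigger]"):
    """Write each image plus a matching .txt caption file into one ZIP.

    captions: dict mapping an image path to its caption text.
    trigger:  text placed at the start of every caption (empty string to skip).
    """
    zip_path = Path(zip_path)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for image_path, caption in captions.items():
            image_path = Path(image_path)
            # Store the image under its bare file name (flat layout, assumed).
            zf.write(image_path, arcname=image_path.name)
            # Prefix the trigger; a single space as separator is an assumption.
            text = f"{trigger} {caption}".strip() if trigger else caption.strip()
            # The caption file shares the image's base name, e.g. cat.png -> cat.txt.
            zf.writestr(image_path.with_suffix(".txt").name, text)
    return zip_path
```

For example, `build_dataset_zip({"cat.png": "a cat on a sofa"}, "dataset.zip")` would produce an archive containing `cat.png` and `cat.txt`. Two images with the same file name in different folders would collide in this flat layout.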

Models Available

Florence-2-Flux: Faster version with good-quality captions
Florence-2-Flux-Large: Provides detailed captions with better image understanding
florence-2-large-ft-moredetailed: Fine-tuned specifically for more detailed captions
Florence-2-large-PromptGen-v2.0: Memory-efficient model with high-quality, detailed captions

MiaoshouAI/Florence-2-large-PromptGen-v2.0 Features

Improved quality for detailed captions
Memory-efficient (requires only ~1 GB of VRAM)
Fast generation while maintaining high quality
Supports multiple caption formats including detailed captions, tags, and analysis

Supported image formats: JPG, JPEG, PNG
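When preparing a folder for batch upload, a small filter like the one below can weed out unsupported files beforehand. It is only a sketch: the name `is_supported` is hypothetical, and matching extensions case-insensitively is an assumption about how the app behaves.

```python
from pathlib import Path

# Extensions taken from the supported formats listed above.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def is_supported(path):
    """Return True if the file extension is one of the supported image formats."""
    # Lower-casing the suffix treats "photo.JPG" like "photo.jpg" (assumed behavior).
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

A list of uploadable files could then be built with `[p for p in Path("images").iterdir() if is_supported(p)]`.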