Vision Datasets

Scripts and logic to create high-quality pre-training and fine-tuning datasets for multi-modal models!

License: MIT