Pico-Banana-400K: A Large-Scale Dataset for Text-Guided Image Editing

Recent advances in multimodal models have demonstrated strong text-guided image editing capabilities, with systems such as GPT-4o and Nano-Banana setting new benchmarks. However, the research community's progress remains constrained by the absence of large-scale, high-quality, and openly accessible datasets built from real images. We introduce Pico-Banana-400K, a comprehensive 400K-image dataset for instruction-based image editing. Our dataset is constructed by leveraging Nano-Banana to generate diverse edits from real images in the OpenImages collection. What distinguishes Pico-Banana-400K from previous datasets is our systematic approach to quality and diversity. We employ a fine-grained image editing taxonomy to ensure comprehensive coverage of edit types, while maintaining precise content preservation and instruction faithfulness through MLLM-based quality scoring and careful curation. Beyond single-turn editing, Pico-Banana-400K enables research into complex editing scenarios. The dataset includes three specialized subsets: (1) a 72K-example multi-turn collection for studying sequential editing, reasoning, and planning across consecutive modifications; (2) [...]; and (3) paired long and short editing instructions for developing instruction rewriting and summarization capabilities. By providing this large-scale, high-quality, and task-rich resource, Pico-Banana-400K establishes a robust foundation for training and benchmarking the next generation of text-guided image editing models.
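
To make the curation step described above more concrete, here is a minimal Python sketch of score-based filtering. It is not the authors' pipeline: the `EditCandidate` record, the `curate` function, the `toy_judge` stand-in for an MLLM scorer, and the 0.7 threshold are all hypothetical names and values chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class EditCandidate:
    """One (source image, instruction, edited image) triple. Hypothetical schema."""
    source_image: str   # identifier or path of the original image
    instruction: str    # text editing instruction
    edited_image: str   # identifier or path of the generated edit
    edit_type: str      # label from some editing taxonomy (assumed field)


def curate(
    candidates: Iterable[EditCandidate],
    judge: Callable[[EditCandidate], float],
    threshold: float = 0.7,  # assumed cut-off, not taken from the paper
) -> Tuple[List[EditCandidate], List[EditCandidate]]:
    """Split candidates into kept and rejected lists using a judge score in [0, 1]."""
    kept: List[EditCandidate] = []
    rejected: List[EditCandidate] = []
    for candidate in candidates:
        score = judge(candidate)
        if score >= threshold:
            kept.append(candidate)
        else:
            rejected.append(candidate)
    return kept, rejected


# Stand-in for an MLLM-based scorer: a fixed lookup keyed by edited image.
TOY_SCORES = {"a_edit.png": 0.92, "b_edit.png": 0.31}


def toy_judge(candidate: EditCandidate) -> float:
    """Return a canned score; a real judge would inspect both images and the text."""
    return TOY_SCORES.get(candidate.edited_image, 0.0)


candidates = [
    EditCandidate("a.png", "make the sky overcast", "a_edit.png", "global"),
    EditCandidate("b.png", "remove the bicycle", "b_edit.png", "object"),
]
kept, rejected = curate(candidates, toy_judge)
print(len(kept), "kept;", len(rejected), "rejected")
```

Returning the rejected candidates alongside the kept ones, rather than discarding them, is a design choice of this sketch: it keeps failed edits available for later inspection.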
