REALEDIT: Reddit Edits As a Large-scale Empirical Dataset for Image Transformations

Peter Sushko$^{\dagger,\psi}$, Ayana Bharadwaj$^{\dagger}$, Zhi Yang Lim$^{\dagger}$, Vasily Ilin$^{\dagger}$, Ben Caffee$^{\dagger}$, Dongping Chen$^{\dagger}$, Mohammadreza Salehi$^{\dagger}$, Cheng-Yu Hsieh$^{\dagger}$, Ranjay Krishna$^{\dagger,\psi}$
$^{\dagger}$University of Washington $^{\psi}$Allen Institute for AI
REALEDIT teaser image

REALEDIT: A large-scale dataset of real-world image editing requests and their human-made edits.

Abstract

Existing image editing models struggle to meet real-world demands: despite excelling on academic benchmarks, they have yet to be adopted to solve real user needs. The datasets that power these models rely on artificial edits and lack the scale and ecological validity needed to address the true diversity of user requests. In response, we introduce REALEDIT, a large-scale image editing dataset of authentic user requests and human-made edits sourced from Reddit.

REALEDIT contains a test set of 9.3K examples that the community can use to evaluate models on real user requests. Our results show that existing models fall short on these tasks, pointing to a need for realistic training data. To address this, we introduce 48K training examples, on which we train our REALEDIT model. The model achieves substantial gains, outperforming competitors by up to 165 Elo points in human judgment and achieving a 92% relative improvement on the automated VIEScore metric on our test set.

We deploy our model back on Reddit to test it on new requests and receive positive feedback. Beyond image editing, we explore REALEDIT's potential for detecting edited images by partnering with a deepfake detection non-profit. Finetuning their model on REALEDIT data improves its F1 score by 14 percentage points, underscoring the dataset's value for broad, impactful applications.

Motivation

Despite significant advances, image editing models often fail to address real-world user needs. Existing datasets rely primarily on artificial edits rather than authentic user interactions, so they lack the scale and ecological validity needed to capture the true diversity of user requests.

Comparison of baseline and our model on photograph restoration

Dataset

We introduce REALEDIT, a large-scale dataset comprising authentic image editing requests and their corresponding human-made edits, sourced from two prominent Reddit communities: r/estoration and r/PhotoshopRequest. This dataset captures real-world editing scenarios and user preferences.
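To make the data format concrete, below is a minimal sketch of how a single REALEDIT-style example might be represented and loaded in Python. The field names (subreddit, instruction, input_image, edited_image) are illustrative assumptions rather than the dataset's actual schema; please consult the released data for the exact format.

from PIL import Image

# Hypothetical layout of one REALEDIT-style record.
# Field names are illustrative assumptions, not the official schema.
record = {
    "subreddit": "PhotoshopRequest",                    # source community
    "instruction": "Please remove the person in the background.",
    "input_image": "images/request_0001.jpg",           # original photo posted by the user
    "edited_image": "images/edit_0001.jpg",             # human-made edit from a reply
}

def load_example(rec):
    """Load one (input image, instruction, edited image) triple."""
    source = Image.open(rec["input_image"]).convert("RGB")
    target = Image.open(rec["edited_image"]).convert("RGB")
    return source, rec["instruction"], target

source, instruction, target = load_example(record)
print(instruction, source.size, target.size)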

Overview of the REALEDIT dataset

Results

We train our REALEDIT model on the collected dataset and evaluate it against state-of-the-art baselines using a comprehensive set of metrics.

Our model achieves substantial improvements across multiple evaluation metrics, including a 92% relative improvement on VIEScore and a lead of up to 165 Elo points over competing models in human judgment.
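For context on the Elo numbers, the following is a minimal sketch of a standard pairwise Elo update, as commonly used to rank models from head-to-head human preference judgments. It assumes a K-factor of 32 and omits ties; the paper's exact rating protocol may differ.

def expected_score(rating_a, rating_b):
    """Probability that model A is preferred over model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

def update_elo(rating_a, rating_b, a_wins, k=32.0):
    """Update both ratings after one human preference judgment."""
    e_a = expected_score(rating_a, rating_b)
    s_a = 1.0 if a_wins else 0.0
    new_a = rating_a + k * (s_a - e_a)
    new_b = rating_b + k * ((1.0 - s_a) - (1.0 - e_a))
    return new_a, new_b

# Example: both models start at 1000 and one judgment prefers model A.
r_a, r_b = update_elo(1000.0, 1000.0, a_wins=True)
print(round(r_a, 1), round(r_b, 1))  # 1016.0 984.0

In practice, ratings are aggregated over many pairwise judgments across prompts and annotators; under this standard formulation, a gap of 165 points corresponds to an expected head-to-head win rate of roughly 72%.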

Quantitative evaluation results table

Qualitative Results

Our model demonstrates robust performance across diverse real-world editing scenarios, from photo restoration to complex object manipulation. The results showcase the effectiveness of training on authentic user requests and human-made edits.

Example 1 of our model's image editing capabilities
Example 2 of our model's image editing capabilities

We offer a form for individuals to request data removal from the dataset here.

BibTeX

@article{sushko2024realedit,
  title     = {REALEDIT: Reddit Edits As a Large-scale Empirical Dataset for Image Transformations},
  author    = {Sushko, Peter and Bharadwaj, Ayana and Lim, Zhi Yang and Ilin, Vasily and Caffee, Ben and Chen, Dongping and Salehi, Mohammadreza and Hsieh, Cheng-Yu and Krishna, Ranjay},
  journal   = {arXiv preprint},
  year      = {2024},
}

Acknowledgments

We thank Galen Weld, Matthew Wallingford, Vivek Ramanujan, and Benlin Liu for their time, guidance, and mentorship. We additionally thank the UW RAIVN lab for support throughout the project.

Website adapted from the following template.