A Large-scale Dataset for Sketch-based Image Retrieval Evaluation

Introduction

Our dataset consists of three parts: a user sketch set, an object image set and a distraction image set.
(1) To obtain sketches, we asked five subjects to draw sketches freely. We allowed them to check each other’s sketches and removed those that could not be recognized, leaving about 120 sketches in total. From these, we manually selected 80 sketches that were suitable for sketch-based image retrieval (SBIR).
(2) To obtain object images, we used the object category of each sketch as the keyword and searched Flickr and Google. For each sketch, 20 photo-realistic images were collected, giving 80 × 20 = 1600 object images in total. To make the dataset challenging, we selected a large number of images in which the object contours were mixed with noisy edges.
(3) To obtain distraction images, we defined 100 keywords that were irrelevant to the selected object categories. We then searched for these keywords and downloaded three million images from Flickr. Images that contained the selected objects were removed.

Examples of User Sketches

Usage

Mix the object image set and the distraction image set, then use each sketch in the sketch set as a query to search the mixed set.
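
The protocol above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function names (`build_gallery`, `precision_at_k`), the image and sketch identifiers, and the choice of precision at k as the score are all hypothetical and are not specified by the dataset. It also assumes that the images collected for a sketch are the relevant results for that sketch.

```python
from typing import Dict, List, Set, Tuple


def build_gallery(
    object_images: Dict[str, List[str]],
    distractors: List[str],
) -> Tuple[List[str], Dict[str, Set[str]]]:
    """Mix object images and distraction images into one search set.

    object_images maps a sketch id to the ids of the images collected
    for that sketch (assumed here to be its relevant results).
    """
    gallery = list(distractors)
    relevant: Dict[str, Set[str]] = {}
    for sketch_id, images in object_images.items():
        gallery.extend(images)
        relevant[sketch_id] = set(images)
    return gallery, relevant


def precision_at_k(ranked: List[str], relevant_ids: Set[str], k: int) -> float:
    """Fraction of the top-k retrieved images that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for image_id in ranked[:k] if image_id in relevant_ids)
    return hits / k


# Toy example with made-up ids; a real run would use 80 sketches,
# 1600 object images and the Flickr distraction images.
object_images = {"sketch_01": ["obj_a1", "obj_a2"], "sketch_02": ["obj_b1"]}
distractors = ["dis_1", "dis_2", "dis_3"]
gallery, relevant = build_gallery(object_images, distractors)

# A retrieval system under test would produce this ranking from the gallery.
ranking = ["obj_a1", "dis_1", "obj_a2", "dis_2"]
score = precision_at_k(ranking, relevant["sketch_01"], k=2)
```

Any ranking-based measure could replace `precision_at_k` here; the essential part is that the ranking is computed over the mixed set rather than over the object images alone.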

Downloads

Flickr_3M