A Large-scale Dataset for Sketch-based Image Retrieval Evaluation
Introduction
Our dataset consists of three parts: a user sketch set, an object image set and a distraction image set.
(1) To obtain sketches, we asked five subjects to draw freely, with no restriction on subject matter. The subjects reviewed each other’s sketches, and we removed those that could not be recognized, which left about 120 sketches. From these we manually selected the 80 sketches that were suitable for sketch-based image retrieval (SBIR).
(2) To obtain object images, we used the object category of each sketch as the search keyword on Flickr and Google. For each sketch we collected 20 photo-realistic images, giving 80×20=1600 object images in total. To make the dataset challenging, we selected many images in which the object contours are mixed with noisy edges.
(3) To obtain distraction images, we defined 100 keywords that were irrelevant to the selected object categories. We then searched Flickr with these keywords and downloaded three million images. Images that contained any of the selected objects were removed.
Examples of User Sketches
Usage
Mix the object image set and the distraction image set into a single search pool, then use each sketch in the sketch set as a query against the mixed pool.
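The protocol above can be sketched in a few lines of Python. This is only an illustrative sketch, not code shipped with the dataset. The names `build_search_pool`, `precision_at_k`, `evaluate` and `rank_fn` are hypothetical, and precision at k is an assumed metric, since this page does not prescribe one. The default k=20 matches the 20 object images collected per sketch.

```python
from typing import Callable, Dict, List, Sequence, Set

# Hypothetical signature of the retrieval method under test:
# given a sketch id and the mixed pool, return the pool sorted best-first.
RankFn = Callable[[str, Sequence[str]], List[str]]


def build_search_pool(object_images: Dict[str, List[str]],
                      distractors: Sequence[str]) -> List[str]:
    """Mix the object images (grouped by sketch id) with the distraction images."""
    pool = [img for imgs in object_images.values() for img in imgs]
    pool.extend(distractors)
    return pool


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k results that are object images of the query sketch."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(1 for img in ranked[:k] if img in relevant) / k


def evaluate(object_images: Dict[str, List[str]],
             distractors: Sequence[str],
             rank_fn: RankFn,
             k: int = 20) -> Dict[str, float]:
    """Query the mixed pool with every sketch and score each query.

    Precision at k is an assumed metric, chosen here for illustration only.
    """
    pool = build_search_pool(object_images, distractors)
    scores = {}
    for sketch_id, relevant in object_images.items():
        ranked = rank_fn(sketch_id, pool)
        scores[sketch_id] = precision_at_k(ranked, set(relevant), k)
    return scores
```

To use it, supply a `rank_fn` that wraps the retrieval method being evaluated and pass the image identifiers of the two sets. Averaging the returned per-sketch scores gives a single summary number for a method.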
Downloads
Flickr_3M