MirFlickr Near-Duplicate Images Home Page

This page contains the data files deriving from the MirFlickr Near-Duplicate research performed at the University of Strathclyde, Scotland, and the Politecnico di Torino, Italy.

The origin of these files is fully explained in the following publications:

Publication | Authors | DOI
(1) Identification of MIR-Flickr Near-duplicate Images - A Benchmark Collection for Near-duplicate Detection | Richard Connor, Stewart MacKenzie-Leigh, Franco Alberto Cardillo and Robert Moss | 10.5220/0005359705650571
(2) Quantifying the Specificity of Near-duplicate Image Classification Functions | Richard Connor and Franco Alberto Cardillo | 10.5220/0005785406470654
(3) Benchmarking unsupervised near-duplicate image detection | Lia Morra and Fabrizio Lamberti | 10.1016/j.eswa.2019.05.002

Here are the data files:

Origin | Description | File link
Strathclyde | A list of clusters of identical images; that is, the same pixel values in the same locations. | File 1
Torino | A text file containing, on each line, a cluster of IND near-duplicate images. See Paper (3) for a detailed definition. | File 2
Torino | A text file containing, on each line, a cluster of NIND near-duplicate images. See Paper (3) for a detailed definition. | File 3
Strathclyde | A text file containing, on each line, a cluster of IND near-duplicate images, defined as those which appear to have been derived from a common precursor via manipulation. There are 1958 IND clusters resulting from 2320 judged pairs, comprising 4071 images. The mean cluster size is 2.08. The total number of implied IND pairs is 2407. The largest cluster has 14 images. IND is an equivalence relation, so these clusters form a partition of the data. | File 4
Strathclyde | A text file containing clustered NIND images. NIND is not an equivalence relation in general, but we have clustered as if it were, which seems to work. Feel free to disagree! There are 379 clusters; the largest has 59 images and the mean cluster size is 2.31. | File 5
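
The summary figures quoted for the cluster files (number of clusters, number of images, mean and largest cluster size, implied pairs) can be recomputed from any one-cluster-per-line file. The following is a minimal sketch, not the authors' code. It assumes that image identifiers on a line are separated by whitespace, which this page does not specify, and it uses made-up identifiers; the function names are hypothetical. A cluster of n images implies n(n-1)/2 pairs.

```python
def parse_clusters(text):
    """Return one list of image ids per non-empty line.

    Assumption: ids on a line are whitespace-separated. Adjust the
    split if the real files use commas or another delimiter.
    """
    clusters = []
    for line in text.splitlines():
        ids = line.split()
        if ids:
            clusters.append(ids)
    return clusters


def cluster_stats(clusters):
    """Compute the summary figures quoted for the cluster files."""
    sizes = [len(c) for c in clusters]
    return {
        "clusters": len(sizes),
        "images": sum(sizes),
        "mean_size": sum(sizes) / len(sizes) if sizes else 0.0,
        "largest": max(sizes, default=0),
        # Every unordered pair inside a cluster counts as an implied pair.
        "implied_pairs": sum(n * (n - 1) // 2 for n in sizes),
    }


# Toy input with invented ids, standing in for the contents of a cluster file.
sample = "101 102\n205 206 207\n300 301\n"
clusters = parse_clusters(sample)
stats = cluster_stats(clusters)
print(stats)
```

To run this against a downloaded file, read its contents into a string and pass that to `parse_clusters` in place of `sample`.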

(There are a small number of anomalies in these files, which we are about to fix.)
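
The Strathclyde cluster files are described as built from judged pairs, with the pairs treated as an equivalence relation. One common way to turn a list of pairs into such clusters is a union-find (disjoint-set) pass, sketched below. This is an illustration only: the page does not say which method the authors used, the function name `cluster_pairs` is hypothetical, and the image ids are invented.

```python
def cluster_pairs(pairs):
    """Group ids into clusters by taking the transitive closure of pairs."""
    parent = {}

    def find(x):
        # Register unseen ids as their own root, then walk up to the root,
        # shortening the path along the way (path halving).
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    groups = {}
    for x in parent:
        groups.setdefault(find(x), []).append(x)
    # Sort for a deterministic result.
    return sorted(sorted(group) for group in groups.values())


# Toy judged pairs with invented ids.
judged = [("a", "b"), ("b", "c"), ("d", "e")]
result = cluster_pairs(judged)
print(result)
```

Note that "a" and "c" end up in the same cluster although that pair was never judged directly. For IND, which is an equivalence relation, this is harmless; for NIND it is exactly the approximation the File 5 description warns about.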