Improving Diversity in Image Search via Supervised Relevance Scoring (Dataset)



Results returned by commercial image search engines should include relevant and diversified depictions of queries in order to ensure good coverage of users' information needs. While relevance has drastically improved in recent years, diversity is still an open problem. In this paper we propose a reranking method that could be implemented on top of such engines in order to provide a better balance between relevance and diversity. Our method formulates the reranking problem as an optimization of a utility function that jointly considers relevance and diversity. Our main contribution is the replacement of the unsupervised definition of relevance that is commonly used in this formulation with a supervised classification model that strives to capture a query- and application-specific notion of relevance. This model provides more accurate relevance scores that lead to significantly improved diversification performance. Furthermore, we propose a stacking-type ensemble learning approach that allows combining multiple features in a principled way when computing the relevance of an image. An empirical evaluation carried out on the datasets of the MediaEval 2013 and 2014 "Retrieving Diverse Social Images" (RDSI) benchmarks confirms the superior performance of the proposed method compared to other participating systems as well as to a state-of-the-art, unsupervised reranking method.
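
To make the idea of jointly optimizing relevance and diversity more concrete, the following is a minimal sketch of a greedy reranker. It is not the exact formulation used in the paper: the function names (`cosine`, `rerank`), the trade-off weight `w`, the use of cosine distance to the closest already selected image as the diversity term, and the greedy selection strategy are all assumptions made for illustration. The relevance scores are assumed to be given, for example as the output probabilities of a supervised classifier.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rerank(relevance, features, k, w=0.5):
    """Greedily pick k image indices, trading off relevance and diversity.

    relevance: one score per image (assumed to come from a classifier).
    features:  one feature vector per image.
    w:         hypothetical trade-off weight; w = 1.0 ignores diversity.
    """
    selected = []
    candidates = set(range(len(relevance)))

    def gain(i):
        # Diversity of a candidate is its distance to the most similar
        # image already selected; it is 1.0 while nothing is selected.
        if selected:
            diversity = min(1.0 - cosine(features[i], features[j])
                            for j in selected)
        else:
            diversity = 1.0
        return w * relevance[i] + (1.0 - w) * diversity

    while candidates and len(selected) < k:
        # Iterating over sorted candidates makes tie-breaking deterministic.
        best = max(sorted(candidates), key=gain)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy example: images 0 and 1 are near-duplicates, image 2 is different.
scores = [0.9, 0.85, 0.6]
vectors = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
top_two = rerank(scores, vectors, k=2, w=0.5)
```

In this toy example, the second pick is the less relevant but visually different image 2 rather than the near-duplicate image 1. Setting `w=1.0` reduces the procedure to a plain ranking by relevance score.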


We make available the version of the MediaEval 2014 Retrieving Diverse Social Images dataset that we used in our experiments. In particular, we provide the following data, separately for the development and the test set of the collection:

The data can be used to reproduce the experimental results presented in the paper, as well as to try out new methods based on the provided image features and relevance scores.


If you use the dataset in your research, please cite the following paper:

Eleftherios Spyromitros-Xioufis, Symeon Papadopoulos, Alexandru Lucian Ginsca, Adrian Popescu, Yiannis Kompatsiaris, and Ioannis Vlahavas. 2015. Improving Diversity in Image Search via Supervised Relevance Scoring. In Proceedings of the 5th ACM on International Conference on Multimedia Retrieval (ICMR '15). ACM, New York, NY, USA, 323-330. DOI:


If you have any questions about the dataset, you can contact: