Multimedia Knowledge and Social Media Analytics Laboratory

Social Event Detection 2012 (SED 2012) dataset

Overview

This page makes available, for research purposes, the dataset, the challenge definitions, the ground truth results for the challenges, and the corresponding evaluation script that were created and used in the 2012 edition of the Social Event Detection (SED) task of the MediaEval international benchmarking activity.

The Social Event Detection (SED) task of MediaEval 2012 requires participants to discover social events and detect related media items in a collection of images that are accompanied by metadata typically found on the social web (including time-stamps, tags and, for a small subset of the images, geotags). By social events, we mean events that are planned by people, attended by people, and illustrated by media captured by people. Finding the events, in this task, means finding a set of photo clusters, each cluster comprising only photos associated with a single event (thus, each cluster defines a retrieved event).

For more information on the SED 2012 dataset, challenges and evaluation, please see the following publication:

S. Papadopoulos, E. Schinas, V. Mezaris, R. Troncy, I. Kompatsiaris, "Social Event Detection at MediaEval 2012: Challenges, Dataset and Evaluation", Proc. MediaEval 2012 Workshop, Pisa, Italy, October 2012.

Download links:

The data being released through this page include:

1) The SED 2012 test kit (which includes the definitions of the three SED 2012 challenges and the XML file with the image metadata that could be used for addressing these challenges).

- sed2012_test_kit.zip (~15MB)

2) The images of the collection (167,332 images that were captured between the beginning of 2009 and the end of 2011 by 4,422 unique Flickr users, and were posted to Flickr by their respective owners under a Creative Commons license). The images are made available in the form of 4 compressed image archives, plus an image license file, as follows:

- sed2012_photos_part1.tar.gz (~1.2GB)

- sed2012_photos_part2.tar.gz (~1.6GB)

- sed2012_photos_part3.tar.gz (~2.2GB)

- sed2012_photos_part4.tar.gz (~0.8GB)

- sed2012_photos_license.zip (~4MB)
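
As a convenience, the four image archives and the license file can be unpacked in one step once they have been downloaded. The following is only a minimal sketch: the function name extract_photos and the two directory arguments are hypothetical, and nothing is assumed about the folder layout inside the archives, which is simply extracted as-is.

```python
import tarfile
import zipfile
from pathlib import Path

# Archive names as listed on this page.
PHOTO_ARCHIVES = [f"sed2012_photos_part{i}.tar.gz" for i in range(1, 5)]
LICENSE_ARCHIVE = "sed2012_photos_license.zip"


def extract_photos(download_dir, target_dir):
    """Extract every archive found in download_dir into target_dir.

    Returns the list of expected archive names that were not found,
    so that an incomplete download is easy to spot.
    (Hypothetical helper; not part of the SED 2012 kits.)
    """
    download_dir = Path(download_dir)
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    missing = []

    for name in PHOTO_ARCHIVES:
        archive = download_dir / name
        if not archive.is_file():
            missing.append(name)
            continue
        with tarfile.open(archive, "r:gz") as tar:
            # Use the safer "data" extraction filter where Python provides it.
            if hasattr(tarfile, "data_filter"):
                tar.extractall(target_dir, filter="data")
            else:
                tar.extractall(target_dir)

    license_zip = download_dir / LICENSE_ARCHIVE
    if license_zip.is_file():
        with zipfile.ZipFile(license_zip) as zf:
            zf.extractall(target_dir)
    else:
        missing.append(LICENSE_ARCHIVE)

    return missing
```

For example, extract_photos("downloads", "sed2012_photos") would unpack whatever archives are present in a local "downloads" folder (an assumed location) and return the names of any that are still missing.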

3) The ground truth results for the three defined challenges (queries) on the provided dataset, together with a script for evaluating any social event detection results against this ground truth.

- sed2012_evaluation_kit.zip (~1MB)

Note that, for the evaluation script to work, your submission file must be named in the format "sed_Cx_group_Ny.csv", where:

  • sed is a fixed prefix
  • Cx refers to the challenge number (x can be 1, 2 or 3)
  • group is the name of your group/lab (can be any alphanumeric string)
  • Ny refers to the submission number (for the challenge participation, y was 1 to 5)
  • csv is the mandatory extension
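
The naming rule above can be checked before running the evaluation script. The sketch below is an illustration only and is not part of the evaluation kit: the helper names are hypothetical, and it assumes that "C" and "N" are literal letters followed by a number (for example sed_C1_mylab_N1.csv) and that any positive submission number is acceptable outside the original challenge.

```python
import re
from typing import NamedTuple, Optional

# Assumption: "C" and "N" are literal, each followed by digits.
# The group name is restricted to ASCII letters and digits.
_NAME_PATTERN = re.compile(r"sed_C([123])_([A-Za-z0-9]+)_N([1-9][0-9]*)\.csv")


class SubmissionName(NamedTuple):
    challenge: int  # 1, 2 or 3
    group: str      # alphanumeric group/lab name
    run: int        # submission number


def parse_submission_name(filename: str) -> Optional[SubmissionName]:
    """Return the parsed parts, or None if the name does not fit the format."""
    match = _NAME_PATTERN.fullmatch(filename)
    if match is None:
        return None
    return SubmissionName(int(match.group(1)), match.group(2), int(match.group(3)))


def build_submission_name(challenge: int, group: str, run: int) -> str:
    """Build a file name and verify that it fits the expected format."""
    name = f"sed_C{challenge}_{group}_N{run}.csv"
    if parse_submission_name(name) is None:
        raise ValueError(f"invalid submission name: {name!r}")
    return name
```

For instance, build_submission_name(1, "mylab", 1) yields "sed_C1_mylab_N1.csv", while a group name containing an underscore or a challenge number outside 1 to 3 is rejected.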

 

Copyright notice:

The images distributed as part of the Social Event Detection 2012 (SED 2012) dataset were collected from Flickr, where they were posted by their respective owners under a Creative Commons license. The Creative Commons attribution licenses allow for image use as long as the photographer is credited for the original creation. Some licenses may grant use under additional restrictions, but none of these precludes the use of the images for benchmarking purposes.

While compiling the Social Event Detection 2012 (SED 2012) dataset, we collected only Creative Commons images, and also collected as much information as possible about the creators of each image. The creator information, the exact license type and other relevant information are included in the image license file, which is distributed together with the images.

We would like to take this opportunity to express our gratitude to the image photographers for allowing us to use their pictures: we greatly appreciate this and gladly acknowledge your work. Your names and license details are listed in the image license file. Please let us know if you have special wishes on how you would like to be credited or have additional details that must be incorporated.

Related publications:

If you use the Social Event Detection 2012 (SED 2012) dataset in your research work, please cite the following paper:

S. Papadopoulos, E. Schinas, V. Mezaris, R. Troncy, I. Kompatsiaris, "Social Event Detection at MediaEval 2012: Challenges, Dataset and Evaluation", Proc. MediaEval 2012 Workshop, Pisa, Italy, October 2012.