Multimedia Knowledge and Social Media Analytics Laboratory

EU Elections 2014 Prediction dataset

Summary
This is the dataset used in (Tsakalidis et al., 2015) to predict election results from Twitter and poll data. The dataset contains the ids of the tweets and the poll data used to build the prediction models. You can download the dataset here.
 
Citation
If you use this dataset in your research, please cite the following article:
A. Tsakalidis, S. Papadopoulos, A.I. Cristea, I. Kompatsiaris. "Predicting Elections for Multiple Countries Using Twitter and Polls". IEEE Intelligent Systems (to appear in 2015)
 
Description
The zip file has the following structure:
 
Folder tweet_ids:
The "keywords.txt" file contains the keywords used as input to the Twitter Streaming API to perform the collection. The three csv files contain the ids of the collected tweets.
 
Folder features_polls:
The "polls.ods" file provides the information about the polls that we used during our processing (one sheet per country).
 
There is one arff file per country, containing the features that we extracted for that country on a daily basis. The attribute holding the poll-based value of a party is named after the party. The Twitter-based features are named in the form "feature_i", where "feature" is the name of the Twitter-based feature and "i" is an index identifying the party.
The mapping of parties to this index "i" is the following:
 
Germany ("de.arff") 
1: CDU/CSU
2: SPD 
3: Linke 
4: Grunen 
5: FDP 
6: AfD
 
The Netherlands ("nl.arff")  
1: PVV 
2: VVD 
3: D66 
4: CDA
5: PvdA
6: SP 
7: CU
8: SGP (Note that since "CU/SGP" was a coalition, the features corresponding to both indices, 7 and 8, were used as input for the prediction of the "CU/SGP" voting share.)
9: GL
10: 50PLUS
11: PvdD
 
Greece ("gr.arff")
1: ND 
2: SYRIZA
3: Potami
4: XA 
5: Elia 
6: KKE 
7: ANEL 
8: DIMAR
 
The "lefko" attribute in "gr.arff" corresponds to the "blank voters" of the polls and is present only for Greece. It was not used in any experiment (neither pre- nor post-electoral).
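
The "feature_i" naming scheme can be handled programmatically. The following is a minimal Python sketch, not part of the dataset, that splits an attribute name into its feature name and party index, and selects all Twitter-based attributes for a set of indices (for example 7 and 8 for the "CU/SGP" coalition). The attribute names in the example ("mentions_2", "pos_sentiment_8", and so on) are hypothetical and used only for illustration; the actual feature names should be read from the arff files. The sketch assumes the index is the part after the last underscore.

```python
import re

# Index-to-party mapping for Germany ("de.arff"), copied from the list above.
DE_PARTIES = {1: "CDU/CSU", 2: "SPD", 3: "Linke", 4: "Grunen", 5: "FDP", 6: "AfD"}

# Assumption: a Twitter-based attribute ends in "_<digits>".
_INDEXED = re.compile(r"^(?P<feature>.+)_(?P<index>\d+)$")


def split_attribute(name):
    """Return (feature, index) for names like 'feature_3', else None."""
    match = _INDEXED.match(name)
    if match is None:
        return None  # poll-based attribute (party name) or e.g. "lefko"
    return match.group("feature"), int(match.group("index"))


def attributes_for(names, indices):
    """Keep the Twitter-based attribute names whose index is in `indices`."""
    wanted = set(indices)
    selected = []
    for name in names:
        parsed = split_attribute(name)
        if parsed is not None and parsed[1] in wanted:
            selected.append(name)
    return selected


# Hypothetical attribute names, for illustration only.
example_names = ["SPD", "lefko", "mentions_2", "mentions_7",
                 "mentions_8", "pos_sentiment_8"]

feature, index = split_attribute("mentions_2")
print(feature, index, DE_PARTIES[index])      # mentions 2 SPD
print(attributes_for(example_names, [7, 8]))  # attributes for indices 7 and 8
```

In practice, the list of attribute names would come from the arff header; one option is the metadata object returned by scipy.io.arff.loadarff, though any ARFF reader should work.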