Active Learning

This toolbox facilitates the application of active learning to multimedia data. Active learning is a machine learning variant that aims to overcome the difficulty of gathering a training corpus, which is costly because it requires manual annotation. To minimize the labeling effort, active learning trains an initial classifier with a very small set of labeled examples and then expands the training set by selectively sampling new examples from a much larger set of unlabeled examples (also known as the pool of candidates). These examples are selected based on their informativeness, i.e. how much they are expected to improve the classifier's performance. They are found in the uncertainty areas of the classifier and, in a typical case, are annotated upon request by an errorless oracle.
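
The loop described above can be sketched in a few lines. This is a toy illustration in Python, not the toolbox's MATLAB code: the 1-D threshold classifier, the function names, and the oracle's boundary of 0.62 are all assumptions made for the example, whereas the toolbox itself relies on LIBSVM and CNN features.

```python
def train_threshold(labeled):
    """Fit a toy 1-D classifier: the threshold halfway between the class means."""
    pos = [x for x, y in labeled if y == 1]
    neg = [x for x, y in labeled if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0


def most_uncertain(pool, threshold):
    """Pick the pool example closest to the decision boundary."""
    return min(pool, key=lambda x: abs(x - threshold))


def oracle(x):
    """Errorless oracle; the true boundary 0.62 is an assumption of this example."""
    return 1 if x >= 0.62 else 0


def active_learning(initial, pool, rounds):
    """Grow the labeled set by querying the oracle for the most uncertain example."""
    labeled = list(initial)
    pool = list(pool)
    for _ in range(rounds):
        if not pool:
            break
        threshold = train_threshold(labeled)
        x = most_uncertain(pool, threshold)
        pool.remove(x)
        labeled.append((x, oracle(x)))
    return train_threshold(labeled), labeled


# Two labeled seeds and a pool of 19 unlabeled points between them.
initial = [(0.0, 0), (1.0, 1)]
pool = [i / 20 for i in range(1, 20)]
threshold, labeled = active_learning(initial, pool, rounds=5)
```

With only five queries, all of them near the boundary, the learned threshold ends up much closer to the oracle's boundary than the initial midpoint of 0.5.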

In the case of image classification, Flickr offers an abundant set of user-tagged images which, if used as the pool of candidates, can automate the process of active learning. In this direction, we propose a method, SALIC (for details see the paper cited under Publication), that uses tagged images as the pool of candidates and selects the images that are both maximally informative and accurately predicted for their content based on their tags. This toolbox implements SALIC, in addition to the typical active learning method and a tag-based oracle.
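
The idea of selecting images that are both informative and reliably labeled by their tags can be sketched as follows. This is a toy Python illustration, not the SALIC formulation: the exponential informativeness mapping, the 1.0/0.5/0.0 tag confidence values, the product used to combine the two scores, and all names are assumptions made for the example. The paper defines the actual scoring.

```python
import math


def informativeness(decision_value):
    """Map a signed classifier score to (0, 1]; 1 means on the decision boundary."""
    return math.exp(-abs(decision_value))


def tag_confidence(tags, concept, related):
    """Hypothetical tag oracle: 1.0 for an exact tag match, 0.5 for a related tag."""
    if concept in tags:
        return 1.0
    if any(tag in related for tag in tags):
        return 0.5
    return 0.0


def rank_candidates(candidates, concept, related, top_k):
    """Rank pool images by informativeness times tag confidence (assumed rule)."""
    scored = []
    for cand in candidates:
        score = informativeness(cand["score"]) * tag_confidence(
            cand["tags"], concept, related
        )
        scored.append((score, cand["id"]))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Images whose tags give no evidence for the concept are never selected.
    return [cand_id for score, cand_id in scored[:top_k] if score > 0]


candidates = [
    {"id": "a", "score": 0.1, "tags": ["dog", "park"]},   # uncertain, tagged
    {"id": "b", "score": 0.05, "tags": ["sunset"]},       # uncertain, no evidence
    {"id": "c", "score": 2.0, "tags": ["dog"]},           # tagged, but confident
    {"id": "d", "score": -0.3, "tags": ["puppy"]},        # uncertain, related tag
]
selected = rank_candidates(candidates, "dog", related={"puppy"}, top_k=2)
```

Image "b" lies closest to the decision boundary but is skipped because its tags say nothing about the concept, while "c" carries the right tag but is too far from the boundary to be informative.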

 

Usage:

Output:

 

Installation:
  1. Clone the repository
  2. Download and compile LIBSVM for your architecture from https://www.csie.ntu.edu.tw/~cjlin/libsvm/
  3. Download and compile the ConvNet Feature Computation Package from http://www.robots.ox.ac.uk/~vgg/software/deep_eval/
  4. In Wrapper.m, change the paths so that they point to the folders containing the datasets, create the required files (img_Files.mat and tag_files.mat for each dataset), and run Wrapper.
 
Requirements:
  1. Only Linux is supported, because the ConvNet Feature Computation Package is not compatible with Windows. If a different CNN feature extraction library that runs on Windows is used, the code should run on Windows as well (not tested).
  2. For MIRFLICKR (1M images as the pool dataset), at least 64 GB of RAM is required.
  3. The code was tested using MATLAB R2012a.
 
Publication
 
If you use this code, please cite the following paper:
 
E. Chatzilari, S. Nikolopoulos, Y. Kompatsiaris and J. Kittler, "SALIC: Social Active Learning for Image Classification," in IEEE Transactions on Multimedia, vol. 18, no. 8, pp. 1488-1503, Aug. 2016.
doi: 10.1109/TMM.2016.2565440
 
URL: https://doi.org/10.1109/TMM.2016.2565440
 

License

Copyright 2016 Elisavet Chatzilari
 
   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at
 
       http://www.apache.org/licenses/LICENSE-2.0
 
   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
 
 
Acknowledgements
 
This work was supported by the European Community's Seventh Framework Programme (FP7) under Grant FP7-ICT-600676 “i-Treasures: Intangible Treasures—Capturing the Intangible Cultural Heritage and Learning the Rare Know-How of Living Human Treasures” and Grant FP7-601138 “PERICLES Digital Preservation.”
 
Contacts
 
You may contact Elisavet Chatzilari by e-mail at ehatzi@iti.gr with any questions or remarks about this tool.