Dr. Carlos Castillo

Dr. Carlos Castillo is a web miner with a background in information retrieval, and has been influential in the areas of adversarial web search and web content quality and credibility. He is a prolific researcher with more than 60 publications in top-tier international conferences and journals, including a book on Information and Influence Propagation, a monograph on Adversarial Web Search, and 6000+ citations. His current research focuses on the application of web mining methods to problems in the domain of on-line news and humanitarian crises.
Carlos received his Ph.D. from the University of Chile (2004), and was a visiting scientist at Universitat Pompeu Fabra (2005) and Sapienza Università di Roma (2006) before working as a scientist and senior scientist at Yahoo! Research (2006-2012), and as a senior scientist and principal scientist at Qatar Foundation’s Computing Research Institute (2012-present). He has served on the PC or SPC of all major conferences in his area (WWW, WSDM, SIGIR, KDD, CIKM, etc.). He was Program Committee Co-chair of WSDM 2014, and co-organized the Adversarial Information Retrieval Workshop and Web Spam Challenge in 2007 and 2008, the ECML/PKDD Discovery Challenge in 2010 and 2014, the Web Quality Workshop from 2011 to 2014, and the Social Web for Disaster Management Workshop in 2015. He is an ACM Senior Member and an IEEE Senior Member.

Lecture: Social Media Mining and Retrieval

Text Classification, Sentiment Analysis & Opinion Mining

Lecturer: Dr. Fabrizio Sebastiani

Text Classification (TC) is a basic enabling technology in today’s IR, since many text-related prediction tasks can be framed in terms of classification. As a result, scores of applications (ranging from webpage/website classification under folksonomies to author identification for texts of uncertain authorship) have a TC engine under the hood. Modern text classification methods rely on supervised machine learning; according to this paradigm, a general-purpose learning algorithm learns the characteristics a text should have in order to be classified under class X, by analysing a set of texts which were previously classified as belonging or not belonging to X by a human. This tutorial will discuss the main steps towards the construction of a text classifier, from the generation of vectorial representations of the texts, to training a classifier from examples, to evaluating its accuracy on benchmark datasets.
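The three steps named above (vectorial representation, training from examples, evaluation) can be illustrated with a minimal sketch. It assumes a bag-of-words representation and a multinomial Naive Bayes learner with add-one smoothing, which is only one of many possible choices and not necessarily the one covered in the tutorial. The tiny TRAIN and TEST sets, the class labels, and all function names are made up for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def train(labeled_texts):
    """Build bag-of-words counts per class from (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> word frequencies
    class_counts = Counter()            # label -> number of training texts
    vocabulary = set()
    for text, label in labeled_texts:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        class_counts[label] += 1
        vocabulary.update(tokens)
    return word_counts, class_counts, vocabulary


def classify(text, model):
    """Return the class with the highest Naive Bayes log-probability."""
    word_counts, class_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)  # class prior
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocabulary:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocabulary))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, labeled_texts):
    """Fraction of held-out texts whose predicted class matches the label."""
    correct = sum(classify(text, model) == label for text, label in labeled_texts)
    return correct / len(labeled_texts)


# Hypothetical toy data: texts labeled by a human as "sports" or "politics".
TRAIN = [
    ("the match ended with a late goal", "sports"),
    ("the team won the cup final", "sports"),
    ("the striker scored a goal", "sports"),
    ("parliament passed the budget law", "politics"),
    ("the minister announced a new law", "politics"),
    ("voters elected a new parliament", "politics"),
]
TEST = [
    ("the team scored a late goal", "sports"),
    ("the minister defended the budget", "politics"),
]

model = train(TRAIN)
test_accuracy = accuracy(model, TEST)
print(f"accuracy on held-out texts: {test_accuracy:.2f}")
```

Evaluating on texts that were not used for training is the essential point: accuracy measured on the training set itself would overestimate how well the classifier generalizes.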

Until 15 years ago, text classification was almost synonymous with “classification by topic”, i.e., classifying textual documents according to what they are about. More recently, the classification of texts according to dimensions other than topic (e.g., by language, as in language identification; by author, as in authorship attribution) has also been investigated. The most important among these dimensions is certainly sentiment, as when classifying a product review according to whether it expresses a positive or a negative opinion towards the topic. Sentiment classification is an instance of a more general task called “opinion mining”, which encompasses all tasks having to do with the analysis of text according to the sentiments and opinions expressed therein. The key difference between classification by topic and classification by sentiment lies in the way vectorial representations of the texts are generated. This tutorial will explore these key differences by discussing the text representation techniques adopted in state-of-the-art sentiment classification systems, with particular emphasis on systems that tackle text arising within social media.
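As a rough illustration of how a sentiment-oriented representation can differ from a topical one, the sketch below marks words that fall within the scope of a negation, so that “good” and “not good” become distinct features. Negation marking is a commonly used heuristic chosen here as an assumption; the abstract does not say which techniques the tutorial covers. The negation list, the clause-boundary rule, and the function names are hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately small lists; real systems use richer resources.
NEGATIONS = {"not", "no", "never"}
CLAUSE_END = {".", ",", ";", "!", "?"}


def tokenize(text):
    """Split lowercase text into word tokens and punctuation tokens."""
    return re.findall(r"[a-z']+|[.,;!?]", text.lower())


def sentiment_features(text):
    """Bag of words in which negated words get a NOT_ prefix."""
    features = Counter()
    negated = False
    for token in tokenize(text):
        if token in CLAUSE_END:
            negated = False  # negation scope ends at the clause boundary
            continue
        if token in NEGATIONS or token.endswith("n't"):
            negated = True   # following words are inside the negation scope
            continue
        features["NOT_" + token if negated else token] += 1
    return features


features = sentiment_features("This phone is not good, but the screen is great.")
print(sorted(features.items()))
```

A plain topical bag of words would map “good” and “not good” to the same feature, which matters little for deciding what a text is about but can reverse the polarity of an opinion.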

Dr. Fabrizio Sebastiani

Fabrizio Sebastiani has been a Principal Scientist at the Qatar Computing Research Institute since July 2014; from March 2006 to June 2014 he was a Senior Researcher at Istituto di Scienza e Tecnologie dell’Informazione, Consiglio Nazionale delle Ricerche, Italy, from which he is currently on leave; before February 2006 he was an Associate Professor at the Department of Pure and Applied Mathematics of the University of Padova, Italy. His main current research interests are at the intersection of information retrieval, machine learning, and human language technologies, with particular emphasis on text classification, information extraction, opinion mining, and their applications. He is a Senior Associate Editor for ACM Transactions on Information Systems (ACM Press), an Associate Editor for AI Communications (IOS Press), and a member of the Editorial Boards of Information Retrieval (Kluwer) and Foundations and Trends in Information Retrieval (Now Publishers); of the latter he is also a Founding Editor and past co-Editor-in-Chief. He is also a past member of the Editorial Boards of the Journal of the American Society for Information Science and Technology (Wiley), Information Processing and Management (Elsevier), and ACM Computing Reviews (ACM Press). He is the Editor for Europe, Middle East, and Africa of Springer’s “Information Retrieval” book series. He was the General Chair of ECIR 2003 and SPIRE 2011, and a Program co-Chair of SIGIR 2008 and ECDL 2010; he is the appointed General co-Chair of SIGIR 2016. From 2003 to 2007 he was the Vice-Chair of ACM SIGIR. He has given several tutorials at international conferences (among which ECDL 1997, ECDL 1998, ER 1998, WWW 1999, ECDL 2000, COLING 2000, IJCAI 2001, ECDL 2001, ECIR 2014, EMNLP 2014, SAC 2015) and courses at summer schools (among which ESSLLI 2003, ESSIR 2005) on themes at the intersection of machine learning and information retrieval.

Lecture: Text Classification, Sentiment Analysis & Opinion Mining