Multilingual summarization

Automatic text summarization is a crucial complement to Information Retrieval (IR): it lets users quickly grasp potentially large amounts of retrieved material and thus assess its relevance to their needs. Summarization is especially valuable when the material stems from multilingual sources.

The lecture aims to give an overview of the state of the art in summarization, with a special focus on multilingual techniques. In the first part, we will introduce the traditional distinction between extractive and abstractive summarization and present modern approaches in both paradigms in some depth. Both single-document and multi-document summarization will be considered. The second part will address multilingual summarization. First, we will elaborate on the state-of-the-art language-independent techniques for single-document and multi-document summarization. Then, we will discuss how summaries in the user's preferred language can be obtained from multilingual material. The third part will present the evaluation measures used to assess the quality of summarization techniques. To conclude, we will discuss how summaries can be exploited within IR itself.

Lecturer: Leo Wanner

Prof. Arjen P. de Vries

