© 2012

Multilingual Information Retrieval

From Research To Practice


Table of contents

  1. Front Matter
    Pages i-xvii
  2. Carol Peters, Martin Braschler, Paul Clough
    Pages 1-16
  3. Carol Peters, Martin Braschler, Paul Clough
    Pages 17-55
  4. Carol Peters, Martin Braschler, Paul Clough
    Pages 57-84
  5. Carol Peters, Martin Braschler, Paul Clough
    Pages 85-128
  6. Carol Peters, Martin Braschler, Paul Clough
    Pages 129-169
  7. Carol Peters, Martin Braschler, Paul Clough
    Pages 171-207
  8. Back Matter
    Pages 209-217

About this book


We live in a multilingual world, and the diversity of languages used to interact with information access systems has generated a wide variety of challenges for computer and information scientists. The growing amount of non-English information accessible globally and the increased worldwide exposure of enterprises also necessitate the adaptation of Information Retrieval (IR) methods to new, multilingual settings.

Peters, Braschler and Clough present a comprehensive description of the technologies involved in designing and developing systems for Multilingual Information Retrieval (MLIR). They provide readers with broad coverage of the various issues involved in creating systems that make digitally stored materials accessible regardless of the language(s) in which they are written. They also cover Cross-Language Information Retrieval (CLIR) in detail, helping readers understand how to develop retrieval systems that cross language boundaries. Their work is divided into six chapters and accompanies the reader step by step through the various stages involved in building, using and evaluating MLIR systems. The book concludes with some examples of recent applications that utilise MLIR technologies. Some of the techniques described have recently started to appear in commercial search systems, while others have the potential to be part of future incarnations of such systems.

The book is intended for graduate students, scholars, and practitioners with a basic understanding of classical text retrieval methods. It offers guidelines and information on all aspects that need to be taken into consideration when building MLIR systems, while avoiding too many ‘hands-on details’ that could rapidly become obsolete. Thus it bridges the gap between the material covered by most of the classical IR textbooks and the novel requirements related to the acquisition and dissemination of information in whatever language it is stored.


Cross-Language Information Retrieval, Digital Libraries, HCI, Human-Computer Interaction, Multilingual Information Retrieval, NLP, Natural Language Processing

Authors and affiliations

  1. Istituto di Scienza e Tecnologie, Consiglio Nazionale delle Ricerche, Pisa, Italy
  2. Institute of Applied …, Zurich University of Applied Sciences, Winterthur, Switzerland
  3. Dept. Information Studies, University of Sheffield, Sheffield, United Kingdom

About the authors

Carol Peters is a researcher at the Italian National Research Council’s “Istituto di Scienza e Tecnologie dell'Informazione.” Her main research activities concern the development of multilingual access mechanisms for digital libraries and evaluation methodologies for cross-language information retrieval systems. She was leader of the EU Sixth Framework project MultiMatch and coordinated the Cross-Language Evaluation Forum (CLEF) during its first ten years of activity. In 2009, in recognition of her work for CLEF, she was awarded the Tony Kent Strix Award.


Martin Braschler is a lecturer at the Zurich University of Applied Sciences in Winterthur. His main research interests are in the fields of information retrieval evaluation, cross-language information retrieval, and natural language processing. He served as the technical coordinator of the cross-language track at the American TREC series of evaluation campaigns from 1997 to 1999, and was technical coordinator of the CLEF evaluation campaigns in Europe from their start in 2000 until 2004. Having served until 2004 as head of research and innovation at Eurospider Information Technology AG, Zurich, Switzerland, a vendor of information retrieval solutions, he has been actively involved in the transfer of state-of-the-art information retrieval technology to the commercial marketplace.


Paul Clough is a senior lecturer in the Information School at the University of Sheffield. He has worked on both technical and user-oriented aspects of information retrieval in areas that include Cross-Language IR (CLIR), Geographic IR (GIR), image retrieval and personalisation. An important area of his work has been the evaluation of IR systems: he co-founded and helped co-ordinate the ImageCLEF evaluation campaign from 2003 to 2010, and is currently involved in organising a TREC task on evaluating query sessions (the Session Track). He has been a Principal Investigator for Sheffield in four EU-funded projects (MultiMatch, Memoir, TrebleCLEF and PATHS), an AHRC-funded studentship on recommender systems, and a project funded by the UK National Archives on improving information access.



From the reviews:

“In this book the Reader may find a lot of information from experience of running such large evaluation campaigns. … The book contents comprise six chapters that follow a conference paper structure. … I recommend it to academia as a resource providing background knowledge in multilingual information retrieval.” (Jolanta Mizera-Pietraszko, Informer, November, 2012)

“A valuable and comprehensive handbook. The book is containing pointers to many useful resources” (Prasenjit Majumder, DAIICT, India)