  • Textbook
  • © 2013

Web Information Retrieval

  • Offers a unique combination of both traditional and Web-specific techniques of information retrieval
  • Includes novel applications like multi-domain search, semantic search, and crowd search
  • Classroom use is facilitated by a supplemental slide set
  • Includes supplementary material: sn.pub/extras

Part of the book series: Data-Centric Systems and Applications (DCSA)

Buying options

eBook USD 54.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book USD 69.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info
Hardcover Book USD 79.99
Price excludes VAT (USA)
  • Durable hardcover edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Table of contents (15 chapters)

  1. Front Matter (pages I-XIV)
  2. Principles of Information Retrieval
    1. Front Matter (page 1)
    2. An Introduction to Information Retrieval (pages 3-11)
    3. The Information Retrieval Process (pages 13-26)
    4. Information Retrieval Models (pages 27-37)
    5. Classification and Clustering (pages 39-56)
    6. Natural Language Processing for Search (pages 57-68)
  3. Information Retrieval for the Web
    1. Front Matter (page 69)
    2. Search Engines (pages 71-90)
    3. Link Analysis (pages 91-110)
    4. Recommendation and Diversification for the Web (pages 111-120)
    5. Advertising in Search (pages 121-133)
  4. Advanced Aspects of Web Search
    1. Front Matter (page 135)
    2. Publishing Data on the Web (pages 137-159)
    3. Meta-search and Multi-domain Search (pages 161-179)
    4. Semantic Search (pages 181-206)
    5. Multimedia Search (pages 207-221)
    6. Search Process and Interfaces (pages 223-234)
    7. Human Computation and Crowdsearching (pages 235-257)
  5. Back Matter (pages 259-284)

All chapters are by Stefano Ceri, Alessandro Bozzon, Marco Brambilla, Emanuele Della Valle, Piero Fraternali, and Silvia Quarteroni.

About this book

With the proliferation of huge amounts of (heterogeneous) data on the Web, the importance of information retrieval (IR) has grown considerably over the last few years. Big players in the computer industry, such as Google, Microsoft and Yahoo!, are the primary contributors of technology for fast access to Web-based information; and searching capabilities are now integrated into most information systems, ranging from business management software and customer relationship systems to social networks and mobile phone applications.

Ceri and his co-authors aim to take their readers from the foundations of modern information retrieval to the most advanced challenges of Web IR. To this end, their book is divided into three parts. The first part addresses the principles of IR and provides a systematic and compact description of basic information retrieval techniques (including binary, vector space and probabilistic models as well as natural language processing for search) before focusing on their application to the Web. Part two addresses the foundational aspects of Web IR by discussing the general architecture of search engines (with a focus on the crawling and indexing processes), describing link analysis methods (specifically PageRank and HITS), addressing recommendation and diversification, and finally presenting advertising in search (the main source of revenue for search engines). The third and final part describes advanced aspects of Web search, with each chapter providing a self-contained, up-to-date survey of current Web research directions. Topics in this part include meta-search and multi-domain search, semantic search, search in the context of multimedia data, and crowd search.

The book is ideally suited to courses on information retrieval, as it covers all Web-independent foundational aspects. Its presentation is self-contained and does not require prior background knowledge. It can also be used in the context of classic courses on data management, allowing the instructor to cover both structured and unstructured data in various formats. Its classroom use is facilitated by a set of slides, which can be downloaded from www.search-computing.org.

Reviews

From the reviews:

“The book covers not only a wide range, but everything that is essential to the topic of Web information retrieval. … this book is an excellent starting point into the field of Web information retrieval, and can be recommended for classroom use.” (Gottfried Vossen, zbMATH, Vol. 1283, 2014)

“... this book is a valuable resource for students and instructors in web IR, primarily as a reference to supplement course teaching. Researchers and practitioners should find the book a useful quick reference guide for key concepts, techniques, and recent trends in web IR.” (Wingyan Chung, ACM Computing Reviews, July 2014)

Authors and Affiliations

  • Dipartimento di Elettronica e Informazione, Politecnico di Milano, Milan, Italy

    Stefano Ceri, Alessandro Bozzon, Marco Brambilla, Emanuele Della Valle, Piero Fraternali, Silvia Quarteroni

About the authors

Stefano Ceri is a professor of Database Systems at the Politecnico di Milano and the director of Alta Scuola Politecnica. He is the recipient of the 2013 SIGMOD Edgar F. Codd Innovation Award for a series of influential contributions to several areas of database management, including distributed databases, rule-based systems, web-based application design, and search computing.

Alessandro Bozzon is an assistant professor of Information Retrieval at the Delft University of Technology. His research is on information management on the Web, with a specific focus on information retrieval and on human and social computation.

Marco Brambilla is an assistant professor of Software Engineering at Politecnico di Milano and a shareholder at WebRatio. His research is on Web modeling tools and methods, spanning crowdsourcing, social networks, search engines, BPM, SOA and enterprise architectures.

Emanuele Della Valle is an assistant professor of Software Project Management at Politecnico di Milano. His research is on Intelligent Web Information Systems and includes Semantic Web, Search Engines, Data Stream Processing, Rank-aware Databases and Crowdsourcing.

Piero Fraternali is a professor of Web Technologies at Politecnico di Milano and a co-inventor of the Web Modeling Language, the basis of the WebRatio tool company and of the recent OMG Interaction Flow Modeling Language (IFML). His research focuses on Web development tools and on social and human computation.

Silvia Quarteroni is a senior consultant at Elca Informatique, Switzerland. She holds a PhD in Computer Science on question answering systems, and her main research interests concern statistical approaches to natural language processing.
