© 2011

Web Data Mining

Exploring Hyperlinks, Contents, and Usage Data

  • Covers all key tasks and techniques of Web search and Web mining, i.e., structure mining, content mining, and usage mining

  • Includes major algorithms from data mining, machine learning, information retrieval and text processing, which are crucial for many Web mining tasks

  • Contains a rich blend of theory and practice, addressing seminal research ideas and also looking at the technology from a practical point of view

  • Second edition includes new/revised sections on supervised learning, opinion mining and sentiment analysis, recommender systems and collaborative filtering, and query log mining

  • Ideally suited for classes on data mining, Web mining, Web search, and knowledge discovery in databases

  • Provides internet support with lecture slides and project problems


Part of the Data-Centric Systems and Applications book series (DCSA)

Table of contents

  1. Front Matter
    Pages I-XX
  2. Bing Liu
    Pages 1-14
  3. Data Mining Foundations

    1. Front Matter
      Pages 15-15
    2. Bing Liu
      Pages 63-132
    3. Bing Liu
      Pages 133-169
    4. Bing Liu, Wee Sun Lee
      Pages 171-208
  4. Web Mining

    1. Front Matter
      Pages 209-209
    2. Bing Liu
      Pages 211-268
    3. Bing Liu
      Pages 269-309
    4. Bing Liu, Filippo Menczer
      Pages 311-362
    5. Bing Liu
      Pages 425-458
    6. Bing Liu
      Pages 459-526
    7. Bing Liu, Bamshad Mobasher, Olfa Nasraoui
      Pages 527-603
  5. Back Matter
    Pages 605-622

About this book


Web mining aims to discover useful information and knowledge from Web hyperlinks, page contents, and usage data. Although Web mining uses many conventional data mining techniques, it is not purely an application of traditional data mining, because Web data is semi-structured and unstructured. The field has also developed many of its own algorithms and techniques.

Liu has written a comprehensive text on Web mining, which consists of two parts. The first part covers the data mining and machine learning foundations, where all the essential concepts and algorithms of data mining and machine learning are presented. The second part covers the key topics of Web mining, where Web crawling, search, social network analysis, structured data extraction, information integration, opinion mining and sentiment analysis, Web usage mining, query log mining, computational advertising, and recommender systems are all treated both in breadth and in depth. His book thus brings all the related concepts and algorithms together to form an authoritative and coherent text.

The book offers a rich blend of theory and practice. It is suitable for students, researchers and practitioners interested in Web mining and data mining both as a learning text and as a reference book. Professors can readily use it for classes on data mining, Web mining, and text mining. Additional teaching materials such as lecture slides, datasets, and implemented algorithms are available online.


Keywords

Information Integration, Information Retrieval, Machine Learning, Opinion Mining, Pattern Mining, Recommender Systems, Schema Matching, Semi-Supervised Learning, Social Network Analysis, Structured Data Extraction, Unsupervised Learning, Web Crawling, Web Data Mining, Web Link Analysis, Web Search, Web Usage Mining, Wrapper Generation

Authors and affiliations

  1. Dept. of Computer Science, University of Illinois, Chicago, Chicago, USA

About the authors

Bing Liu is a professor of Computer Science at the University of Illinois at Chicago (UIC). He received his PhD in Artificial Intelligence from the University of Edinburgh. Before joining UIC, he was with the National University of Singapore. His current research interests include opinion mining and sentiment analysis, text and Web mining, data mining, and machine learning. He has published extensively in top journals and conferences in these fields. Several of his publications are considered seminal papers in their fields and are highly cited. He has also given more than 30 keynote and invited talks in academia and in industry. In professional service, Liu has served as an associate editor of IEEE Transactions on Knowledge and Data Engineering (TKDE), Journal of Data Mining and Knowledge Discovery (DMKD), and SIGKDD Explorations, and is on the editorial boards of several other journals. He has also served as program chair of the IEEE International Conference on Data Mining (ICDM-2010), ACM Conference on Web Search and Data Mining (WSDM-2010), ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2008), SIAM Conference on Data Mining (SDM-2007), ACM Conference on Information and Knowledge Management (CIKM-2006), and Pacific Asia Conference on Data Mining (PAKDD-2002). Additionally, Liu has served extensively as an area chair and program committee member of leading conferences on data mining, Web mining, natural language processing, and machine learning. More information about him can be found from

From the reviews:

"This is a textbook about data mining and its application to the Web. […] Liu succeeds in helping readers appreciate the key role that data mining and machine learning play in Web applications. […] It also motivates the student by adding immediacy and relevance to the concepts and algorithms described. I liked the way the concepts are introduced in a stepwise manner. […] I also appreciated the bibliographical notes at the end of each chapter." ACM Computing Reviews, W. Hu, January 2009

From the reviews of the second edition:

“Liu (Univ. of Illinois, Chicago) discusses all three types of Web mining--structure, content, and usage--in the technology’s efforts to glean information from hyperlinks, Web page content, and usage logs. […] Practical examples complement the discussions throughout the text, and each chapter includes useful ‘Bibliographic Notes’ and an extensive bibliography. […] Liu states that his intended audience includes both undergraduate and graduate students, but notes that researchers and Web programmers could benefit from this text as well. Summing Up: Recommended. Upper-division undergraduates through professionals.” J. Johnson, Choice, Vol. 49 (5), January 2012

"[...] Liu's book provides a comprehensive, self-contained introduction to the major data mining techniques and their use in Web data mining. [...] Professionals and researchers alike will find this excellent book handy as a reference. Its extensive lists of references at the end of each chapter provide hundreds of pointers for further reading. As a textbook, it is also suitable for advanced undergraduate and graduate courses on Web mining; it is highly self-contained and includes many easy-to-understand examples that will help readers grasp the key ideas behind current Web data mining techniques." ACM Computing Reviews, Fernando Berzal, February 2012