© 2017

Outlier Ensembles

An Introduction

Textbook

Table of contents

  1. Front Matter
    Pages i-xvi
  2. Charu C. Aggarwal, Saket Sathe
    Pages 1-34
  3. Charu C. Aggarwal, Saket Sathe
    Pages 35-74
  4. Charu C. Aggarwal, Saket Sathe
    Pages 75-161
  5. Charu C. Aggarwal, Saket Sathe
    Pages 163-186
  6. Charu C. Aggarwal, Saket Sathe
    Pages 187-205
  7. Charu C. Aggarwal, Saket Sathe
    Pages 207-274
  8. Back Matter
    Pages 275-276

About this book

Introduction

This book discusses a variety of methods for outlier ensembles and organizes them by the specific principles with which accuracy improvements are achieved. In addition, it covers the techniques with which such methods can be made more effective. A formal classification of these methods is provided, and the circumstances in which they work well are examined. The authors cover how outlier ensembles relate (both theoretically and practically) to the ensemble techniques used commonly for other data mining problems like classification. The similarities and (subtle) differences in the ensemble techniques for the classification and outlier detection problems are explored. These subtle differences do impact the design of ensemble algorithms for the latter problem.
 
This book can be used for courses in data mining and related curricula. Many illustrative examples and exercises are provided to facilitate classroom teaching. Familiarity is assumed with the outlier detection problem and with the generic problem of ensemble analysis in classification, because many of the ensemble methods discussed in this book are adaptations of their counterparts in the classification domain. Some techniques explained in this book, such as wagging, randomized feature weighting, and geometric subsampling, provide new insights that are not available elsewhere. Also included is an analysis of the performance of various types of base detectors and their relative effectiveness. The book is valuable for researchers and practitioners who want to leverage ensemble methods for optimal algorithmic design.

Keywords

Data mining, Outlier ensembles, Ensemble analysis, Bagging, Classification, Heterogeneous model combination, Base detectors, Bias reduction methods, Variance-reduction methods, Ensemble analysis and models

Authors and affiliations

  1. IBM T. J. Watson Research Center, Yorktown Heights, USA
  2. IBM T. J. Watson Research Center, Yorktown Heights, USA

About the authors

Charu C. Aggarwal is a Distinguished Research Staff Member (DRSM) at the IBM T. J. Watson Research Center in Yorktown Heights, New York. He completed his undergraduate degree in Computer Science from the Indian Institute of Technology at Kanpur in 1993 and his Ph.D. in Operations Research from the Massachusetts Institute of Technology in 1996. He has published more than 300 papers in refereed conferences and journals, and has applied for or been granted more than 80 patents. He is the author or editor of 16 books, including textbooks on data mining, recommender systems, and outlier analysis. Because of the commercial value of his patents, he has thrice been designated a Master Inventor at IBM. He has received several internal and external awards, including the EDBT Test-of-Time Award (2014) and the IEEE ICDM Research Contributions Award (2015). He has also served as program or general chair of many major conferences in data mining. He is a fellow of SIAM, the ACM, and the IEEE, for “contributions to knowledge discovery and data mining algorithms.”

Saket Sathe has worked at IBM Research (Australia/United States) since 2013. He received a Ph.D. degree in Computer Science from EPFL (Lausanne) in 2013. Before that, he received a Master's (M.Tech.) degree in Electrical Engineering from the Indian Institute of Technology at Bombay and also spent one year working for a startup. His primary areas of interest are data mining and data management. He has served on the program committees of several top-ranked conferences and has been invited to review papers for prominent peer-reviewed journals. His research has led to more than 20 papers and 5 patents. His work on sensor data management received the runner-up best-paper award at IEEE CollaborateCom 2014. He is a member of the ACM, the IEEE, and SIAM.

