Learning from Imbalanced Data Sets

  • Alberto Fernández
  • Salvador García
  • Mikel Galar
  • Ronaldo C. Prati
  • Bartosz Krawczyk
  • Francisco Herrera

About this book

Introduction

This book provides a general and comprehensible overview of imbalanced learning. It contains a formal description of the problem, focusing on its main features and the most relevant proposed solutions. Additionally, it considers the different Data Science scenarios in which imbalanced classification can pose a real challenge.

This book stresses the gap with standard classification tasks by reviewing the case studies and ad-hoc performance metrics that are applied in this area. It also covers the different approaches that have traditionally been applied to address skewed binary class distributions. Specifically, it reviews cost-sensitive learning, data-level preprocessing methods and algorithm-level solutions, also taking into account ensemble-learning solutions that embed any of the former alternatives. Furthermore, it addresses the extension to multi-class problems, where the classical methods can no longer be applied in a straightforward way.

This book also focuses on the intrinsic data characteristics that, together with the uneven class distribution, are the main causes that truly hinder the performance of classification algorithms in this scenario. Then, some notes on data reduction are provided in order to understand the advantages of this type of approach.

Finally, this book introduces some novel areas of study in which the imbalanced data issue is attracting growing attention. Specifically, it considers the classification of data streams, non-classical classification problems, and the scalability related to Big Data. Examples of software libraries and modules to address imbalanced classification are provided.

This book is highly suitable for technical professionals and for senior undergraduate and graduate students in the areas of data science, computer science and engineering. It will also be useful for scientists and researchers seeking insight into the current developments in this area of study, as well as future research directions.

Keywords

Machine learning, Data mining, Classification, Imbalanced data, Data preprocessing, Ensemble learning, Cost-sensitive learning, Data reduction, Dimensionality reduction, Data streams, Big Data

Authors and affiliations

  • Alberto Fernández (1)
  • Salvador García (2)
  • Mikel Galar (3)
  • Ronaldo C. Prati (4)
  • Bartosz Krawczyk (5)
  • Francisco Herrera (6)

  1. Department of Computer Science and AI, University of Granada, Granada, Spain
  2. Department of Computer Science and AI, University of Granada, Granada, Spain
  3. Institute of Smart Cities, Public University of Navarre, Pamplona, Spain
  4. Department of Computer Science, Universidade Federal do ABC, Santo Andre, Brazil
  5. Department of Computer Science, Virginia Commonwealth University, Richmond, USA
  6. Department of Computer Science and AI, University of Granada, Granada, Spain

Bibliographic information

  • DOI https://doi.org/10.1007/978-3-319-98074-4
  • Copyright Information Springer Nature Switzerland AG 2018
  • Publisher Name Springer, Cham
  • eBook Packages Computer Science
  • Print ISBN 978-3-319-98073-7
  • Online ISBN 978-3-319-98074-4