© 2019

Evolutionary Decision Trees in Large-Scale Data Mining


Part of the Studies in Big Data book series (SBD, volume 59)

Table of contents

  1. Front Matter
    Pages i-xi
  2. Background

    1. Front Matter
      Pages 1-1
    2. Marek Kretowski
      Pages 3-20
    3. Marek Kretowski
      Pages 21-48
    4. Marek Kretowski
      Pages 49-68
  3. The Approach

    1. Front Matter
      Pages 69-69
    2. Marek Kretowski
      Pages 71-99
    3. Marek Kretowski
      Pages 101-113
  4. Extensions

    1. Front Matter
      Pages 115-115
    2. Marek Kretowski
      Pages 117-129
  5. Large-Scale Mining

    1. Front Matter
      Pages 143-143
    2. Marek Kretowski
      Pages 145-174
  6. Back Matter
    Pages 175-180

About this book


This book presents a unified framework, based on specialized evolutionary algorithms, for the global induction of various types of classification and regression trees from data. The resulting univariate or oblique trees are significantly smaller than those produced by standard top-down methods, which is critical when domain analysts interpret the mined patterns. The approach is highly flexible and can easily be adapted to specific data mining applications, e.g. cost-sensitive model trees for financial data or multi-test trees for gene expression data. Global induction can be applied efficiently to large-scale data without extraordinary resources. With a simple GPU-based acceleration, datasets composed of millions of instances can be mined in minutes. When datasets are too large for fast in-memory computing, the Spark-based implementation on computer clusters, which offers fault tolerance and scalability, can be applied instead.


Evolutionary Computation; Decision Trees; Distributed Computing; Evolutionary Induction of Decision Trees; Evolutionary Decision Trees; Large-Scale Data Mining

Authors and affiliations

  1. Faculty of Computer Science, Bialystok University of Technology, Bialystok, Poland


“The structure of the book is well-thought-out. … I recommend the book for students, researchers, and developers interested in real-life applications of big data analysis.” (K. Balogh, Computing Reviews, February 15, 2021)