© 2010

Automating the Design of Data Mining Algorithms

An Evolutionary Computation Approach


Part of the Natural Computing Series book series (NCS)

Table of contents

  1. Front Matter
    Pages I-XIII
  2. Gisele L. Pappa, Alex A. Freitas
    Pages 1-16
  3. Gisele L. Pappa, Alex A. Freitas
    Pages 17-46
  4. Gisele L. Pappa, Alex A. Freitas
    Pages 47-84
  5. Gisele L. Pappa, Alex A. Freitas
    Pages 85-108
  6. Gisele L. Pappa, Alex A. Freitas
    Pages 109-135
  7. Back Matter
    Pages 185-187

About this book


Data mining is a very active research area with many successful real-world applications. It consists of a set of concepts and methods used to extract interesting or useful knowledge (or patterns) from real-world datasets, providing valuable support for decision making in industry, business, government, and science. Although there are already many types of data mining algorithms available in the literature, it is still difficult for users to choose the best possible data mining algorithm for their particular data mining problem. In addition, data mining algorithms have been manually designed; therefore they incorporate human biases and preferences. This book proposes a new approach to the design of data mining algorithms. Instead of relying on the slow and ad hoc process of manual algorithm design, this book proposes systematically automating the design of data mining algorithms with an evolutionary computation approach. More precisely, we propose a genetic programming system (a type of evolutionary computation method that evolves computer programs) to automate the design of rule induction algorithms, a type of classification method that discovers a set of classification rules from data. We focus on genetic programming in this book because it is the paradigmatic type of machine learning method for automating the generation of programs and because it has the advantage of performing a global search in the space of candidate solutions (data mining algorithms in our case), but in principle other types of search methods for this task could be investigated in the future.


Evolutionary algorithms, Evolutionary computing, Genetic programming, Machine learning, Rule induction algorithms, classification, data mining, tar

Authors and affiliations

  1. Depto. Ciência da Computação, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
  2. Computing Laboratory, University of Kent, Canterbury, United Kingdom

Bibliographic information

  • Book Title Automating the Design of Data Mining Algorithms
  • Book Subtitle An Evolutionary Computation Approach
  • Authors Gisele L. Pappa
    Alex Freitas
  • Series Title Natural Computing Series
  • DOI
  • Copyright Information Springer-Verlag Berlin Heidelberg 2010
  • Publisher Name Springer, Berlin, Heidelberg
  • eBook Packages Computer Science Computer Science (R0)
  • Hardcover ISBN 978-3-642-02540-2
  • Softcover ISBN 978-3-642-26125-1
  • eBook ISBN 978-3-642-02541-9
  • Series ISSN 1619-7127
  • Edition Number 1
  • Number of Pages XIII, 187
  • Number of Illustrations 33 b/w illustrations, 0 illustrations in colour
  • Topics Data Mining and Knowledge Discovery
    Data Structures
    Artificial Intelligence
Industry Sectors
IT & Software
Consumer Packaged Goods
Finance, Business & Banking
Energy, Utilities & Environment
Oil, Gas & Geosciences


From the reviews:

"The book is targeted at researchers and postgraduate students. As the amount of data being mined continues to grow it demands ever more sophisticated mining algorithms. Therefore there is a need for new algorithms and so Pappa and Freitas’ book will be of interest particularly to researchers in data mining. ... [T]his book will appeal to the target audience of [the journal] Genetic Programming and Evolvable Machines and, I feel, will align with the research interests of its readership." (John Woodward, Genetic Programming and Evolvable Machines (2011) 12:81–83)

“The book will be useful for postgraduate students and researchers in the data mining field and in evolutionary computation.” (Florin Gorunescu, Zentralblatt MATH, Vol. 1183, 2010)