Data Mining in Large Sets of Complex Data

  • Robson L. F. Cordeiro
  • Christos Faloutsos
  • Caetano Traina Júnior

Part of the SpringerBriefs in Computer Science book series (BRIEFSCOMPUTER)

Table of contents

  1. Front Matter
    Pages i-xi
  2. Robson L. F. Cordeiro, Christos Faloutsos, Caetano Traina Júnior
    Pages 1-6
  3. Robson L. F. Cordeiro, Christos Faloutsos, Caetano Traina Júnior
    Pages 7-20
  4. Robson L. F. Cordeiro, Christos Faloutsos, Caetano Traina Júnior
    Pages 21-32
  5. Robson L. F. Cordeiro, Christos Faloutsos, Caetano Traina Júnior
    Pages 33-67
  6. BoW
    Robson L. F. Cordeiro, Christos Faloutsos, Caetano Traina Júnior
    Pages 69-92
  7. Robson L. F. Cordeiro, Christos Faloutsos, Caetano Traina Júnior
    Pages 93-109
  8. Robson L. F. Cordeiro, Christos Faloutsos, Caetano Traina Júnior
    Pages 111-116

About this book

Introduction

The amount and complexity of the data gathered by current enterprises are increasing at an exponential rate. Consequently, the analysis of Big Data is now a central challenge in Computer Science, especially for complex data. For example, given a satellite image database containing tens of Terabytes, can we find the regions that correspond to native rainforest, deforestation, or reforestation? Can this be done automatically? Based on the work discussed in this book, the answers to both questions are a resounding “yes”: results that used to require days or weeks of hard work from human specialists can now be obtained in minutes with high precision.

Data Mining in Large Sets of Complex Data discusses new algorithms that advance traditional data mining, especially clustering, by handling datasets that are both large and complex. Other works usually focus on only one of these aspects, either data size or complexity. This work considers both: it enables mining complex data from high-impact applications, such as breast cancer diagnosis, region classification in satellite images, assistance to climate change forecasting, and recommendation systems for the Web and social networks; the data are at the Terabyte scale rather than the usual Gigabyte scale; and very accurate results are found in just minutes. It thus provides a crucial and well-timed contribution toward real-time applications that deal with highly complex Big Data, where mining on the fly can make an immeasurable difference, such as supporting cancer diagnosis or detecting deforestation.

Keywords

  • Analysis of Breast Cancer Data
  • Analysis of Large Graphs from Social Networks
  • Analysis of Satellite Imagery
  • Big Data
  • Correlation Clustering
  • Data Analysis with MapReduce
  • Data Mining
  • Linear or Quasi-linear Complexity
  • Low-labor Labeling
  • Summarization and Attention Routing

Authors and affiliations

  • Robson L. F. Cordeiro (1)
  • Christos Faloutsos (2)
  • Caetano Traina Júnior (3)

  1. Computer Science Department - ICMC, University of São Paulo, São Carlos, Brazil
  2. School of Computer Science, Carnegie Mellon University, Pittsburgh, USA
  3. Computer Science Department - ICMC, University of São Paulo, São Carlos, Brazil

Bibliographic information

  • DOI https://doi.org/10.1007/978-1-4471-4890-6
  • Copyright Information The Author(s) 2013
  • Publisher Name Springer, London
  • eBook Packages Computer Science
  • Print ISBN 978-1-4471-4889-0
  • Online ISBN 978-1-4471-4890-6
  • Series Print ISSN 2191-5768
  • Series Online ISSN 2191-5776