
Distributed Implementation of an Intelligent Data Classifier

Chapter in: Soft Computing for Recognition Based on Biometrics

Part of the book series: Studies in Computational Intelligence (SCI, volume 312)

Abstract

Industry, science and business applications manipulate huge amounts of data every day. Most of the time these data come from distributed sources and are analyzed with Data Mining techniques to discover knowledge and recognize patterns. Data classification is a technique for deciding whether or not a set of data belongs to a given group of information. It conventionally requires gathering all the data into one large centralized dataset, and collecting and analyzing that dataset is very expensive in terms of time, memory and bandwidth. Architectures for Distributed Data Mining have been developed to reduce these computing and storage costs. This paper presents an approach to building a distributed data classifier that takes only metadata from the distributed datasets, avoiding full access to the original data. Using only metadata reduces the computing time and bandwidth consumption required to build a data classifier.
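
The abstract does not say what the metadata contains or which classifier is built, so the following is only an illustrative sketch under stated assumptions, not the chapter's method. It assumes the metadata are per-site counts of (attribute, value, class) combinations and that the coordinator uses them to choose a decision-tree split by information gain. All function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def local_metadata(rows, attributes, label):
    """Run at each site: count (attribute, value, class) triples.

    Only these counts are sent to the coordinator; raw rows stay local.
    (Assumed form of "metadata"; the chapter may define it differently.)
    """
    counts = Counter()
    for row in rows:
        for attr in attributes:
            counts[(attr, row[attr], row[label])] += 1
    return counts


def merge_metadata(site_counts):
    """Run at the coordinator: add up the count tables from all sites."""
    total = Counter()
    for counts in site_counts:
        total.update(counts)
    return total


def entropy(class_counts):
    """Shannon entropy (in bits) of a collection of class counts."""
    class_counts = list(class_counts)
    n = sum(class_counts)
    return -sum(c / n * log2(c / n) for c in class_counts if c)


def information_gain(counts, attr):
    """Information gain of splitting on attr, computed from counts alone."""
    per_value = {}
    overall = Counter()
    for (a, value, cls), n in counts.items():
        if a != attr:
            continue
        per_value.setdefault(value, Counter())[cls] += n
        overall[cls] += n
    total = sum(overall.values())
    remainder = sum(
        sum(v.values()) / total * entropy(v.values())
        for v in per_value.values()
    )
    return entropy(overall.values()) - remainder


def best_split(counts, attributes):
    """Pick the attribute with the highest information gain."""
    return max(attributes, key=lambda a: information_gain(counts, a))


# Hypothetical toy data held at two separate sites.
ATTRS = ["outlook", "windy"]
site_a = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "yes"},
]
site_b = [
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
]

merged = merge_metadata(
    [local_metadata(site_a, ATTRS, "play"), local_metadata(site_b, ATTRS, "play")]
)
root = best_split(merged, ATTRS)
```

Because counts are additive, the merged table equals the one a centralized dataset would produce, so the split chosen here matches the centralized choice while each site transmits only a small count table instead of its rows.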

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Sosa-Sosa, V.J., Lopez-Arevalo, I., Jasso-Luna, O., Fraire-Huacuja, H. (2010). Distributed Implementation of an Intelligent Data Classifier. In: Melin, P., Kacprzyk, J., Pedrycz, W. (eds) Soft Computing for Recognition Based on Biometrics. Studies in Computational Intelligence, vol 312. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15111-8_5

  • DOI: https://doi.org/10.1007/978-3-642-15111-8_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15110-1

  • Online ISBN: 978-3-642-15111-8

  • eBook Packages: Engineering, Engineering (R0)
