Parallel Hoeffding Decision Tree for Streaming Data

  • Conference paper
Distributed Computing and Artificial Intelligence

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 217)

Abstract

Decision trees are a well-known, widely used family of algorithms for building efficient classifiers. We propose a modification of the Parallel Hoeffding Tree algorithm that can handle large streaming data. The proposed method was evaluated in computer experiments carried out on a few real datasets. The algorithm combines a parallel approach with the Hoeffding inequality to improve performance on large streaming data. The paper also presents an analysis of the Hoeffding tree and its issues.
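For context, Hoeffding-tree learners in general use the Hoeffding inequality to decide when a node has seen enough stream examples to commit to a split. The sketch below illustrates that standard test only. It is not taken from this paper, and the function names, the parameters `value_range`, `delta`, and `n`, and the example gain values are illustrative assumptions.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon such that, with probability at least 1 - delta, the true
    mean of a variable with the given range lies within epsilon of the mean
    observed over n independent samples (Hoeffding, 1963)."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float,
                 value_range: float, delta: float, n: int) -> bool:
    """Assumed split rule: split once the observed gap between the two best
    split candidates exceeds the Hoeffding bound for n examples."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


if __name__ == "__main__":
    # Hypothetical gains for the two best attributes at one leaf.
    for n in (100, 1000, 10000):
        eps = hoeffding_bound(value_range=1.0, delta=1e-7, n=n)
        print(n, round(eps, 4), should_split(0.30, 0.22, 1.0, 1e-7, n))
```

The bound shrinks as more examples arrive, so a small but consistent gap between two candidate attributes eventually becomes sufficient to justify a split without storing the stream.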

Author information

Corresponding author

Correspondence to Piotr Cal.

Copyright information

© 2013 Springer International Publishing Switzerland

About this paper

Cite this paper

Cal, P., Woźniak, M. (2013). Parallel Hoeffding Decision Tree for Streaming Data. In: Omatu, S., Neves, J., Rodriguez, J., Paz Santana, J., Gonzalez, S. (eds) Distributed Computing and Artificial Intelligence. Advances in Intelligent Systems and Computing, vol 217. Springer, Cham. https://doi.org/10.1007/978-3-319-00551-5_4

  • DOI: https://doi.org/10.1007/978-3-319-00551-5_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-00550-8

  • Online ISBN: 978-3-319-00551-5

  • eBook Packages: Engineering, Engineering (R0)
