Parallel Hoeffding Decision Tree for Streaming Data

  • Piotr Cal
  • Michał Woźniak
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 217)


Decision trees are a well-known, widely used method for building efficient classifiers. We propose a modification of the Parallel Hoeffding Tree algorithm that can deal with large streaming data. The proposed method was evaluated in computer experiments carried out on a few real datasets. The algorithm uses a parallel approach and the Hoeffding inequality for better performance on large streaming data. The paper presents an analysis of the Hoeffding tree and its issues.
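
The abstract does not give the form of the inequality. As a minimal sketch, assuming the standard Hoeffding bound commonly used in Hoeffding tree learners, a leaf is split once the observed gap between the two best split candidates exceeds ε = sqrt(R² ln(1/δ) / (2n)), where R is the range of the split measure, δ the allowed error probability, and n the number of examples seen at the leaf. The function names and parameter values below are hypothetical illustrations, not the authors' implementation, and the parallel part of the method is not shown.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon for the Hoeffding bound.

    With probability at least 1 - delta, the true mean of a random
    variable with range `value_range` lies within epsilon of the mean
    observed over n independent samples.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float,
                 value_range: float, delta: float, n: int) -> bool:
    """Hypothetical split test: split when the gap between the best and
    second-best candidate exceeds the Hoeffding bound."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


if __name__ == "__main__":
    # Assumed values for illustration only: a split measure with range 1
    # (e.g. information gain for two classes), delta = 1e-7,
    # and 1000 examples at the leaf.
    eps = hoeffding_bound(1.0, 1e-7, 1000)
    print(f"epsilon = {eps:.4f}")
    print(should_split(0.30, 0.15, 1.0, 1e-7, 1000))
    print(should_split(0.30, 0.25, 1.0, 1e-7, 1000))
```

Because ε shrinks as n grows, a leaf that cannot yet separate two close candidates simply waits for more stream examples before deciding.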


machine learning, supervised learning, decision tree, parallel decision tree, pattern recognition





Copyright information

© Springer International Publishing Switzerland 2013

Authors and Affiliations

  1. Department of Systems and Computer Networks, Wrocław University of Technology, Wrocław, Poland
