Parallel Hoeffding Decision Tree for Streaming Data
Decision trees are a well-known, widely used approach to building efficient classifiers. We propose a modification of the Parallel Hoeffding Tree algorithm that can handle large streaming data. The proposed method was evaluated in computer experiments carried out on a few real datasets. The algorithm uses a parallel approach and the Hoeffding inequality for better performance on large streaming data. The paper presents an analysis of the Hoeffding tree and its issues.
Keywords: machine learning, supervised learning, decision tree, parallel decision tree, pattern recognition
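The abstract refers to the Hoeffding inequality without stating how a tree uses it. As a rough illustration, the sketch below shows the standard split test from the Hoeffding tree literature (Domingos and Hulten), not the modified parallel algorithm proposed in this paper. After n examples at a leaf, a split is accepted when the gap between the two best split scores exceeds epsilon = sqrt(R^2 ln(1/delta) / (2n)). The function names and the parameters `value_range`, `delta`, and `n` are illustrative assumptions and do not come from the paper.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon such that, with probability at least 1 - delta, the true
    mean of a variable with range `value_range` lies within epsilon of the
    mean observed over n independent samples."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must be in (0, 1)")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float,
                 value_range: float, delta: float, n: int) -> bool:
    """Hypothetical helper: accept the best split only when it beats the
    runner-up by more than the Hoeffding bound."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


if __name__ == "__main__":
    # Example values are made up for illustration only.
    eps = hoeffding_bound(value_range=1.0, delta=1e-7, n=1000)
    print(f"epsilon after 1000 examples: {eps:.4f}")
    print(should_split(0.30, 0.15, value_range=1.0, delta=1e-7, n=1000))
```

Because epsilon shrinks in proportion to 1/sqrt(n), a leaf that cannot separate two close candidates simply waits for more stream examples before splitting, which is what makes the test suitable for streaming data.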