Learning Higher Accuracy Decision Trees from Concept Drifting Data Streams

  • Satoru Nishimura
  • Masahiro Terabe
  • Kazuo Hashimoto
  • Koichiro Mihara
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5027)

Abstract

In this paper, we propose combining the naive-Bayes approach with CVFDT, one of the major algorithms for inducing a high-accuracy decision tree from time-changing data streams. The proposed improvement, called CVFDTNBC, induces a decision tree as CVFDT does but places naive-Bayes classifiers in the leaf nodes of the induced tree. An experiment using artificially generated time-changing data streams shows that CVFDTNBC can induce a more accurate decision tree than CVFDT.
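The idea of placing a naive-Bayes classifier in a leaf can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `NaiveBayesLeaf`, its methods, the restriction to discrete attributes, and the use of Laplace smoothing are all assumptions made for illustration. The sketch only shows how the per-class attribute-value counts kept at a leaf could be used for naive-Bayes prediction instead of a majority vote.

```python
import math
from collections import defaultdict


class NaiveBayesLeaf:
    """Hypothetical leaf node: keeps counts and can predict with naive Bayes."""

    def __init__(self, n_values):
        # n_values[i] = number of distinct values of attribute i
        # (assumed known; used only for Laplace smoothing).
        self.n_values = n_values
        self.class_counts = defaultdict(int)
        self.value_counts = defaultdict(int)  # (attribute, value, label) -> count

    def update(self, x, label):
        """Add one training example that reached this leaf."""
        self.class_counts[label] += 1
        for i, v in enumerate(x):
            self.value_counts[(i, v, label)] += 1

    def predict_majority(self):
        """Plain decision-tree leaf: return the most frequent class."""
        return max(self.class_counts, key=self.class_counts.get)

    def predict_naive_bayes(self, x):
        """Return the class maximizing log P(c) + sum_i log P(x_i | c)."""
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_c in self.class_counts.items():
            score = math.log(n_c / total)
            for i, v in enumerate(x):
                count = self.value_counts.get((i, v, label), 0)
                # Laplace smoothing (an assumption of this sketch).
                score += math.log((count + 1) / (n_c + self.n_values[i]))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Usage: two binary attributes, seven examples seen at this leaf.
leaf = NaiveBayesLeaf(n_values=[2, 2])
for x, y in [((0, 0), "a")] * 3 + [((0, 1), "a")] + [((1, 1), "b")] * 3:
    leaf.update(x, y)

print(leaf.predict_majority())           # majority vote ignores the attributes
print(leaf.predict_naive_bayes((1, 1)))  # naive Bayes uses the attribute values
```

In this toy leaf the majority class is "a", yet the example (1, 1) looks much more like the "b" examples, so the two prediction rules can disagree; that difference is what a naive-Bayes leaf is meant to exploit.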

Keywords

data stream, concept drift, decision tree, naive-Bayes classifiers, CVFDT

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Satoru Nishimura (1)
  • Masahiro Terabe (1)
  • Kazuo Hashimoto (1)
  • Koichiro Mihara (1)
  1. Graduate School of Information Science, Tohoku University, Sendai, Japan
