A Study of Decision Tree Induction for Data Stream Mining Using Boosting Genetic Programming Classifier

  • Dirisala J. Nagendra Kumar
  • J. V. R. Murthy
  • Suresh Chandra Satapathy
  • S. V. V. S. R. Kumar Pullela
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7076)

Abstract

Genetic Programming (GP) is an evolutionary soft computing approach. Data streams are now a prevalent form of input. This paper studies a GP classifier on data streams and compares its classification performance with that of other state-of-the-art data mining and stream classification approaches. Boosting is a machine learning meta-algorithm for supervised learning. A weak learner is a classifier that is only slightly correlated with the true classification (it can label examples only slightly better than random guessing). In contrast, a strong learner is a classifier that is arbitrarily well correlated with the true classification. Boosting combines a set of weak learners to create a strong learner. The Boosting GP approach is observed to outperform Boosting Naïve Bayes classification, which suggests that GP is a competent algorithm for data stream classification.
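
The way boosting turns weak learners into a strong one can be illustrated with a minimal AdaBoost-style sketch. This is not the Boosting GP classifier studied in the paper: the weak learner here is assumed to be a one-feature threshold rule (a decision stump), not a GP-evolved program, and the function names, the toy data, and the choice of AdaBoost as the boosting variant are all hypothetical illustrations.

```python
import math

def train_stump(X, y, w):
    """Weak learner (assumed): the threshold rule with lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                preds = [polarity if row[j] <= thr else -polarity for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, polarity)
    return best

def stump_predict(stump, row):
    _, j, thr, polarity = stump
    return polarity if row[j] <= thr else -polarity

def adaboost(X, y, rounds=3):
    """Combine `rounds` stumps; labels in y must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n                      # start with uniform example weights
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)   # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)      # vote of this weak learner
        ensemble.append((alpha, stump))
        # Raise the weight of misclassified examples, lower the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, row):
    """Weighted majority vote of all weak learners."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

# Toy data (hypothetical): no single threshold separates the two classes.
X = [[1], [2], [3], [4], [5], [6]]
y = [1, 1, -1, -1, 1, 1]
ensemble = adaboost(X, y, rounds=3)
predictions = [predict(ensemble, row) for row in X]
```

On this toy set a single stump necessarily mislabels some examples, while the weighted vote of three stumps can represent the middle interval that none of them captures alone. In a Boosting GP setting, `train_stump` would presumably be replaced by a GP run whose fitness uses the example weights.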

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Dirisala J. Nagendra Kumar (1)
  • J. V. R. Murthy (2)
  • Suresh Chandra Satapathy (3)
  • S. V. V. S. R. Kumar Pullela (3)
  1. BVRICE, Bhimavaram, India
  2. JNTUCE, Kakinada, India
  3. V.S. Lakshmi Engineering College, Kakinada, India