An Ensemble Classification Algorithm Based on Information Entropy for Data Streams

  • Junhong Wang
  • Shuliang Xu
  • Bingqian Duan
  • Caifeng Liu
  • Jiye Liang


Data stream mining has attracted much attention from scholars. In recent research, ensemble classification has been widely applied to concept drift detection; however, most existing methods use classification accuracy as the criterion for judging whether concept drift has occurred. Information entropy is an important and effective measure of uncertainty. Based on information entropy theory, a new algorithm that uses information entropy to evaluate classification results is developed. It adopts ensemble learning, and the weight of each classifier is determined by the entropy of the result produced by the ensemble classifier system. When the concept in the data stream changes, classifiers whose weights fall below a predefined threshold are discarded so that the ensemble adapts to the new concept. In the experimental analysis, the proposed algorithm and six comparison algorithms are run on six data sets. The results show that the proposed method not only handles concept drift effectively but also performs better than the comparison algorithms.
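
The weighting-and-pruning idea summarised above can be illustrated with a minimal sketch. The abstract does not give the paper's actual weight formula, so the rule used here (one minus the conditional entropy of the true label given the prediction, normalised by the entropy of the true labels) is only an assumed stand-in, and the names `shannon_entropy`, `conditional_entropy`, `entropy_weight`, and `prune` are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy, in bits, of the empirical distribution of `labels`."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def conditional_entropy(truths, predictions):
    """H(truth | prediction) in bits, estimated from one block of data."""
    n = len(truths)
    groups = {}
    for t, p in zip(truths, predictions):
        groups.setdefault(p, []).append(t)
    return sum(len(g) / n * shannon_entropy(g) for g in groups.values())


def entropy_weight(truths, predictions):
    """Assumed weighting rule (not the paper's formula).

    Returns a value in [0, 1]: 1 when the predictions remove all
    uncertainty about the true label, 0 when they remove none.
    """
    h_truth = shannon_entropy(truths)
    if h_truth == 0.0:
        # Single-class block: nothing to explain. Returning 1.0 is an
        # arbitrary convention chosen for this sketch.
        return 1.0
    return 1.0 - conditional_entropy(truths, predictions) / h_truth


def prune(weights, threshold):
    """Keep only the classifiers whose weight is not below `threshold`."""
    return {name: w for name, w in weights.items() if w >= threshold}


if __name__ == "__main__":
    truths = [0, 0, 1, 1]
    block_predictions = {
        "fresh": [0, 0, 1, 1],  # tracks the current concept
        "stale": [0, 0, 0, 0],  # learned before the drift
    }
    weights = {
        name: entropy_weight(truths, preds)
        for name, preds in block_predictions.items()
    }
    print(weights)
    print(prune(weights, threshold=0.5))
```

Note that this stand-in measures how informative the predictions are rather than how accurate they are, so a classifier that consistently swaps two labels would still score highly; a practical system would need to handle that case separately.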


Data streams; Data mining; Concept drift; Information entropy; Ensemble classification



This research was supported by the National Natural Science Foundation of China (Nos. 61772323, 61202018, 61432011, and U1435212), the National Key Basic Research and Development Program of China (973) (No. 2013CB329404), and the Natural Science Foundation of Shanxi Province, China (Nos. 201701D121051 and 201701D221098). The authors are grateful to the editor and the anonymous reviewers for constructive comments that helped to improve the quality and presentation of this paper.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  • Junhong Wang (1), corresponding author
  • Shuliang Xu (1)
  • Bingqian Duan (1)
  • Caifeng Liu (2)
  • Jiye Liang (1)

  1. Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, School of Computer and Information Technology, Shanxi University, Taiyuan, China
  2. Faculty of Electronic Information and Electrical Engineering, Dalian University of Technology, Dalian, China
