eRules: A Modular Adaptive Classification Rule Learning Algorithm for Data Streams

  • Conference paper

Abstract

Advances in hardware and software over the past decade make it possible to capture, record and process fast data streams at a large scale. The research area of data stream mining has emerged as a consequence of these advances, in order to cope with the real-time analysis of potentially large and changing data streams. Examples of data streams include Google searches, credit card transactions, telemetric data and data from continuous chemical production processes. In some cases the data can be processed in batches by traditional data mining approaches. However, some applications require the data to be analysed in real time, as soon as it is captured, for example when the data stream is infinite, fast changing, or simply too large to be stored. One of the most important data mining techniques on data streams is classification. This involves training the classifier on the data stream in real time and adapting it to concept drift. Most data stream classifiers are based on decision trees. However, it is well known in the data mining community that there is no single optimal algorithm: an algorithm may work well on one or several datasets but badly on others. This paper introduces eRules, a new rule-based adaptive classifier for data streams, based on an evolving set of Rules. eRules induces a set of rules that is constantly evaluated and adapted to changes in the data stream by adding new rules and removing old ones. It differs from the more popular decision-tree-based classifiers in that it tends to leave data instances unclassified rather than force a classification that could be wrong. The ongoing development of eRules aims to improve its accuracy further through dynamic parameter setting, which will also address the problem of changing feature domain values.
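The behaviour described in the abstract (rules are added and removed as the stream changes, and uncovered instances are left unclassified) can be illustrated with a minimal Python sketch. This is not the eRules algorithm as published: the names `Rule`, `AdaptiveRuleSet`, and `induce_rules`, the parameters `min_accuracy`, `min_tries`, and `buffer_size`, and the toy single-condition rule learner are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    conditions: dict  # attribute name -> required value
    label: str
    hits: int = 0     # correct predictions since the rule was added
    tries: int = 0    # labelled instances the rule has covered

    def covers(self, x: dict) -> bool:
        return all(x.get(a) == v for a, v in self.conditions.items())

    def accuracy(self) -> float:
        return self.hits / self.tries if self.tries else 1.0


def induce_rules(instances):
    """Toy stand-in for a rule learner (hypothetical, not the paper's method):
    emit one single-condition rule for every attribute-value pair that is
    associated with exactly one label in the buffered instances."""
    labels_by_cond = {}
    for x, y in instances:
        for attr, val in x.items():
            labels_by_cond.setdefault((attr, val), set()).add(y)
    return [Rule({attr: val}, next(iter(labels)))
            for (attr, val), labels in labels_by_cond.items()
            if len(labels) == 1]


class AdaptiveRuleSet:
    def __init__(self, min_accuracy=0.6, min_tries=3, buffer_size=2):
        self.min_accuracy = min_accuracy  # assumed removal threshold
        self.min_tries = min_tries        # evidence needed before removal
        self.buffer_size = buffer_size    # uncovered instances to collect
        self.rules = []
        self.unclassified = []

    def predict(self, x: dict) -> Optional[str]:
        """Return the label of the first covering rule, or None to abstain."""
        for rule in self.rules:
            if rule.covers(x):
                return rule.label
        return None

    def learn(self, x: dict, y: str) -> None:
        rule = next((r for r in self.rules if r.covers(x)), None)
        if rule is None:
            # No rule covers x: buffer it and induce new rules once
            # enough uncovered instances have accumulated.
            self.unclassified.append((x, y))
            if len(self.unclassified) >= self.buffer_size:
                self.rules.extend(induce_rules(self.unclassified))
                self.unclassified.clear()
            return
        # A rule covers x: update its record and drop it if it has
        # become unreliable, for example after a concept drift.
        rule.tries += 1
        rule.hits += int(rule.label == y)
        if rule.tries >= self.min_tries and rule.accuracy() < self.min_accuracy:
            self.rules.remove(rule)
```

In this sketch, `predict` returning `None` stands for abstaining on an instance, and a rule is discarded once its observed accuracy drops below the assumed threshold, after which matching instances are buffered again so that replacement rules can be induced.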


Author information

Corresponding author

Correspondence to Frederic Stahl.

Copyright information

© 2012 Springer-Verlag London

About this paper

Cite this paper

Stahl, F., Gaber, M.M., Salvador, M.M. (2012). eRules: A Modular Adaptive Classification Rule Learning Algorithm for Data Streams. In: Bramer, M., Petridis, M. (eds) Research and Development in Intelligent Systems XXIX. SGAI 2012. Springer, London. https://doi.org/10.1007/978-1-4471-4739-8_5

  • DOI: https://doi.org/10.1007/978-1-4471-4739-8_5

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-4738-1

  • Online ISBN: 978-1-4471-4739-8

  • eBook Packages: Computer Science, Computer Science (R0)
