eRules: A Modular Adaptive Classification Rule Learning Algorithm for Data Streams

Stahl, Frederic; Gaber, Mohamed Medhat; Salvador, Manuel Martin

doi:10.1007/978-1-4471-4739-8_5


Conference paper
First Online: 01 January 2012


Abstract

Advances in hardware and software over the past decade have made it possible to capture, record and process fast data streams at a large scale. The research area of data stream mining has emerged as a consequence of these advances in order to cope with the real-time analysis of potentially large and changing data streams. Examples of data streams include Google searches, credit card transactions, telemetric data and data from continuous chemical production processes. In some cases the data can be processed in batches by traditional data mining approaches. However, some applications require the data to be analysed in real time, as soon as it is captured, for example when the data stream is infinite, fast changing, or simply too large to be stored. One of the most important data mining techniques on data streams is classification. This involves training the classifier on the data stream in real time and adapting it to concept drift. Most data stream classifiers are based on decision trees. However, it is well known in the data mining community that there is no single optimal algorithm: an algorithm may work well on one or several datasets but badly on others. This paper introduces eRules, a new rule-based adaptive classifier for data streams, based on an evolving set of Rules. eRules induces a set of rules that is constantly evaluated and adapted to changes in the data stream by adding new rules and removing old ones. It differs from the more popular decision tree based classifiers in that it tends to leave data instances unclassified rather than forcing a classification that could be wrong. The ongoing development of eRules aims to improve its accuracy further through dynamic parameter setting, which will also address the problem of changing feature domain values.
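The behaviour described in the abstract (a rule set that abstains when no rule applies, and that drops rules once they stop being accurate) can be illustrated with a minimal Python sketch. This is not the eRules implementation. The class names `Rule` and `AbstainingRuleSet`, the thresholds `min_accuracy` and `min_hits`, the first-match prediction strategy and the buffer of unclassified instances are all assumptions made for illustration. How new rules are induced is left out entirely; in this sketch they are simply appended by the caller.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Instance = Dict[str, str]  # attribute name -> categorical value (assumed format)


@dataclass
class Rule:
    """IF every condition matches THEN predict `label` (hypothetical structure)."""
    conditions: Instance
    label: str
    hits: int = 0      # how often the rule fired on labelled data
    correct: int = 0   # how often it fired and was right

    def covers(self, instance: Instance) -> bool:
        return all(instance.get(a) == v for a, v in self.conditions.items())

    def accuracy(self) -> float:
        # A rule that has never fired is given the benefit of the doubt.
        return self.correct / self.hits if self.hits else 1.0


@dataclass
class AbstainingRuleSet:
    min_accuracy: float = 0.6  # assumed removal threshold
    min_hits: int = 5          # assumed evidence required before removal
    rules: List[Rule] = field(default_factory=list)
    unclassified: List[Tuple[Instance, str]] = field(default_factory=list)

    def predict(self, instance: Instance) -> Optional[str]:
        """Return the label of the first covering rule, or None to abstain."""
        for rule in self.rules:
            if rule.covers(instance):
                return rule.label
        return None

    def update(self, instance: Instance, true_label: str) -> Optional[str]:
        """Classify one labelled instance, then adapt the rule set to it."""
        for i, rule in enumerate(self.rules):
            if rule.covers(instance):
                rule.hits += 1
                rule.correct += int(rule.label == true_label)
                # Remove a rule once enough evidence shows it is outdated.
                if rule.hits >= self.min_hits and rule.accuracy() < self.min_accuracy:
                    del self.rules[i]
                return rule.label
        # No rule applies: abstain and keep the instance for later rule induction.
        self.unclassified.append((instance, true_label))
        return None
```

In this sketch, a caller would append `Rule` objects to `rules`, call `update` for each labelled instance as it arrives, and periodically induce new rules from `unclassified` with a rule learner of its choice.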



Author information

Authors and Affiliations

School of Design, Engineering and Computing, Bournemouth University, Poole House, Talbot Campus, BH12 5BB, Poole, United Kingdom
Frederic Stahl & Manuel Martin Salvador
School of Computing, Buckingham Building, University of Portsmouth, Lion Terrace, PO1 3HE, United Kingdom
Mohamed Medhat Gaber


Corresponding author

Correspondence to Frederic Stahl.

Editor information

Editors and Affiliations

School of Computing, University of Portsmouth, Whitepost Lane The Lilacs, Portsmouth, PO1 3AH, Hampshire, United Kingdom
Max Bramer
School of Computing, Engineering & Mathe, University of Brighton, Lewes Road, Brighton, BN2 4GJ, West Sussex, United Kingdom
Miltos Petridis


About this paper

Cite this paper

Stahl, F., Gaber, M.M., Salvador, M.M. (2012). eRules: A Modular Adaptive Classification Rule Learning Algorithm for Data Streams. In: Bramer, M., Petridis, M. (eds) Research and Development in Intelligent Systems XXIX. SGAI 2012. Springer, London. https://doi.org/10.1007/978-1-4471-4739-8_5


DOI: https://doi.org/10.1007/978-1-4471-4739-8_5
Published: 09 October 2012
Publisher Name: Springer, London
Print ISBN: 978-1-4471-4738-1
Online ISBN: 978-1-4471-4739-8
eBook Packages: Computer Science, Computer Science (R0)
