Abstract
In concept learning or data mining tasks, the learner is typically faced with a choice of many possible hypotheses characterizing the data. If one can assume that the training data are noise-free, then the generated hypothesis should be complete and consistent with regard to the data. In real-world problems, however, data are often noisy, and an insistence on full completeness and consistency is no longer valid. The problem then is to determine a hypothesis that represents the “best” trade-off between completeness and consistency. This paper presents an approach to this problem in which a learner seeks rules optimizing a description quality criterion that combines completeness and consistency gain, a measure based on consistency that reflects the rule’s benefit. The method has been implemented in the AQ18 learning and data mining system and compared to several other methods. Experiments have indicated the flexibility and power of the proposed method.
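The description quality criterion sketched in the abstract can be illustrated with a small example. The sketch below follows the commonly published form of the AQ18 measure, where completeness is the fraction of positive examples a rule covers and consistency gain normalizes the rule's precision against the base rate of positives in the data; the weight `w` (an illustrative parameter) trades the two off. Treat this as a plausible reconstruction, not the paper's exact implementation.

```python
def rule_quality(p, n, P, N, w=0.5):
    """Score a rule by combining completeness and consistency gain.

    p, n : positive / negative examples covered by the rule
    P, N : total positive / negative examples in the training data
    w    : weight in [0, 1] trading completeness against consistency gain
    """
    compl = p / P  # completeness: fraction of all positives covered
    # consistency gain: improvement of the rule's precision over the
    # base rate P/(P+N), scaled so a perfectly consistent rule scores 1
    consig = (p / (p + n) - P / (P + N)) * (P + N) / N
    if consig <= 0:
        return 0.0  # rule is no better than random coverage
    return (compl ** w) * (consig ** (1 - w))
```

For instance, a rule covering half the positives and no negatives in a balanced data set scores about 0.71 with `w=0.5`, while a rule whose precision equals the base rate scores 0, reflecting that it carries no discriminating benefit.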
This research was supported in part by the National Science Foundation under grants No. NSF 9904078 and IRI-9510644.
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
Cite this paper
Kaufman, K.A., Michalski, R.S. (1999). Learning from Inconsistent and Noisy Data: The AQ18 Approach. In: Raś, Z.W., Skowron, A. (eds.) Foundations of Intelligent Systems. ISMIS 1999. Lecture Notes in Computer Science, vol. 1609. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0095128
Print ISBN: 978-3-540-65965-5
Online ISBN: 978-3-540-48828-6