Abstract
The accuracy of the rules produced by a concept learning system can be hindered by errors in the data. Although such errors are most commonly attributed to random noise, there also exist “ill-defined” attributes, too general or too specific, that produce systematic classification errors. We present a computer program called Newton, which exploits the fact that ill-defined attributes create an ordered error pattern among the instances to compute hypotheses that explain the classification errors of a concept in terms of too-general or too-specific attributes. Extensive empirical testing shows that Newton identifies such attributes with a prediction rate above 95%.
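The title's notion of a Boolean difference comes from Boolean calculus: the difference of a propositional function f with respect to a variable x is f with x fixed to 0, XORed with f with x fixed to 1, and it is true exactly on the assignments where flipping x changes the classification. As a minimal illustrative sketch (this is the standard operator, not the paper's Newton implementation; the function names are our own):

```python
from itertools import product

def boolean_difference(f, i):
    """Boolean difference of f with respect to variable i:
    df/dx_i = f(..., x_i=0, ...) XOR f(..., x_i=1, ...).
    True exactly where flipping x_i flips the value of f."""
    def diff(*xs):
        lo = list(xs); lo[i] = 0
        hi = list(xs); hi[i] = 1
        return f(*lo) ^ f(*hi)
    return diff

# Example concept: f(a, b, c) = (a AND b) OR c
f = lambda a, b, c: (a and b) or c
df_da = boolean_difference(f, 0)

# df/da is true exactly when b=1 and c=0 -- the only instances
# whose class depends on attribute a.
for a, b, c in product([0, 1], repeat=3):
    assert df_da(a, b, c) == (b == 1 and c == 0)
```

In this setting, instances on which the difference with respect to a candidate attribute is true are precisely those where that attribute is decisive, which is the kind of ordered, attribute-dependent error pattern the abstract alludes to.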
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hallé, S. (2005). Using Boolean Differences for Discovering Ill-Defined Attributes in Propositional Machine Learning. In: Gelbukh, A., de Albornoz, Á., Terashima-Marín, H. (eds) MICAI 2005: Advances in Artificial Intelligence. MICAI 2005. Lecture Notes in Computer Science(), vol 3789. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11579427_43
DOI: https://doi.org/10.1007/11579427_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29896-0
Online ISBN: 978-3-540-31653-4
eBook Packages: Computer Science (R0)