Abstract
This paper presents an algorithm that seeks every possible exception rule that violates a common sense rule and satisfies several assumptions of simplicity. Exception rules, which represent systematic deviations from common sense rules, are often found interesting. Discovery of pairs that consist of a common sense rule and an exception rule, resulting from undirected search for unexpected exception rules, has been successful in various domains. In the past, however, an exception rule represented a change of conclusion caused by adding an extra condition to the premise of a common sense rule. That approach formalized only one type of exception and failed to represent other types. In order to provide a systematic treatment of exceptions, we categorize exception rules into eleven categories, and we propose a unified algorithm for discovering all of them. Preliminary results on fifteen real-world data sets provide empirical evidence of the effectiveness of our algorithm in discovering interesting knowledge. The empirical results also match our theoretical analysis of exceptions, showing that the eleven types can be partitioned into three classes according to the frequency with which they occur in data.
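The earlier notion of an exception mentioned above (an extra condition added to the premise of a common sense rule changes its conclusion) can be illustrated with a minimal sketch. This is not the paper's unified algorithm, and it does not cover the eleven categories. The toy records, the attribute names, and the helper `confidence` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]


def confidence(records: List[Record],
               premise: Dict[str, str],
               conclusion: Tuple[str, str]) -> float:
    """Estimate Pr(conclusion | premise) by counting matching records."""
    # Keep only the records that satisfy every condition in the premise.
    covered = [r for r in records
               if all(r.get(attr) == val for attr, val in premise.items())]
    if not covered:
        return 0.0
    attr, val = conclusion
    hits = sum(1 for r in covered if r.get(attr) == val)
    return hits / len(covered)


# Hypothetical toy data: 8 ordinary birds that fly, 2 penguins that do not.
records: List[Record] = (
    [{"bird": "yes", "penguin": "no", "flies": "yes"}] * 8
    + [{"bird": "yes", "penguin": "yes", "flies": "no"}] * 2
)

# Common sense rule: bird=yes -> flies=yes
common_sense = confidence(records, {"bird": "yes"}, ("flies", "yes"))

# Exception rule: the extra condition penguin=yes changes the conclusion.
exception = confidence(records,
                       {"bird": "yes", "penguin": "yes"},
                       ("flies", "no"))

print(f"common sense rule confidence: {common_sense:.2f}")
print(f"exception rule confidence:    {exception:.2f}")
```

In this toy setting both rules hold with high confidence even though their conclusions disagree, which is the kind of rule pair the abstract describes. Any thresholds for deciding that a confidence is high enough would be a further assumption.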
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Suzuki, E., Żytkow, J.M. (2000). Unified Algorithm for Undirected Discovery of Exception Rules. In: Zighed, D.A., Komorowski, J., Żytkow, J. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 2000. Lecture Notes in Computer Science, vol 1910. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45372-5_17
DOI: https://doi.org/10.1007/3-540-45372-5_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41066-9
Online ISBN: 978-3-540-45372-7
eBook Packages: Springer Book Archive