Abstract
An important open issue in KDD research is revealing and handling uncertainty. Popular classification approaches do not take uncertainty into account, nor do they properly exploit the significant amount of information contained in the result of the classification process (i.e., the classification scheme), although this information would be useful in decision-making. In this paper we present a framework that maintains uncertainty throughout the classification process by retaining the classification belief, and that moreover enables an item to be assigned to multiple classes, each with a different belief. Decision support tools are provided for decisions related to: i. the relative importance of classes in a data set (e.g., “young vs. old customers”); ii. the relative importance of classes across data sets; iii. the information content of different data sets. Finally, we provide a mechanism for evaluating classification schemes and selecting the scheme that best fits the data under consideration.
In this paper we use the terms “classes” and “clusters” interchangeably.
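The idea of assigning an item to several classes with different beliefs can be illustrated with a small sketch. This is not the authors' framework. The triangular membership functions, the class boundaries in `AGE_CLASSES`, the summed-belief weighting in `class_weights`, and the use of Shannon entropy as a stand-in for "information content" are all assumptions made only for illustration.

```python
from math import log2

def triangular(x, left, peak, right):
    """Degree in [0, 1] to which x belongs to a triangular fuzzy set."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

# Hypothetical class definitions over a customer's age (left, peak, right).
AGE_CLASSES = {
    "young": (0, 20, 45),
    "middle": (30, 45, 60),
    "old": (45, 70, 120),
}

def classify(age):
    """Return every class with a non-zero belief, not only the best one."""
    beliefs = {name: triangular(age, *p) for name, p in AGE_CLASSES.items()}
    return {name: b for name, b in beliefs.items() if b > 0.0}

def class_weights(ages):
    """Sum the beliefs per class over a data set (assumed importance proxy)."""
    totals = {name: 0.0 for name in AGE_CLASSES}
    for age in ages:
        for name, b in classify(age).items():
            totals[name] += b
    return totals

def entropy(totals):
    """Shannon entropy in bits of the normalised class weights."""
    s = sum(totals.values())
    if s == 0:
        return 0.0
    probs = [v / s for v in totals.values() if v > 0]
    return -sum(p * log2(p) for p in probs)
```

Under these assumed boundaries, `classify(40)` keeps both the "young" and the "middle" class with different beliefs instead of forcing a single label, and `entropy(class_weights(ages))` gives one number per data set that can be compared across data sets.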
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Halkidi, M., Vazirgiannis, M. (2002). Managing Uncertainty and Quality in the Classification Process. In: Vlahavas, I.P., Spyropoulos, C.D. (eds) Methods and Applications of Artificial Intelligence. SETN 2002. Lecture Notes in Computer Science, vol 2308. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46014-4_25
DOI: https://doi.org/10.1007/3-540-46014-4_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43472-6
Online ISBN: 978-3-540-46014-5
eBook Packages: Springer Book Archive