Abstract
In this paper, we focus our discussion on the rough set approach for categorical data clustering. We propose MADE (Maximal Attributes Dependency), an alternative technique for categorical data clustering using rough set theory that takes into account maximal attribute dependencies. Experimental results on two benchmark UCI datasets show that MADE outperforms the baseline categorical data clustering techniques with respect to computational complexity and cluster purity.
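The abstract does not spell out how attribute dependency is computed, so the following is only a minimal sketch of the standard rough set dependency degree (the fraction of objects in the positive region), not the authors' MADE implementation. The function names, the toy table `rows`, and the rule of scoring each attribute by the largest dependency it induces in any other attribute are all assumptions made for illustration.

```python
from collections import defaultdict

def partition(rows, attr):
    """Group row indices into equivalence classes by their value on `attr`."""
    classes = defaultdict(set)
    for idx, row in enumerate(rows):
        classes[row[attr]].add(idx)
    return list(classes.values())

def dependency_degree(rows, target, given):
    """Degree to which `target` depends on `given`.

    Counts the objects whose `given`-equivalence class lies entirely
    inside a single `target`-equivalence class (the positive region),
    then divides by the total number of objects.
    """
    target_classes = partition(rows, target)
    given_classes = partition(rows, given)
    positive = sum(
        len(g) for g in given_classes
        if any(g <= t for t in target_classes)
    )
    return positive / len(rows)

def max_dependency_per_attribute(rows):
    """Hypothetical scoring rule: for each attribute, the largest
    dependency degree it induces in any other attribute."""
    attrs = list(rows[0].keys())
    return {
        a: max(dependency_degree(rows, b, a) for b in attrs if b != a)
        for a in attrs
    }

# Toy categorical table (invented for illustration, not a UCI dataset).
rows = [
    {"color": "red", "shape": "round", "size": "small"},
    {"color": "red", "shape": "round", "size": "large"},
    {"color": "blue", "shape": "square", "size": "small"},
    {"color": "blue", "shape": "square", "size": "large"},
]

scores = max_dependency_per_attribute(rows)
```

In this toy table, `color` and `shape` determine each other completely, while `size` is determined by neither, so a selection rule based on maximal dependency would favour `color` or `shape` as the partitioning attribute. How ties are broken and how the clustering proceeds after selection are left open here, since the abstract does not state them.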
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Herawan, T., Yanto, I.T.R., Mat Deris, M. (2009). Rough Set Approach for Categorical Data Clustering. In: Ślęzak, D., Kim, Th., Zhang, Y., Ma, J., Chung, Ki. (eds) Database Theory and Application. DTA 2009. Communications in Computer and Information Science, vol 64. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10583-8_21
DOI: https://doi.org/10.1007/978-3-642-10583-8_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10582-1
Online ISBN: 978-3-642-10583-8
eBook Packages: Computer Science (R0)