Abstract
The main concept of rough set theory is the clustering of similar objects based on the notion of an indiscernibility relation. In this paper, we develop the concept of the indiscernibility level in rough set theory as an additional measurement for hierarchical clustering. The combination of indiscernibility (a quantitative indiscernibility relation) and the indiscernibility level is used as a new method for hierarchical clustering. The indiscernibility level quantifies the indiscernibility of a pair of objects relative to the other objects in an information system. For comparison, four clustering methods were selected and evaluated on a simulated data set: average-, complete-, and single-linkage agglomerative hierarchical clustering, and Ward's method. The simulation shows that these hierarchical clustering methods exhibit dendrogram instability, that is, they give different solutions under permutations of the input order of the data objects. The results show that the new method plays an important role in clustering information systems and that, compared with the other methods, clustering based on indiscernibility and its indiscernibility level reduces dendrogram instability.
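As an illustration of the indiscernibility relation on which the method builds, the following minimal Python sketch partitions the objects of a small information system into equivalence classes, following the standard rough set definition in which two objects are indiscernible with respect to an attribute set when they agree on every attribute in it. The toy table, the function names, and the pairwise `agreement_degree` (the fraction of attributes on which two objects agree) are assumptions introduced here for illustration only; in particular, `agreement_degree` is a simple stand-in for a quantitative view of indiscernibility and is not the indiscernibility level defined in the paper.

```python
from collections import defaultdict
from itertools import combinations


def indiscernibility_classes(table, attributes):
    """Partition objects into equivalence classes of IND(attributes).

    Two objects fall in the same class when they have equal values
    on every attribute in `attributes`.
    """
    classes = defaultdict(list)
    for name, row in table.items():
        key = tuple(row[a] for a in attributes)
        classes[key].append(name)
    return sorted(sorted(members) for members in classes.values())


def agreement_degree(table, attributes):
    """Hypothetical helper: fraction of attributes on which each pair agrees."""
    degree = {}
    for x, y in combinations(sorted(table), 2):
        same = sum(table[x][a] == table[y][a] for a in attributes)
        degree[(x, y)] = same / len(attributes)
    return degree


# Toy binary information system (illustrative data, not from the paper).
TABLE = {
    "x1": {"a": 1, "b": 0, "c": 1},
    "x2": {"a": 1, "b": 0, "c": 1},
    "x3": {"a": 1, "b": 1, "c": 0},
    "x4": {"a": 0, "b": 1, "c": 0},
}

if __name__ == "__main__":
    print(indiscernibility_classes(TABLE, ["a", "b", "c"]))
    print(indiscernibility_classes(TABLE, ["b", "c"]))
    print(agreement_degree(TABLE, ["a", "b", "c"]))
```

In this toy table, dropping attribute `a` merges `x3` and `x4` into one class, which shows how the choice of attribute set controls the granularity of the partition.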
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hakim, R.B.F., Subanar, Winarko, E. (2010). The Concept of Indiscernibility Level of Rough Set to Reduce the Dendrogram Instability. In: Zhang, Y., Cuzzocrea, A., Ma, J., Chung, Ki., Arslan, T., Song, X. (eds) Database Theory and Application, Bio-Science and Bio-Technology. BSBT DTA 2010 2010. Communications in Computer and Information Science, vol 118. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17622-7_21
DOI: https://doi.org/10.1007/978-3-642-17622-7_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17621-0
Online ISBN: 978-3-642-17622-7
eBook Packages: Computer Science, Computer Science (R0)