Abstract
Clustering is one of the most important and basic techniques in data mining; it aims to group a collection of samples into clusters based on similarity. Clustering big datasets has always been a serious challenge because of their high dimensionality and complexity. In this paper, we propose a novel clustering algorithm that introduces intuitionistic fuzzy set theory into the CLARANS framework in order to handle uncertainty when mining big datasets. We also suggest a new scalable approximation for computing the maximum number of neighbors. Our experimental evaluation on real data sets shows that the proposed algorithm obtains satisfactory clustering results and outperforms other current methods. Cluster quality was evaluated with three well-known metrics.
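For orientation, the following is a minimal sketch of the classical (crisp) CLARANS search that the proposed algorithm builds on. It is not the IF-CLARANS method itself: the abstract does not specify the intuitionistic fuzzy extension or the new approximation for the maximum number of neighbors, so neither is attempted here. The function names, the Euclidean distance, and the default `maxneighbor` rule (the heuristic usually attributed to the original CLARANS paper) are assumptions made for illustration.

```python
import math
import random


def total_cost(points, medoid_idx):
    # Sum, over all points, of the distance to the nearest medoid.
    return sum(min(math.dist(p, points[m]) for m in medoid_idx) for p in points)


def clarans(points, k, numlocal=2, maxneighbor=None, seed=0):
    """Baseline CLARANS sketch (requires k < len(points)); returns (medoid indices, cost)."""
    rng = random.Random(seed)
    n = len(points)
    if maxneighbor is None:
        # Assumed default: heuristic commonly cited from the original CLARANS
        # paper, NOT the scalable approximation proposed in this chapter.
        maxneighbor = max(250, int(0.0125 * k * (n - k)))
    best, best_cost = None, math.inf
    for _ in range(numlocal):
        # Start each local search from a random set of k medoids.
        current = rng.sample(range(n), k)
        current_cost = total_cost(points, current)
        tried = 0
        while tried < maxneighbor:
            # A neighbor differs from the current node in exactly one medoid.
            out_pos = rng.randrange(k)
            candidate = rng.choice([i for i in range(n) if i not in current])
            neighbor = current[:out_pos] + [candidate] + current[out_pos + 1:]
            cost = total_cost(points, neighbor)
            if cost < current_cost:
                # Move to the better neighbor and reset the counter.
                current, current_cost, tried = neighbor, cost, 0
            else:
                tried += 1
        if current_cost < best_cost:
            best, best_cost = sorted(current), current_cost
    return best, best_cost


# Toy data (hypothetical): two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
medoids, cost = clarans(points, k=2)
```

On this toy input the search should settle on one medoid per group. Because every candidate swap recomputes the full cost, the sketch only suits small inputs, which is one reason the choice of `maxneighbor` matters on large data.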
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Shili, H., Romdhane, L.B. (2018). IF-CLARANS: Intuitionistic Fuzzy Algorithm for Big Data Clustering. In: Medina, J., et al. Information Processing and Management of Uncertainty in Knowledge-Based Systems. Theory and Foundations. IPMU 2018. Communications in Computer and Information Science, vol 854. Springer, Cham. https://doi.org/10.1007/978-3-319-91476-3_4
DOI: https://doi.org/10.1007/978-3-319-91476-3_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-91475-6
Online ISBN: 978-3-319-91476-3
eBook Packages: Computer Science, Computer Science (R0)