Abstract

Clustering is one of the most important and basic techniques in data mining; it aims to group a collection of samples into clusters based on similarity. Clustering big datasets has always been a serious challenge because of their high dimensionality and complexity. In this paper, we propose a novel clustering algorithm that introduces intuitionistic fuzzy set theory into the CLARANS framework to handle uncertainty when mining big datasets. We also suggest a new scalable approximation for computing the maximum number of neighbors. Our experimental evaluation on real datasets shows that the proposed algorithm obtains satisfactory clustering results and outperforms other current methods. Cluster quality was evaluated with three well-known metrics.

Corresponding author

Correspondence to Hechmi Shili.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Shili, H., Romdhane, L.B. (2018). IF-CLARANS: Intuitionistic Fuzzy Algorithm for Big Data Clustering. In: Medina, J., et al. Information Processing and Management of Uncertainty in Knowledge-Based Systems. Theory and Foundations. IPMU 2018. Communications in Computer and Information Science, vol 854. Springer, Cham. https://doi.org/10.1007/978-3-319-91476-3_4

  • DOI: https://doi.org/10.1007/978-3-319-91476-3_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-91475-6

  • Online ISBN: 978-3-319-91476-3

  • eBook Packages: Computer Science, Computer Science (R0)
