An Efficient Kernelized Fuzzy Possibilistic C-Means for High-Dimensional Data Clustering

  • Conference paper
Computational Vision and Robotics

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 332)

Abstract

Clustering high-dimensional data is difficult because the data points are intrinsically sparse. Several recent results indicate that, for high-dimensional data, even the notions of proximity and clustering may not be meaningful. Fuzzy c-means (FCM) and possibilistic c-means (PCM) can handle high-dimensional data, but FCM is sensitive to noise and PCM requires appropriate initialization to converge to a nearly global minimum. To overcome these issues, a fuzzy possibilistic c-means (FPCM) with a symmetry-based distance measure has been proposed, which can determine the number of clusters present in a dataset. In addition, an efficient kernelized fuzzy possibilistic c-means (KFPCM) algorithm is proposed to improve clustering results. The proposed KFPCM uses a kernel-induced distance measure. FPCM combines the advantages of FCM and PCM, and the kernel-induced distance measure helps to obtain better clustering results on high-dimensional data. The proposed KFPCM is evaluated on datasets such as Iris, Wine, Lymphography, Lung Cancer, and Diabetes in terms of clustering accuracy, number of iterations, and execution time. The results demonstrate the effectiveness of the proposed KFPCM.
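
The abstract does not state which kernel or update equations the paper uses, so the following is only an illustrative sketch of how a kernel-induced distance can be combined with fuzzy and possibilistic partitions. It assumes a Gaussian kernel, for which the feature-space distance reduces to 2(1 - K(x, v)), and the commonly cited FPCM constraints (memberships sum to 1 over clusters, typicalities sum to 1 over points). The function names and the parameters `m`, `eta`, `sigma`, and `eps` are hypothetical choices for this example, not taken from the paper.

```python
import math


def gaussian_kernel(x, v, sigma=1.0):
    """Gaussian (RBF) kernel K(x, v); sigma is an assumed bandwidth."""
    sq = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-sq / (2.0 * sigma ** 2))


def kernel_distance_sq(x, v, sigma=1.0):
    """Squared kernel-induced distance ||phi(x) - phi(v)||^2.

    For a Gaussian kernel K(x, x) = K(v, v) = 1, so
    K(x, x) - 2 K(x, v) + K(v, v) = 2 * (1 - K(x, v)).
    """
    return 2.0 * (1.0 - gaussian_kernel(x, v, sigma))


def fpcm_partitions(points, centers, m=2.0, eta=2.0, sigma=1.0, eps=1e-12):
    """One partition update for fixed centers (assumed FPCM-style rules).

    Returns (u, t), both indexed [cluster][point]:
      u: fuzzy memberships, each column (point) sums to 1
      t: typicalities, each row (cluster) sums to 1
    eps guards against division by zero when a point equals a center.
    """
    c, n = len(centers), len(points)
    d2 = [[kernel_distance_sq(x, v, sigma) + eps for x in points]
          for v in centers]
    u = [[1.0 / sum((d2[i][k] / d2[j][k]) ** (1.0 / (m - 1.0))
                    for j in range(c))
          for k in range(n)]
         for i in range(c)]
    t = [[1.0 / sum((d2[i][k] / d2[i][l]) ** (1.0 / (eta - 1.0))
                    for l in range(n))
          for k in range(n)]
         for i in range(c)]
    return u, t


if __name__ == "__main__":
    # Toy data: two well-separated groups and two fixed centers.
    pts = [(0.0, 0.0), (0.1, 0.0), (3.0, 3.0), (3.1, 3.0)]
    ctr = [(0.0, 0.0), (3.0, 3.0)]
    u, t = fpcm_partitions(pts, ctr)
    for i, row in enumerate(u):
        print("cluster", i, "memberships", [round(val, 3) for val in row])
```

A full clustering loop would alternate this partition update with a center update until the partitions stop changing; that step is omitted here because its exact form depends on the kernel and objective actually used in the paper.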

Author information

Corresponding author

Correspondence to B. Shanmugapriya.

Copyright information

© 2015 Springer India

About this paper

Cite this paper

Shanmugapriya, B., Punithavalli, M. (2015). An Efficient Kernelized Fuzzy Possibilistic C-Means for High-Dimensional Data Clustering. In: Sethi, I. (eds) Computational Vision and Robotics. Advances in Intelligent Systems and Computing, vol 332. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2196-8_26

  • DOI: https://doi.org/10.1007/978-81-322-2196-8_26

  • Publisher Name: Springer, New Delhi

  • Print ISBN: 978-81-322-2195-1

  • Online ISBN: 978-81-322-2196-8

  • eBook Packages: Engineering, Engineering (R0)
