An Efficient Kernelized Fuzzy Possibilistic C-Means for High-Dimensional Data Clustering

Shanmugapriya, B.; Punithavalli, M.

doi:10.1007/978-81-322-2196-8_26

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 332)

Abstract

Clustering high-dimensional data has been a major concern owing to the intrinsic sparsity of the data points. Several recent results indicate that, for high-dimensional data, even the notion of proximity or clustering may not be meaningful. Fuzzy c-means (FCM) and possibilistic c-means (PCM) can handle high-dimensional data, but FCM is sensitive to noise, and PCM requires appropriate initialization to converge to a near-global minimum. To overcome these issues, a fuzzy possibilistic c-means (FPCM) with a symmetry-based distance measure has been proposed, which can determine the number of clusters present in a dataset. In addition, an efficient kernelized fuzzy possibilistic c-means (KFPCM) algorithm is proposed, which replaces the distance measure with a kernel-induced one. FPCM combines the advantages of both FCM and PCM; moreover, the kernel-induced distance measure helps in obtaining better clustering results for high-dimensional data. The proposed KFPCM is evaluated on the Iris, Wine, Lymphography, Lung Cancer, and Diabetes datasets in terms of clustering accuracy, number of iterations, and execution time. The results demonstrate the effectiveness of the proposed KFPCM.
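
To illustrate what a kernel-induced distance measure can look like, here is a minimal Python sketch. It assumes a Gaussian (RBF) kernel, which the abstract does not specify; the function names, the bandwidth sigma, and the fuzzifier m are hypothetical choices made for illustration. It is not the authors' KFPCM algorithm: it only shows the standard FCM-style membership formula with the Euclidean distance swapped for a kernel-induced one, and it omits the possibilistic (typicality) term and the center updates.

```python
import math


def gaussian_kernel(x, v, sigma=1.0):
    """Gaussian (RBF) kernel K(x, v) = exp(-||x - v||^2 / (2 * sigma^2)).

    Assumption: the choice of kernel and of sigma is illustrative only.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def kernel_distance_sq(x, v, sigma=1.0):
    """Squared distance between x and v in the kernel feature space.

    In general ||phi(x) - phi(v)||^2 = K(x, x) + K(v, v) - 2 K(x, v).
    For the Gaussian kernel K(x, x) = 1, so this reduces to 2 * (1 - K(x, v)).
    """
    return 2.0 * (1.0 - gaussian_kernel(x, v, sigma))


def fuzzy_memberships(x, centers, m=2.0, sigma=1.0):
    """FCM-style memberships of one point x with respect to each center.

    Uses u_i proportional to (1 / d_i^2)^(1 / (m - 1)), normalized so that
    the memberships sum to 1, with d_i^2 the kernel-induced squared distance.
    """
    eps = 1e-12  # guards against division by zero when x equals a center
    weights = [
        (1.0 / (kernel_distance_sq(x, v, sigma) + eps)) ** (1.0 / (m - 1.0))
        for v in centers
    ]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    centers = [[0.0, 0.0], [3.0, 3.0]]
    print(fuzzy_memberships([0.1, 0.0], centers))
```

Because the Gaussian kernel is bounded, the induced squared distance never exceeds 2, so a far-away outlier cannot dominate the objective the way it can under the unbounded Euclidean distance. That property is one common motivation for kernelized variants of c-means.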

Author information

Authors and Affiliations

Sri Ramakrishna College of Arts and Science for Women, Coimbatore, India
B. Shanmugapriya & M. Punithavalli

Corresponding author

Correspondence to B. Shanmugapriya.

Editor information

Editors and Affiliations

Department of Computer Science & Engineering, Oakland University, Rochester, Michigan, USA
Ishwar K. Sethi

About this paper

Cite this paper

Shanmugapriya, B., Punithavalli, M. (2015). An Efficient Kernelized Fuzzy Possibilistic C-Means for High-Dimensional Data Clustering. In: Sethi, I. (eds) Computational Vision and Robotics. Advances in Intelligent Systems and Computing, vol 332. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2196-8_26

DOI: https://doi.org/10.1007/978-81-322-2196-8_26
Published: 04 January 2015
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-2195-1
Online ISBN: 978-81-322-2196-8
eBook Packages: Engineering, Engineering (R0)
