Abstract
Clustering high-dimensional data is difficult because the data points are intrinsically sparse. Several recent results indicate that, for high-dimensional data, even the notions of proximity and clustering may not be meaningful. Fuzzy c-means (FCM) and possibilistic c-means (PCM) can handle high-dimensional data, but FCM is sensitive to noise and PCM requires appropriate initialization to converge to a near-global minimum. To overcome these issues, a fuzzy possibilistic c-means (FPCM) with a symmetry-based distance measure has been proposed, which can determine the number of clusters present in a dataset. This chapter additionally proposes an efficient kernelized fuzzy possibilistic c-means (KFPCM) algorithm that uses a kernel-induced distance measure. FPCM combines the advantages of both FCM and PCM, and the kernel-induced distance measure helps obtain better clustering results on high-dimensional data. The proposed KFPCM is evaluated on the Iris, Wine, Lymphography, Lung Cancer, and Diabetes datasets in terms of clustering accuracy, number of iterations, and execution time. The results demonstrate the effectiveness of the proposed KFPCM.
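The abstract does not state the update formulas, so the following is only a minimal sketch of how a kernel-induced distance can be combined with FPCM-style partitions. It assumes a Gaussian kernel, for which the squared kernel-induced distance is 2(1 - K(x, v)), and the usual FPCM constraints (memberships sum to 1 over clusters for each point, typicalities sum to 1 over points for each cluster). The function names and the parameters `m`, `eta`, `sigma`, and `eps` are illustrative assumptions, not the chapter's notation or implementation.

```python
import math


def kernel_distance_sq(x, v, sigma=1.0):
    """Squared kernel-induced distance 2 * (1 - K(x, v)), Gaussian kernel assumed."""
    sq_euclid = sum((a - b) ** 2 for a, b in zip(x, v))
    return 2.0 * (1.0 - math.exp(-sq_euclid / sigma ** 2))


def fpcm_partitions(points, centers, m=2.0, eta=2.0, sigma=1.0, eps=1e-12):
    """Return (u, t): fuzzy memberships and typicalities, indexed [cluster][point].

    Assumed FPCM-style updates: u is normalized over clusters for each point,
    t is normalized over points for each cluster.
    """
    n, c = len(points), len(centers)
    # d[i][k]: kernel-induced distance between center i and point k,
    # floored at eps to avoid division by zero when a point sits on a center.
    d = [[max(kernel_distance_sq(x, v, sigma), eps) for x in points]
         for v in centers]
    w_u = [[dik ** (-1.0 / (m - 1.0)) for dik in row] for row in d]
    w_t = [[dik ** (-1.0 / (eta - 1.0)) for dik in row] for row in d]
    # Membership: each column (point) sums to 1 across clusters.
    col_sums = [sum(w_u[i][k] for i in range(c)) for k in range(n)]
    u = [[w_u[i][k] / col_sums[k] for k in range(n)] for i in range(c)]
    # Typicality: each row (cluster) sums to 1 across points.
    t = [[w / sum(row) for w in row] for row in w_t]
    return u, t


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
    ctrs = [(0.0, 0.0), (5.0, 5.0)]
    u, t = fpcm_partitions(pts, ctrs)
    print(u[0])  # memberships of every point in the first cluster
```

Because the Gaussian kernel is bounded, the kernel-induced distance never exceeds 2, which limits the influence of far-away points; a full algorithm would alternate these partition updates with a center update until convergence, which is omitted here.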
Copyright information
© 2015 Springer India
About this paper
Cite this paper
Shanmugapriya, B., Punithavalli, M. (2015). An Efficient Kernelized Fuzzy Possibilistic C-Means for High-Dimensional Data Clustering. In: Sethi, I. (eds) Computational Vision and Robotics. Advances in Intelligent Systems and Computing, vol 332. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2196-8_26
DOI: https://doi.org/10.1007/978-81-322-2196-8_26
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-2195-1
Online ISBN: 978-81-322-2196-8
eBook Packages: Engineering, Engineering (R0)