Abstract
Text document clustering is a classic topic in text mining: it groups text documents in an unsupervised way. Various techniques are available for clustering text documents, and Fuzzy C-Means (FCM) is one of the most popular fuzzy clustering algorithms. Unfortunately, FCM is sensitive to noise. Possibilistic C-Means (PCM) overcomes this drawback by relaxing the probabilistic constraint on the membership function, that is, the requirement that each document's memberships across all clusters sum to one. In this paper, we propose a Kernel Possibilistic C-Means (KPCM) method for text document clustering. Unlike the classical Possibilistic C-Means algorithm, the proposed method uses a kernel distance metric to calculate the distance between a cluster center and a text document. We used the standard 20NewsGroups dataset for experimentation and compared the proposed method (KPCM) with Fuzzy C-Means, Kernel Fuzzy C-Means and Possibilistic C-Means. The experimental results show that Kernel Possibilistic C-Means outperforms the other methods in terms of accuracy.
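The two ideas named in the abstract, a kernel-induced distance and a possibilistic membership, can be illustrated with a minimal Python sketch. The abstract does not state which kernel or parameter values the authors used, so everything here is an assumption for illustration: the Gaussian kernel, its width `sigma`, the per-cluster scale `eta`, the fuzzifier `m`, and all function names are hypothetical. The membership formula is the standard possibilistic typicality from the PCM literature, not necessarily the exact update used in the paper.

```python
import numpy as np

def gaussian_kernel(x, v, sigma=1.0):
    """Gaussian kernel K(x, v) = exp(-||x - v||^2 / sigma^2) (assumed kernel choice)."""
    diff = np.asarray(x, dtype=float) - np.asarray(v, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / sigma ** 2))

def kernel_distance_sq(x, v, sigma=1.0):
    """Squared distance in the kernel-induced feature space.

    In general ||phi(x) - phi(v)||^2 = K(x, x) + K(v, v) - 2 K(x, v).
    For a Gaussian kernel K(x, x) = 1, so this reduces to 2 * (1 - K(x, v)).
    """
    return 2.0 * (1.0 - gaussian_kernel(x, v, sigma))

def possibilistic_membership(d2, eta, m=2.0):
    """Standard PCM typicality: 1 / (1 + (d2 / eta)^(1 / (m - 1))).

    eta is a per-cluster scale and m > 1 is the fuzzifier (both assumed values).
    """
    return 1.0 / (1.0 + (d2 / eta) ** (1.0 / (m - 1.0)))

# Toy example: one document vector and two hypothetical cluster centers.
doc = np.array([0.2, 0.8, 0.0])
centers = [np.array([0.1, 0.9, 0.0]), np.array([0.9, 0.0, 0.1])]

memberships = [
    possibilistic_membership(kernel_distance_sq(doc, c, sigma=1.0), eta=1.0)
    for c in centers
]
# Unlike FCM, these values are computed per cluster and are not
# normalised to sum to one across clusters.
```

Because each membership depends only on the distance to its own cluster, a noisy document far from every center can receive a low membership everywhere, which is the property the abstract contrasts with FCM.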
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Revanasiddappa, M.B., Harish, B.S., Aruna Kumar, S.V. (2018). Clustering Text Documents Using Kernel Possibilistic C-Means. In: Guru, D., Vasudev, T., Chethan, H., Kumar, Y. (eds) Proceedings of International Conference on Cognition and Recognition. Lecture Notes in Networks and Systems, vol 14. Springer, Singapore. https://doi.org/10.1007/978-981-10-5146-3_13
DOI: https://doi.org/10.1007/978-981-10-5146-3_13
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-5145-6
Online ISBN: 978-981-10-5146-3
eBook Packages: Engineering, Engineering (R0)