Clustering Text Documents Using Kernel Possibilistic C-Means

  • Conference paper
Proceedings of International Conference on Cognition and Recognition

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 14)

Abstract

Text document clustering is one of the classic topics in text mining; it groups text documents in an unsupervised way. Various clustering techniques are available for text documents. Fuzzy C-Means (FCM) is one of the popular fuzzy clustering algorithms. Unfortunately, FCM is highly sensitive to noise. Possibilistic C-Means (PCM) overcomes this drawback by relaxing the probabilistic constraint on the membership function. In this paper, we propose a Kernel Possibilistic C-Means (KPCM) method for text document clustering. Unlike the classical PCM algorithm, the proposed method employs a kernel distance metric to calculate the distance between a cluster center and a text document. We used the standard 20NewsGroups dataset for experimentation and compared the proposed method (KPCM) with Fuzzy C-Means, Kernel Fuzzy C-Means and Possibilistic C-Means. The experimental results reveal that KPCM outperforms the other methods in terms of accuracy.
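
The contrast between the probabilistic constraint of Fuzzy C-Means and the relaxed constraint of Possibilistic C-Means, combined with a kernel-induced distance, can be illustrated with a minimal Python sketch. The abstract does not state which kernel, kernel width, or scale parameter is used, so the Gaussian kernel, `sigma`, `eta`, the fuzzifier `m = 2`, the fixed cluster centers, and the toy 2-D vectors below are all assumptions made for illustration, not the authors' implementation. The sketch computes memberships once for fixed centers and omits the iterative center update.

```python
import numpy as np

def gaussian_kernel_distance_sq(X, V, sigma=1.0):
    """Squared kernel-induced distance ||phi(x) - phi(v)||^2.

    Assumes a Gaussian kernel K(x, v) = exp(-||x - v||^2 / sigma^2).
    Since K(x, x) = 1, the distance reduces to 2 * (1 - K(x, v)).
    """
    # Pairwise squared Euclidean distances, shape (n_docs, n_clusters).
    sq = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
    return 2.0 * (1.0 - np.exp(-sq / sigma**2))

def fcm_memberships(d2, m=2.0):
    """Fuzzy C-Means memberships: every row is forced to sum to 1."""
    d2 = np.maximum(d2, 1e-12)  # guard against division by zero
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def pcm_typicalities(d2, eta, m=2.0):
    """Possibilistic typicalities: no sum-to-one constraint across clusters."""
    return 1.0 / (1.0 + (d2 / eta) ** (1.0 / (m - 1.0)))

# Hypothetical data: two documents sitting on the cluster centers
# and one noisy document far away from both.
X = np.array([[0.0, 0.0], [4.0, 0.0], [2.0, 50.0]])
V = np.array([[0.0, 0.0], [4.0, 0.0]])  # assumed fixed cluster centers

d2 = gaussian_kernel_distance_sq(X, V, sigma=2.0)
u = fcm_memberships(d2)            # probabilistic memberships
t = pcm_typicalities(d2, eta=0.5)  # possibilistic typicalities

print("FCM memberships:\n", u)
print("PCM typicalities:\n", t)
```

In this toy setting the noisy document is forced to split its FCM membership evenly between the two clusters, whereas its possibilistic typicalities are low for both, which is the noise-handling behavior the abstract attributes to relaxing the constraint.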

Author information

Corresponding author

Correspondence to M. B. Revanasiddappa.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Revanasiddappa, M.B., Harish, B.S., Aruna Kumar, S.V. (2018). Clustering Text Documents Using Kernel Possibilistic C-Means. In: Guru, D., Vasudev, T., Chethan, H., Kumar, Y. (eds) Proceedings of International Conference on Cognition and Recognition. Lecture Notes in Networks and Systems, vol 14. Springer, Singapore. https://doi.org/10.1007/978-981-10-5146-3_13

  • DOI: https://doi.org/10.1007/978-981-10-5146-3_13

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-5145-6

  • Online ISBN: 978-981-10-5146-3

  • eBook Packages: Engineering, Engineering (R0)