On Fuzzy Cluster Validity Indexes for High Dimensional Feature Space

  • Conference paper
  • In: Advances in Fuzzy Logic and Technology 2017 (EUSFLAT 2017, IWIFSGN 2017)

Abstract

Fuzzy document clustering aims to organize related documents into clusters automatically and flexibly. In this context, the topics addressed by the documents in each cluster are identified by automatically discovering cluster descriptors, which are relevant terms present in those documents. Since documents are represented in a high-dimensional feature space, extracting good descriptors is a difficult problem. The problem is harder still under fuzzy clustering, since the same descriptor can be representative of more than one cluster. Moreover, it is well known that the Fuzzy C-Means clustering algorithm is also affected by the dimensionality of the documents, and choosing the correct partition of a given document collection into clusters remains a challenging problem. To address this drawback, we have investigated how the most common fuzzy clustering validity indexes behave when validating the organization of data in a high-dimensional feature space, since these indexes are commonly used to evaluate fuzzy clusters obtained from low-dimensional data sets.
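
As an illustration of what a fuzzy cluster validity index computes, the sketch below implements two classic ones in plain Python: the partition coefficient and the Xie-Beni index. The abstract does not say which indexes were studied, so this choice is an assumption, and the function names, argument layout, and toy data are hypothetical. The partition coefficient uses only the membership degrees, while the Xie-Beni index also depends on distances in the feature space, which is where dimensionality enters.

```python
from itertools import combinations


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def partition_coefficient(memberships):
    """Partition coefficient: mean of the squared membership degrees.

    memberships[k][i] is the degree to which item k belongs to cluster i;
    each row is assumed to sum to 1. The value lies in [1/c, 1] for c
    clusters, and higher values indicate a crisper partition.
    """
    n = len(memberships)
    return sum(u ** 2 for row in memberships for u in row) / n


def xie_beni(data, centroids, memberships, m=2.0):
    """Xie-Beni index: fuzzy compactness divided by n times the minimum
    squared distance between centroids. Lower values indicate compact,
    well-separated clusters. m is the fuzzifier (2 in the usual form).
    """
    n = len(data)
    compactness = sum(
        (memberships[k][i] ** m) * squared_distance(data[k], centroids[i])
        for k in range(n)
        for i in range(len(centroids))
    )
    separation = min(
        squared_distance(a, b) for a, b in combinations(centroids, 2)
    )
    return compactness / (n * separation)


# Toy one-dimensional example (made-up values, not from the paper).
toy_data = [[0.0], [1.0], [4.0], [5.0]]
toy_centroids = [[0.5], [4.5]]
toy_memberships = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]

print(partition_coefficient(toy_memberships))              # 1.0
print(xie_beni(toy_data, toy_centroids, toy_memberships))  # 0.015625
```

In practice such an index is evaluated for several candidate numbers of clusters, and the partition with the best value is retained.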

Notes

  1. http://sites.labic.icmc.usp.br/pretext2/

Author information

Corresponding author

Correspondence to Tatiane Nogueira.

Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

Eustáquio, F., Camargo, H., Rezende, S., Nogueira, T. (2018). On Fuzzy Cluster Validity Indexes for High Dimensional Feature Space. In: Kacprzyk, J., Szmidt, E., Zadrożny, S., Atanassov, K., Krawczak, M. (eds) Advances in Fuzzy Logic and Technology 2017. EUSFLAT 2017, IWIFSGN 2017. Advances in Intelligent Systems and Computing, vol 642. Springer, Cham. https://doi.org/10.1007/978-3-319-66824-6_2

  • DOI: https://doi.org/10.1007/978-3-319-66824-6_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-66823-9

  • Online ISBN: 978-3-319-66824-6

  • eBook Packages: Engineering, Engineering (R0)
