An Optimized Algorithm for Text Clustering Based on F-Space

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 443)

Abstract

In this paper, we first treat text filtering as a preprocessing step over sequences of syllabic words, using the theory of factor space (F-space) to determine the basic factor set and then perform factor decomposition. Second, we construct factor matching vectors for texts and define their similarity degree, and use the factor similarity between texts to characterize concepts more deeply and thereby carry out text clustering. Finally, we randomly select 90 articles from sogou.com as the experimental corpus and verify the proposed algorithm with two clustering algorithms: the plane division method and hierarchical agglomerative clustering. The results show that (1) the clustering accuracies reach 91 % and 94 %, respectively; (2) the classification produced by the proposed algorithm differs little from manual annotation; and (3) hierarchical agglomerative clustering performs better than the plane division method. This paper provides a new, feasible method for text mining.
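The abstract does not give the formulas, so the following is only a minimal sketch of the general pipeline it describes (factor matching vectors, a similarity degree, then hierarchical agglomerative clustering), not the authors' algorithm. The factor set `FACTORS`, the count-based matching vector, the cosine similarity, and the average-linkage rule are all assumptions chosen for illustration.

```python
from math import sqrt

# Hypothetical basic factor set: factor name -> indicator words.
# These factors and words are illustrative, not taken from the paper.
FACTORS = {
    "sports": {"match", "team", "goal"},
    "finance": {"stock", "market", "bank"},
    "health": {"doctor", "diet", "sleep"},
}


def factor_vector(words):
    """Count how many words of a text match each factor (assumed matching rule)."""
    return [sum(w in vocab for w in words) for vocab in FACTORS.values()]


def similarity(u, v):
    """Cosine similarity of two factor matching vectors (0.0 if either is all-zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def agglomerate(vectors, k):
    """Average-linkage agglomerative clustering: merge until k clusters remain."""
    clusters = [[i] for i in range(len(vectors))]

    def link(a, b):
        # Mean pairwise similarity between the members of two clusters.
        total = sum(similarity(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > k:
        # Find the most similar pair of clusters (x < y) and merge y into x.
        pairs = ((p, q) for p in range(len(clusters))
                 for q in range(p + 1, len(clusters)))
        x, y = max(pairs, key=lambda pq: link(clusters[pq[0]], clusters[pq[1]]))
        clusters[x] += clusters.pop(y)
    return clusters


if __name__ == "__main__":
    docs = [
        ["team", "goal", "match"],
        ["team", "match", "bank"],
        ["stock", "market"],
        ["bank", "stock", "market"],
    ]
    vectors = [factor_vector(d) for d in docs]
    print(agglomerate(vectors, k=2))
```

On this toy input the two sports-heavy texts and the two finance-heavy texts end up in separate clusters. A real experiment would need word segmentation and a factor set derived from the corpus, which the sketch leaves out.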

Acknowledgements

This research was supported by the Information and Computing Science Outstanding Talent Training Project of Guangdong Province (No. 20153324).

Author information

Corresponding author

Correspondence to Yu-Bin Zhong.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Zhong, YB., Li, Zj., Zhao, Mh., Yao, Wq. (2016). An Optimized Algorithm for Text Clustering Based on F-Space. In: Cao, BY., Wang, PZ., Liu, ZL., Zhong, YB. (eds) International Conference on Oriental Thinking and Fuzzy Logic. Advances in Intelligent Systems and Computing, vol 443. Springer, Cham. https://doi.org/10.1007/978-3-319-30874-6_44

  • DOI: https://doi.org/10.1007/978-3-319-30874-6_44

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-30873-9

  • Online ISBN: 978-3-319-30874-6

  • eBook Packages: Engineering, Engineering (R0)
