
A Clustering Algorithm for Chinese Text Based on SOM Neural Network and Density

  • Conference paper
Advances in Neural Networks – ISNN 2005 (ISNN 2005)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3497)

Abstract

This paper introduces a clustering algorithm for Chinese text that combines a SOM (Self-Organizing Map) neural network with density-based clustering. The algorithm has two stages. In the first stage, Chinese texts are transformed into text vectors, which serve as training data for the SOM; mapping the vectors through the trained SOM yields an initial clustering result for the text data, i.e., a set of virtual coordinates. In the second stage, this set of virtual coordinates is clustered further according to density. The first stage of the proposed algorithm differs from existing approaches. Moreover, because the second stage works on data of reduced dimension, it requires less computing time than other algorithms. Numerical experiments show that the algorithm is efficient for clustering text data and high-dimensional data.
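
The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the SOM variant or the density criterion, so a plain rectangular SOM and a DBSCAN-style procedure are swapped in here as assumptions. The function names (train_som, map_to_grid, density_cluster), all parameter values, and the random toy vectors standing in for real text vectors are hypothetical.

```python
import numpy as np

def train_som(data, rows, cols, epochs=20, lr0=0.5, sigma0=None, seed=0):
    """Train a plain rectangular SOM (assumed variant); returns weights (rows, cols, dim)."""
    rng = np.random.default_rng(seed)
    weights = rng.random((rows, cols, data.shape[1]))
    # Grid position of every unit, shape (rows, cols, 2).
    grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1)
    sigma0 = sigma0 or max(rows, cols) / 2.0
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / total
            lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood radius
            dist = np.linalg.norm(weights - x, axis=2)
            bmu = np.unravel_index(np.argmin(dist), dist.shape)  # best matching unit
            g = np.exp(-np.sum((grid - np.array(bmu)) ** 2, axis=2) / (2 * sigma ** 2))
            weights += lr * g[..., None] * (x - weights)
            step += 1
    return weights

def map_to_grid(data, weights):
    """Stage 1 output: the 2-D grid position of each vector's best matching unit."""
    flat = weights.reshape(-1, weights.shape[2])
    dist = np.linalg.norm(data[:, None, :] - flat[None, :, :], axis=2)
    idx = np.argmin(dist, axis=1)
    return np.column_stack(np.unravel_index(idx, weights.shape[:2])).astype(float)

def density_cluster(points, eps, min_pts):
    """Stage 2 (assumed DBSCAN-style): returns one label per point, -1 means noise."""
    n = len(points)
    dist = np.linalg.norm(points[:, None] - points[None, :], axis=2)
    neighbors = [np.flatnonzero(dist[i] <= eps) for i in range(n)]
    labels = np.full(n, -1)
    visited = np.zeros(n, dtype=bool)
    cluster = 0
    for i in range(n):
        if visited[i]:
            continue
        visited[i] = True
        if len(neighbors[i]) < min_pts:
            continue                      # not a core point; stays noise unless reached later
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster       # claim unlabeled or noise points
            if not visited[j]:
                visited[j] = True
                if len(neighbors[j]) >= min_pts:
                    queue.extend(neighbors[j])  # expand only through core points
        cluster += 1
    return labels

# Toy stand-in for text vectors: two groups of 8-dimensional points.
rng = np.random.default_rng(1)
vectors = np.vstack([rng.normal(0.2, 0.05, (20, 8)), rng.normal(0.8, 0.05, (20, 8))])
som = train_som(vectors, rows=5, cols=5)
coords = map_to_grid(vectors, som)                     # the "virtual coordinates"
labels = density_cluster(coords, eps=1.5, min_pts=3)   # clustering in 2-D only
print(coords[:3], labels[:10])
```

In this sketch the pairwise-distance step of the density stage operates on 2-D grid coordinates rather than on the original vectors, which is one way to read the abstract's claim that the second stage saves computing time through reduced dimension.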

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Meng, Z., Zhu, H., Zhu, Y., Zhou, G. (2005). A Clustering Algorithm for Chinese Text Based on SOM Neural Network and Density. In: Wang, J., Liao, XF., Yi, Z. (eds) Advances in Neural Networks – ISNN 2005. ISNN 2005. Lecture Notes in Computer Science, vol 3497. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11427445_40

  • DOI: https://doi.org/10.1007/11427445_40

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-25913-8

  • Online ISBN: 978-3-540-32067-8

  • eBook Packages: Computer Science, Computer Science (R0)