Document Clustering Using the 1 + 1 Dimensional Self-Organising Map

  • Ben Russell
  • Hujun Yin
  • Nigel M. Allinson
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2412)


Automatic clustering of documents has become increasingly important with the explosion of online information. The Self-Organising Map (SOM) has been used to cluster documents effectively, but efforts to date have used either a single 2-dimensional map or a series of them. Ideally, the output of a document-clustering algorithm should be easy for a user to interpret. This paper describes a method of clustering documents using a series of 1-dimensional SOMs arranged hierarchically to provide an intuitive tree structure representing document clusters. WordNet is used to find the base forms of words, and clustering is restricted to words that can be nouns.
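
The idea of hierarchically arranged 1-dimensional SOMs can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the learning-rate and neighbourhood schedules, and the stopping parameters (`min_size`, `max_depth`) are all assumptions chosen for illustration, and the WordNet noun filtering step is omitted. Each document is assumed to be already represented as a fixed-length numeric vector.

```python
import math
import random


def winner(weights, x):
    """Index of the neuron whose weight vector is closest to x (squared Euclidean)."""
    dists = [sum((w - xi) ** 2 for w, xi in zip(ws, x)) for ws in weights]
    return dists.index(min(dists))


def train_som_1d(vectors, n_neurons=3, epochs=50, seed=0):
    """Train a 1-D SOM (a chain of neurons) on a list of equal-length vectors."""
    rng = random.Random(seed)
    # Initialise each neuron from a randomly chosen input vector.
    weights = [list(rng.choice(vectors)) for _ in range(n_neurons)]
    total_steps = epochs * len(vectors)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(vectors, len(vectors)):
            frac = step / total_steps
            lr = 0.5 * (1.0 - frac)  # assumed linearly decaying learning rate
            radius = max(n_neurons / 2.0 * (1.0 - frac), 0.5)  # shrinking neighbourhood
            win = winner(weights, x)
            for j, ws in enumerate(weights):
                # Gaussian neighbourhood measured along the 1-D chain.
                h = math.exp(-((j - win) ** 2) / (2.0 * radius ** 2))
                for k in range(len(ws)):
                    ws[k] += lr * h * (x[k] - ws[k])
            step += 1
    return weights


def build_tree(vectors, doc_ids, n_neurons=3, min_size=3, max_depth=3, depth=0):
    """Recursively split documents with 1-D SOMs; returns a nested dict."""
    node = {"docs": list(doc_ids), "children": []}
    if depth >= max_depth or len(vectors) < min_size:
        return node
    weights = train_som_1d(vectors, n_neurons)
    groups = {}
    for vec, doc in zip(vectors, doc_ids):
        groups.setdefault(winner(weights, vec), []).append((vec, doc))
    if len(groups) < 2:  # the map did not separate anything, so stop here
        return node
    for idx in sorted(groups):  # keep children in chain order
        vecs, docs = zip(*groups[idx])
        node["children"].append(
            build_tree(list(vecs), list(docs), n_neurons, min_size, max_depth, depth + 1)
        )
    return node


if __name__ == "__main__":
    # Toy 2-D "document vectors"; real ones would be term-weight vectors.
    ids = ["a1", "a2", "a3", "b1", "b2", "b3"]
    vecs = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.1], [0.0, 1.0], [0.1, 0.9], [0.0, 0.9]]
    print(build_tree(vecs, ids, n_neurons=2))
```

Each node of the returned tree holds the documents mapped to one neuron, and its children come from a further 1-dimensional map trained only on those documents, which is one way to obtain the tree-shaped output the abstract describes.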


Keywords (added by machine, not by the authors): Quantisation Error, Vector Space Model, Document Cluster, Document Vector, Winning Neuron





Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Ben Russell (1)
  • Hujun Yin (1)
  • Nigel M. Allinson (1)
  1. Department of Electrical Engineering and Electronics, University of Manchester Institute of Science and Technology (UMIST), Manchester, UK
