Self-Organising Maps for Hierarchical Tree View Document Clustering Using Contextual Information

  • Richard Freeman
  • Hujun Yin
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2412)

Abstract

In this paper we propose an effective method to cluster documents into a dynamically built taxonomy of topics, directly extracted from the documents. We take into account short contextual information within the text corpus, which is weighted by importance and used as input to a set of independently spun growing Self-Organising Maps (SOM). This work shows an increase in precision and labelling quality in the hierarchy of topics, using these indexing units. The use of the tree structure over sets of conventional two-dimensional maps creates topic hierarchies that are easy to browse and understand, in which the documents are stored based on their content similarity.
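As a rough illustration of the kind of map the abstract refers to, the sketch below trains a small one-dimensional SOM on weighted term vectors and assigns each document to its best-matching node. It is a minimal sketch under stated assumptions, not the authors' method: the map has a fixed size rather than growing, there is no hierarchy, and the vocabulary, weights, and function names (`train_som`, `best_matching_unit`, `DOCS`) are hypothetical.

```python
import math
import random

# Hypothetical toy corpus: each row is one document, each column the weight
# of one indexing unit (e.g. a term or short phrase). Values are invented.
DOCS = [
    [0.9, 0.8, 0.0, 0.0],
    [0.8, 0.9, 0.1, 0.0],
    [0.0, 0.1, 0.9, 0.8],
    [0.0, 0.0, 0.8, 0.9],
]


def best_matching_unit(weights, x):
    """Return the index of the node whose weight vector is closest to x."""
    return min(
        range(len(weights)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(weights[j], x)),
    )


def train_som(vectors, n_nodes=4, epochs=50, lr0=0.5, seed=0):
    """Train a fixed-size 1-D SOM (a chain of nodes) on equal-length vectors."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    # Start from random node weights in [0, 1).
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_nodes)]
    radius0 = max(n_nodes / 2.0, 1.0)
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                    # learning rate decays linearly
        radius = max(radius0 * (1.0 - frac), 0.5)  # neighbourhood shrinks over time
        order = list(range(len(vectors)))
        rng.shuffle(order)
        for i in order:
            x = vectors[i]
            bmu = best_matching_unit(weights, x)
            for j, w in enumerate(weights):
                # Gaussian neighbourhood over positions along the chain.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * radius ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
    return weights


if __name__ == "__main__":
    som = train_som(DOCS)
    for doc_id, vec in enumerate(DOCS):
        print(doc_id, "-> node", best_matching_unit(som, vec))
```

In a hierarchical setting, the documents assigned to one node could in turn be used to train a further map one level down; that step is omitted here.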

Keywords

Feature Selection; Random Projection; Document Cluster; Text Corpus; Weighting Bias
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Richard Freeman (1)
  • Hujun Yin (1)
  1. Department of Electrical Engineering and Electronics, University of Manchester Institute of Science and Technology (UMIST), Manchester, UK
