An Extension of Self-organizing Maps to Categorical Data

  • Conference paper
Progress in Artificial Intelligence (EPIA 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3808)

Abstract

Self-organizing maps (SOM) have been recognized as a powerful tool in data exploration, especially for clustering high-dimensional data. However, clustering categorical data is still a challenge for SOM. This paper aims to extend the standard SOM to handle categorical feature values. A batch SOM algorithm (NCSOM) is presented, covering the dissimilarity measure and the update method of map evolution for numeric and categorical features simultaneously.
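To make the idea of a joint dissimilarity and update for mixed features concrete, here is a minimal Python sketch. It is not the NCSOM algorithm, whose details are not given in the abstract. It assumes a common textbook-style choice: squared Euclidean distance on numeric features plus a weighted count of categorical mismatches, with a batch step that takes the mean of numeric values and the mode of categorical values. The names `mixed_dissimilarity`, `best_matching_unit`, `batch_update`, and the weight `gamma` are hypothetical, and the neighborhood function of a real SOM is omitted for brevity.

```python
from collections import Counter


def mixed_dissimilarity(x_num, x_cat, w_num, w_cat, gamma=1.0):
    """Squared Euclidean distance on numeric features plus
    gamma times the number of categorical mismatches (assumed measure)."""
    numeric = sum((a - b) ** 2 for a, b in zip(x_num, w_num))
    mismatches = sum(a != b for a, b in zip(x_cat, w_cat))
    return numeric + gamma * mismatches


def best_matching_unit(x_num, x_cat, units, gamma=1.0):
    """Index of the map unit closest to the sample.
    Each unit is a (numeric_weights, categorical_weights) pair."""
    return min(
        range(len(units)),
        key=lambda i: mixed_dissimilarity(x_num, x_cat, units[i][0], units[i][1], gamma),
    )


def batch_update(samples, units, gamma=1.0):
    """One simplified batch step with no neighborhood weighting:
    numeric weights become the mean, categorical weights the mode,
    of the samples assigned to each unit."""
    assigned = [[] for _ in units]
    for x_num, x_cat in samples:
        assigned[best_matching_unit(x_num, x_cat, units, gamma)].append((x_num, x_cat))

    new_units = []
    for (w_num, w_cat), members in zip(units, assigned):
        if not members:
            # Units that won no samples keep their previous weights.
            new_units.append((w_num, w_cat))
            continue
        n = len(members)
        mean = [sum(m[0][j] for m in members) / n for j in range(len(w_num))]
        mode = [Counter(m[1][j] for m in members).most_common(1)[0][0]
                for j in range(len(w_cat))]
        new_units.append((mean, mode))
    return new_units


samples = [([0.0], ["a"]), ([0.2], ["a"]), ([1.0], ["b"])]
units = [([0.0], ["a"]), ([1.0], ["b"])]
updated = batch_update(samples, units)
```

In this toy run the first two samples are assigned to the first unit and the third to the second. A full batch SOM would additionally weight each sample's contribution by a neighborhood function over the map grid.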

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chen, N., Marques, N.C. (2005). An Extension of Self-organizing Maps to Categorical Data. In: Bento, C., Cardoso, A., Dias, G. (eds) Progress in Artificial Intelligence. EPIA 2005. Lecture Notes in Computer Science, vol 3808. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11595014_31

  • DOI: https://doi.org/10.1007/11595014_31

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-30737-2

  • Online ISBN: 978-3-540-31646-6

  • eBook Packages: Computer Science, Computer Science (R0)
