Median Topographic Maps for Biomedical Data Sets

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5400)

Abstract

Median clustering extends popular neural data analysis methods such as the self-organizing map and neural gas to general data structures described only by a dissimilarity matrix. This yields flexible and robust methods for global data inspection that are particularly suited to the varied data that occur in biomedical domains. In this chapter, we give an overview of median clustering, its properties, and its extensions, with a particular focus on efficient implementations adapted to large-scale data analysis.
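
To make the idea of clustering from a dissimilarity matrix alone more concrete, the following is a minimal sketch in the style of a batch median neural gas update. It is an illustration under stated assumptions, not the chapter's implementation: the function name `median_neural_gas`, its parameters, and the exponential annealing schedule are hypothetical choices. Prototypes are restricted to data points, so only the pairwise dissimilarities `D[j][l]` are ever needed.

```python
import math


def median_neural_gas(D, init, n_epochs=10, lambda0=None):
    """Sketch of batch median neural gas on a dissimilarity matrix.

    D        -- square list of lists, D[j][l] = dissimilarity of points j and l
    init     -- indices of the data points used as initial prototypes
    n_epochs -- number of batch updates (hypothetical default)
    lambda0  -- initial neighborhood range (assumed default: k / 2)
    Returns the list of data-point indices that serve as prototypes.
    """
    n = len(D)
    k = len(init)
    protos = list(init)
    lam0 = lambda0 if lambda0 is not None else k / 2.0
    for epoch in range(n_epochs):
        # Assumed schedule: shrink the neighborhood range from lam0 to 0.01.
        t = epoch / max(n_epochs - 1, 1)
        lam = lam0 * (0.01 / lam0) ** t
        # weights[i][j] = exp(-rank / lam), where rank is the position of
        # prototype i when all prototypes are sorted by closeness to point j.
        weights = [[0.0] * n for _ in range(k)]
        for j in range(n):
            order = sorted(range(k), key=lambda i: D[j][protos[i]])
            for rank, i in enumerate(order):
                weights[i][j] = math.exp(-rank / lam)
        # Median step: each prototype moves to the data point that minimizes
        # the rank-weighted sum of dissimilarities to all data points.
        protos = [
            min(range(n), key=lambda l: sum(weights[i][j] * D[j][l] for j in range(n)))
            for i in range(k)
        ]
    return protos


# Usage: six points on a line, described only by pairwise distances.
xs = [0, 1, 2, 10, 11, 12]
D = [[abs(a - b) for b in xs] for a in xs]
print(median_neural_gas(D, init=[0, 3]))
```

Each epoch of this sketch costs on the order of k·n² operations because every data point is tried as a candidate for every prototype. That quadratic cost is why efficient implementations matter for large data sets.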

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Hammer, B., Hasenfuss, A., Rossi, F. (2009). Median Topographic Maps for Biomedical Data Sets. In: Biehl, M., Hammer, B., Verleysen, M., Villmann, T. (eds) Similarity-Based Clustering. Lecture Notes in Computer Science, vol 5400. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01805-3_6

  • DOI: https://doi.org/10.1007/978-3-642-01805-3_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-01804-6

  • Online ISBN: 978-3-642-01805-3

  • eBook Packages: Computer Science, Computer Science (R0)
