Self-Organizing Map and Other Clustering Methods in Transcriptomics

  • Chapter
Bioinformatics and the Cell

Abstract

The self-organizing map (SOM) is an artificial neural network algorithm that has been used frequently in transcriptomic data analysis, in particular for clustering co-expressed genes as a basis for inferring co-regulated genes. It can be applied to any set of objects as long as a distance function can be defined between objects. SOM is illustrated numerically alongside the simple UPGMA method to contrast the two. A less well-known application of SOM is discovering heterogeneous motifs present in a set of sequences, which makes it more general than the Gibbs sampler in de novo motif discovery. Both applications are illustrated: one takes a (gene × expression) matrix as input, and the other takes a set of sequences as input, where each sequence may contain multiple but heterogeneous protein-binding sites.
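
To make the contrast between SOM and UPGMA concrete, the following is a minimal sketch rather than the chapter's own numerical illustration. The six expression profiles, the gene names, and the function names (`train_som`, `som_assign`, `upgma`, `leaves`) are all invented for this example. It assumes Euclidean distance, a one-dimensional map with two nodes, linearly decaying learning rate and neighbourhood radius, nodes seeded from the data, and a fixed presentation order so that the run is reproducible; practical SOM implementations usually present the profiles in random order.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def train_som(data, n_nodes=2, epochs=100, lr0=0.5, radius0=1.0):
    """Train a 1-D SOM and return the node weight vectors."""
    # Assumption: seed the nodes with profiles spaced evenly through the data.
    step = (len(data) - 1) / max(n_nodes - 1, 1)
    nodes = [list(data[round(j * step)]) for j in range(n_nodes)]
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs          # shrinks linearly toward 0
        lr = lr0 * decay
        radius = max(radius0 * decay, 1e-6)
        for vec in data:                      # fixed order (a simplification)
            # Best-matching unit: the node closest to this profile.
            bmu = min(range(n_nodes), key=lambda j: euclidean(vec, nodes[j]))
            for j in range(n_nodes):
                # Gaussian neighbourhood on the map: nearby nodes move too.
                h = math.exp(-((j - bmu) ** 2) / (2 * radius ** 2))
                nodes[j] = [w + lr * h * (x - w) for w, x in zip(nodes[j], vec)]
    return nodes

def som_assign(data, nodes):
    """Map each profile to the index of its closest node."""
    return [min(range(len(nodes)), key=lambda j: euclidean(v, nodes[j]))
            for v in data]

def upgma(data, names):
    """Average-linkage (UPGMA) clustering; returns a nested-tuple tree."""
    clusters = [(name, [i]) for i, name in enumerate(names)]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Cluster distance = mean of all between-cluster pair distances.
                pairs = [(i, j) for i in clusters[a][1] for j in clusters[b][1]]
                d = sum(euclidean(data[i], data[j]) for i, j in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = ((clusters[a][0], clusters[b][0]),
                  clusters[a][1] + clusters[b][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters[0][0]

def leaves(tree):
    """Flatten a nested-tuple tree into a sorted list of gene names."""
    if isinstance(tree, str):
        return [tree]
    return sorted(leaves(tree[0]) + leaves(tree[1]))

# Toy (gene x expression) matrix: three rising and three falling profiles.
genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
profiles = [
    [1.0, 2.0, 3.0], [1.1, 2.1, 2.9], [0.9, 1.9, 3.1],
    [3.0, 2.0, 1.0], [2.9, 2.1, 1.1], [3.1, 1.9, 0.9],
]

som_nodes = train_som(profiles)
som_labels = som_assign(profiles, som_nodes)
tree = upgma(profiles, genes)

print(dict(zip(genes, som_labels)))   # SOM: a flat gene-to-node assignment
print(tree)                           # UPGMA: a nested hierarchy
```

The sketch highlights the structural difference: SOM yields a flat assignment of genes to map nodes, with neighbouring nodes holding similar profiles, whereas UPGMA yields a full hierarchy that has to be cut at some level to obtain clusters. Both depend on the data only through the distance function, which is why any object type with a defined distance can be clustered this way.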

Postscript

A researcher compiled demographic, political, economic, and educational data from many countries around the world and used clustering algorithms and a self-organizing map to analyze them. Almost all affluent western countries were mapped to a few closely spaced nodes, whereas the poor countries were scattered across the map.

“All happy families are alike; each unhappy family is unhappy in its own way.” The researcher concluded his presentation with a quote from Leo Tolstoy, highlighting the sharing of democracy among the affluent western countries.

I was impressed, but then the ensuing discussion became disturbing, at least to me, when someone expressed the perhaps noble wish that “It would be so nice if all those poor countries embrace democracy and live like us.”

Spreading democracy and changing regimes have been used as a pretext for wars in recent years, often resulting in millions of homeless refugees.

Have we really developed a social system that can be grafted onto another country and spawn prosperity and happiness?

We as scientists often do our research in different ways, although we all believe in the general principles of the scientific method. I would certainly pay attention to how successful scientists conduct their research and imitate what they do if it benefits my own research, but I would be appalled if someone walked into my laboratory and demanded that I do research his or her way.

Human history has witnessed many wars that erupted because some people thought that their religion was better than others'. There are still fundamentalists who believe that the world will become heaven if everyone embraces their extreme views.

During the Great Cultural Revolution in China in the late 1960s, young red guards heard, mistakenly, that serfdom was still practiced in Tibet and that the ruling monks maintained such serfdom by brainwashing the believers. Committing themselves to the noble cause of liberating the poor Tibetans, many red guards braved all odds to march thousands of miles across treacherous terrain to Tibet. Many young boys and girls died along the way, using their last breath to bid their comrades to continue their unfinished cause. Those who did reach Potala Palace immediately began to do cultural damage that is felt even today.

Plato believed that arrogance is the root cause of all misunderstanding and evil and illustrated his point brilliantly with his famous allegory of the cave. But we still live like the chained prisoners in the cave. We will not make progress unless we realize how ignorant and depraved we are.

Copyright information

© 2018 Springer Science+Business Media LLC

About this chapter

Cite this chapter

Xia, X. (2018). Self-Organizing Map and Other Clustering Methods in Transcriptomics. In: Bioinformatics and the Cell. Springer, Cham. https://doi.org/10.1007/978-3-319-90684-3_6
