Batch-Learning Self-Organizing Map for Predicting Functions of Poorly-Characterized Proteins Massively Accumulated

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5629)

Abstract

As a result of the sequencing of large numbers of genomes, numerous proteins whose functions cannot be identified by homology searches of their amino acid sequences have accumulated and remain of no use to science or industry. Novel methods for predicting protein function are therefore urgently needed. We previously developed the Batch-Learning SOM (BL-SOM) for genome informatics; here, we adapted BL-SOM to predict protein functions on the basis of similarity in oligopeptide composition. Oligopeptides are the component parts of a protein and are involved in forming its functional motifs and structural parts. When applied to oligopeptide frequencies in 110,000 proteins classified into 2,853 function-known COGs (clusters of orthologous groups), BL-SOM faithfully reproduced the COG classifications; proteins whose functions could not be identified by homology searches could therefore be related to function-known proteins. BL-SOM was then applied to predict the functions of large numbers of proteins obtained from metagenome analyses.
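The idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes dipeptides (k = 2) as the oligopeptide length, random initialization from the data, and a Gaussian neighborhood that shrinks linearly; none of these choices is stated in the abstract, and all function names (`oligopeptide_frequencies`, `batch_som`, `best_matching_unit`) are hypothetical. Each protein is turned into a vector of oligopeptide frequencies, and a generic batch SOM then updates every map unit once per epoch from the whole data set.

```python
import itertools
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def oligopeptide_frequencies(seq, k=2):
    """Normalized k-mer frequency vector of a protein (k=2 is an assumption)."""
    kmers = ["".join(p) for p in itertools.product(AMINO_ACIDS, repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = [0.0] * len(kmers)
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:  # skip k-mers containing non-standard residues
            counts[j] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total > 0 else counts

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_matching_unit(weights, x):
    """Index of the map unit whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda u: sq_dist(weights[u], x))

def batch_som(data, rows=2, cols=2, epochs=10, sigma0=1.5, seed=0):
    """Generic batch SOM: all units are recomputed once per epoch."""
    rng = random.Random(seed)
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    # Random initialization from data samples (an assumption of this sketch).
    weights = [list(rng.choice(data)) for _ in grid]
    dim = len(data[0])
    for epoch in range(epochs):
        sigma = 0.5 + sigma0 * (1.0 - epoch / epochs)  # shrinking neighborhood
        bmus = [best_matching_unit(weights, x) for x in data]
        new_weights = []
        for u in grid:
            num, den = [0.0] * dim, 0.0
            for x, b in zip(data, bmus):
                # Gaussian neighborhood weight between unit u and the BMU of x
                h = math.exp(-sq_dist(u, grid[b]) / (2.0 * sigma ** 2))
                den += h
                for d in range(dim):
                    num[d] += h * x[d]
            new_weights.append([v / den for v in num])
        weights = new_weights
    return weights

# Toy usage with made-up sequences (not data from the paper).
proteins = ["MKVLAAGIVK", "MKVLAAGIVR", "GGSGGSGGSG", "GGSGGSGGSA"]
vectors = [oligopeptide_frequencies(p) for p in proteins]
som = batch_som(vectors)
units = [best_matching_unit(som, v) for v in vectors]
```

In a function-prediction setting, a protein of unknown function would be mapped to its best-matching unit and related to the function-known proteins that fall on the same or neighboring units.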

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Abe, T., Kanaya, S., Ikemura, T. (2009). Batch-Learning Self-Organizing Map for Predicting Functions of Poorly-Characterized Proteins Massively Accumulated. In: Príncipe, J.C., Miikkulainen, R. (eds) Advances in Self-Organizing Maps. WSOM 2009. Lecture Notes in Computer Science, vol 5629. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02397-2_1

  • DOI: https://doi.org/10.1007/978-3-642-02397-2_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-02396-5

  • Online ISBN: 978-3-642-02397-2

  • eBook Packages: Computer Science, Computer Science (R0)