Orphan Protein Function and Its Relation to Glycosylation

  • Conference paper
Bioinformatics and Genome Analysis

Part of the book series: Ernst Schering Research Foundation Workshop (SCHERING FOUND, volume 38)

Abstract

Since the first bacterial genomes were completely sequenced, the surge in genome sequence data has overwhelmed the scientific community’s efforts towards elucidating protein function. Computational methods have made it possible to work with sequences from complete genomes and proteomes, and inference of protein function by exploiting direct sequence similarity indeed goes a long way in describing a proteome’s functional capacity. However, at least 40% of the gene products in newly sequenced genomes typically remain uncharacterised. Proteins without an annotated function are also known as orphan proteins since they do not belong to a functionally characterised protein family. Many sequences must, therefore, be compared using their features rather than by direct comparison in the conventional sequence space. Here we focus on one such feature — glycosylation — that is common in eukaryotic proteomes.

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gupta, R., Jensen, L.J., Brunak, S. (2002). Orphan Protein Function and Its Relation to Glycosylation. In: Mewes, HW., Seidel, H., Weiss, B. (eds) Bioinformatics and Genome Analysis. Ernst Schering Research Foundation Workshop, vol 38. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-04747-7_13

  • DOI: https://doi.org/10.1007/978-3-662-04747-7_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-04749-1

  • Online ISBN: 978-3-662-04747-7

  • eBook Packages: Springer Book Archive
