Genomics and Proteomics Using Computational Biology

Chapter in Bioinformatics Techniques for Drug Discovery

Part of the book series: SpringerBriefs in Computer Science (BRIEFSCOMPUTER)

Abstract

Current functional genomics relies on known and characterised genes. Despite significant efforts in genome annotation, accurately identifying and elucidating protein-coding gene structures remains challenging. Existing methods are limited to computational predictions and transcript-level experimental evidence, so translation itself cannot be verified. Proteomic mass spectrometry sequences fragments of gene products, which allows existing gene annotation to be validated and refined and novel protein-coding regions to be elucidated. However, applying proteomics data to genome annotation is hindered by the lack of suitable tools and methods for automatic data processing and genome mapping at high accuracy and throughput.
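To make the idea of mapping peptide evidence onto a genome concrete, here is a minimal Python sketch. It is not the chapter's method: it assumes the standard genetic code, exact string matching, and no splicing, and the function names (`reverse_complement`, `translate`, `map_peptide`) are hypothetical names chosen for illustration. It translates a DNA sequence in all six reading frames and reports where a peptide, such as one identified by tandem mass spectrometry, could be encoded.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate DNA codon by codon; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def map_peptide(genome, peptide):
    """Find exact peptide matches in all six reading frames.

    Returns (strand, frame, start, end) tuples. Coordinates are 0-based,
    half-open, and always expressed on the forward strand.
    """
    hits = []
    n = len(genome)
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            protein = translate(seq[frame:])
            pos = protein.find(peptide)
            while pos != -1:
                start = frame + 3 * pos
                end = start + 3 * len(peptide)
                if strand == "-":
                    # Convert reverse-strand offsets to forward coordinates.
                    start, end = n - end, n - start
                hits.append((strand, frame, start, end))
                pos = protein.find(peptide, pos + 1)
    return hits


if __name__ == "__main__":
    genome = "CCATGAAACGTGG"  # toy sequence; ATG AAA CGT encodes M K R
    print(map_peptide(genome, "MKR"))
```

A real pipeline would also need to handle peptides that span exon junctions, mass-equivalent residues such as isoleucine and leucine, and statistical control of false matches, none of which this sketch attempts.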

Author information

Correspondence to Aman Chandra Kaushik.

Copyright information

© 2018 The Author(s)

About this chapter

Cite this chapter

Kaushik, A.C., Kumar, A., Bharadwaj, S., Chaudhary, R., Sahi, S. (2018). Genomics and Proteomics Using Computational Biology. In: Bioinformatics Techniques for Drug Discovery. SpringerBriefs in Computer Science. Springer, Cham. https://doi.org/10.1007/978-3-319-75732-2_8

  • DOI: https://doi.org/10.1007/978-3-319-75732-2_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-75731-5

  • Online ISBN: 978-3-319-75732-2

  • eBook Packages: Computer Science (R0)
