Part of the book series: EXS (EXS, volume 93)

Abstract

The writer T.S. Eliot once mused, “Where is the knowledge we have lost in information?” [1]. From a biological perspective, the answer to this profound question is today having far-reaching consequences for the future of biomedical research and, in particular, the drug discovery process.

References

  1. Eliot TS. Choruses from The Rock

  2. Altschul SF, Madden TL, Schäffer AA et al (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389–3402

  3. Pearson WR, Lipman DJ (1988) Improved tools for biological sequence comparison. Proc Natl Acad Sci USA 85: 2444–2448

  4. Needleman SB, Wunsch CD (1970) A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol 48: 443–453

  5. Sellers PH (1974) On the theory and computation of evolutionary distances. SIAM J Appl Math 26: 787–793

  6. Smith TF, Waterman MS (1981) Comparison of bio-sequences. Adv Appl Math 2: 482–489

  7. Dayhoff MO, Schwartz RM, Orcutt BC (1978) Atlas of protein sequence and structure. Nat Biomed Res Foundation, Washington D.C., USA 5, Suppl 3: 345–352

  8. Jones DT, Taylor WR, Thornton JM (1992) A new approach to protein fold recognition. Nature 358: 86–89

  9. Gonnet GH, Cohen MA, Benner SA (1992) Exhaustive matching of the entire protein-sequence database. Science 256: 1443–1445

  10. Henikoff S, Henikoff JG (1993) Performance evaluation of amino acid substitution matrices. Proteins 17: 49–61

  11. Zhang Z, Schäffer AA, Miller W et al (1998) Protein sequence similarity searches using patterns as seeds. Nucleic Acids Res 26: 3986–3990

  12. Teichmann SA, Chothia C, Gerstein M (1999) Advances in structural genomics. Curr Opin Struct Biol 9: 390–399

  13. Bairoch A (1991) PROSITE: a dictionary of sites and patterns in proteins. Nucleic Acids Res 19 Suppl: 2241–2245

  14. Thompson JD, Higgins DG, Gibson TJ (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position specific gap penalties and weight matrix choice. Nucleic Acids Res 22: 4673–4680

  15. Barton GJ (1994) The AMPS package for multiple protein sequence alignment. Methods Mol Biol 25: 327–347

  16. Gribskov M, McLachlan AD, Eisenberg D (1987) Profile analysis: Detection of distantly related proteins. Proc Natl Acad Sci USA 84: 4355–4358

  17. Hughey R, Krogh A (1996) Hidden Markov models for sequence analysis: extension and analysis of the basic method. Comput Appl Biosci 12: 95–107

  18. Neuwald AF, Liu JS, Lipman DJ et al (1997) Extracting protein alignment models from the sequence database. Nucleic Acids Res 25: 1665–1677

  19. Grundy WN, Bailey TL, Elkan CP et al (1997) Meta-MEME: Motif-based hidden Markov models of protein families. Comput Appl Biosci 13: 397–406

  20. Henikoff JG, Henikoff S, Pietrokovski S (1999) New features of the Blocks Database servers. Nucleic Acids Res 27: 226–228

  21. Etzold T, Argos P (1993) SRS — an indexing and retrieval tool for flat file data libraries. Comput Appl Biosci 9: 49–57

  22. Online Mendelian Inheritance in Man, OMIM (TM). McKusick-Nathans Institute for Genetic Medicine, Johns Hopkins University (Baltimore, MD) and National Center for Biotechnology Information, National Library of Medicine (Bethesda, MD), 2000. World Wide Web URL: http://www.ncbi.nlm.nih.gov/omim/

  23. Discala C, Benigni X, Barillot E (2000) DBcat: a catalog of 500 biological databases. Nucleic Acids Res 28: 8–9

  24. Lawton JR, Martinez FA, Burks C (1989) Overview of the LiMB database. Nucleic Acids Res 17: 5885–5899

  25. Bairoch A, Apweiler R (1999) The SWISS-PROT protein sequence data bank and its supplement TrEMBL. Nucleic Acids Res 27: 49–54

  26. Barker WC, Garavelli JS, Huang H et al (2000) The protein information resource (PIR). Nucleic Acids Res 28: 41–44

  27. Walsh S, Anderson M, Cartinhour SW (1998) ACEDB: a database for genome information. Methods Biochem Anal 39: 299–318

  28. Hofmann K, Bucher P, Falquet L et al (1999) The PROSITE database, its status in 1999. Nucleic Acids Res 27: 215–219

  29. Attwood TK, Flower DR, Lewis AP et al (1999) PRINTS prepares for the new millennium. Nucleic Acids Res 27: 220–225

  30. Sonnhammer EL, Eddy SR, Birney E et al (1998) Pfam: multiple sequence alignments and HMM-profiles of protein domains. Nucleic Acids Res 26: 320–322

  31. Ponting CP, Schultz J, Milpetz F et al (1999) SMART: identification and annotation of domains from signalling and extracellular protein sequences. Nucleic Acids Res 27: 229–232

  32. Laskowski RA (2001) PDBsum: summaries and analyses of PDB structures. Nucleic Acids Res 29: 221–222

  33. Corpet F, Gouzy J, Kahn D (1998) The ProDom database of protein domain families. Nucleic Acids Res 26: 323–326

  34. Gracy J, Argos P (1998) DOMO: a new database of aligned protein domains. Trends Biochem Sci 23: 495–497

  35. Apweiler R, Attwood TK, Bairoch A et al (2000) InterPro-an integrated documentation resource for protein families, domains and functional sites. Bioinformatics 16: 1145–1150

  36. Hubbard TJ, Ailey B, Brenner SE et al (1999) SCOP: a Structural Classification of Proteins database. Nucleic Acids Res 27: 254–256

  37. Orengo CA, Pearl FM, Bray JE et al (1999) The CATH Database provides insights into protein structure/function relationships. Nucleic Acids Res 27: 275–279

  38. Tatusov RL, Koonin EV, Lipman DJ (1997) A genomic perspective on protein families. Science 278: 631–637

  39. Cooper DN, Ball EV, Krawczak M (1998) The human gene mutation database. Nucleic Acids Res 26: 285–287

  40. Brookes AJ, Lehvaslaiho H, Siegfried M et al (2000) HGBASE: a database of SNPs and other variations in and around human genes. Nucleic Acids Res 28: 356–360

  41. Sherry ST, Ward MH, Kholodov M et al (2001) dbSNP: the NCBI database of genetic variation. Nucleic Acids Res 29: 308–311

  42. Borodovsky M, McIninch J (1993) GeneMark: Parallel Gene Recognition for both DNA Strands. Computers & Chemistry 17: 123–133

  43. Xu Y, Einstein JR, Mural RJ et al (1994) An improved system for exon recognition and gene modeling in human DNA sequences. Proc Int Conf Intell Syst Mol Biol 2: 376–384

  44. Thomas A, Skolnick M (1994) A probabilistic model for detecting coding regions in DNA sequences. IMA J Math Appl Med Biol 11: 149–160

  45. Cole ST, Brosch R, Parkhill J et al (1998) Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence. Nature 393: 537–544

  46. Henderson J, Salzberg S, Fasman K (1997) Finding genes in DNA with a Hidden Markov Model. J Comput Biol 4: 127–141

  47. Lukashin AV, Borodovsky M (1998) GeneMark.hmm: new solutions for gene finding. Nucleic Acids Res 26: 1107–1115

  48. Gelfand MS, Mironov AA, Pevzner PA (1996) Gene recognition via spliced sequence alignment. Proc Natl Acad Sci USA 93: 9061–9066

  49. Quandt K, Frech K, Karas H et al (1995) MatInd and MatInspector: new fast and versatile tools for detection of consensus matches in nucleotide sequence data. Nucleic Acids Res 23: 4878–4884

  50. Heinemeyer T, Chen X, Karas H et al (1999) Expanding the TRANSFAC database towards an expert system of regulatory molecular mechanisms. Nucleic Acids Res 27: 318–322

  51. Parsons JD (1995) Improved tools for DNA comparison and clustering. Comput Appl Biosci 11: 603–613

  52. Pietu G, Eveno E, Soury-Segurens B (1999) The genexpress IMAGE knowledge base of the human muscle transcriptome: a resource of structural, functional, and positional candidate genes for muscle physiology and pathologies. Genome Res 9: 1313–1320

  53. Williamson AR (1999) The Merck Gene Index project. Drug Discov Today 4: 115–122

  54. Chou PY, Fasman GD (1974) Prediction of protein conformation. Biochemistry 13: 222–245

  55. Garnier J, Osguthorpe DJ, Robson BJ (1978) Analysis of the accuracy and implications of simple methods for predicting the secondary structure of globular proteins. J Mol Biol 120: 97–120

  56. Rost B (1996) PHD: predicting one-dimensional protein structure by profile-based neural networks. Methods Enzymol 266: 525–539

  57. Schneider R, Sander C (1996) The HSSP database of protein structure-sequence alignments. Nucleic Acids Res 24: 201–205

  58. Salamov AA, Solovyev VV (1995) Prediction of protein secondary structure by combining nearest-neighbor algorithms and multiple sequence alignments. J Mol Biol 247: 11–15

  59. King RD, Sternberg MJ (1996) Identification and application of the concepts important for accurate and reliable protein secondary structure prediction. Protein Sci 5: 2298–2310

  60. Frishman D, Argos P (1995) Knowledge-based secondary structure assignment. Proteins 23: 566–579

  61. Cuff JA, Clamp ME, Siddiqui AS et al (1998) JPred: a consensus secondary structure prediction server. Bioinformatics 14: 892–893

  62. Sutcliffe MJ, Hayes FR, Blundell TL (1987) Knowledge based modelling of homologous proteins, Part II: Rules for the conformations of substituted sidechains. Protein Eng 1: 385–392

  63. Sali A, Overington JP (1994) Derivation of rules for comparative protein modeling from a database of protein structure alignments. Protein Sci 3: 1582–1596

  64. Jones DT, Tress M, Bryson K et al (1999) Successful recognition of protein folds using threading methods biased by sequence similarity and predicted secondary structure. Proteins 3: 104–111

  65. Taylor WR (1997) Multiple sequence threading: an analysis of alignment quality and stability. J Mol Biol 269: 902–943

  66. Rost B (1995) TOPITS: threading one-dimensional predictions into three-dimensional structures. Proc Int Conf Intell Syst Mol Biol 3: 314–321

  67. Russell RB, Copley RR, Barton GJ (1996) Protein fold recognition by mapping predicted secondary structures. J Mol Biol 259: 349–365

  68. Rice DW, Eisenberg D (1997) A 3D–1D substitution matrix for protein fold recognition that includes predicted secondary structure of the sequence. J Mol Biol 267: 1026–1038

  69. Aszodi A, Munro RE, Taylor WR (1997) Protein modeling by multiple sequence threading and distance geometry. Proteins Suppl 1: 38–42

  70. Laskowski RA, MacArthur MW, Moss DS et al (1993) PROCHECK: a program to check the stereochemical quality of protein structures. J Appl Cryst 26: 283–291

  71. Lupas A (1996) Prediction and Analysis of Coiled-Coil Structures. Methods Enzymol 266: 513–525

  72. Berger B, Wilson DB, Wolf E et al (1995) Predicting coiled coils by use of pairwise residue correlations. Proc Natl Acad Sci USA 92: 8259–8263

  73. Lupas A (1997) Predicting coiled-coil regions in proteins. Curr Opin Struct Biol 7: 388–393

  74. Hirst J, Vieth M, Skolnick J et al (1996) Predicting leucine zipper structures from sequence. Protein Eng 9: 657–662

  75. Bornberg-Bauer E, Rivals E, Vingron M (1998) Computational approaches to identify leucine zippers. Nucleic Acids Res 26: 2740–2746

  76. Claros MG, von Heijne G (1994) TopPred II: an improved software for membrane protein structure predictions. Comput Appl Biosci 10: 685–686

  77. Rost B, Fariselli P, Casadio R (1994) Refining neural network predictions for helical transmembrane proteins by dynamic programming. Comput Appl Biosci 10: 685–686

  78. Persson B, Argos P (1997) Prediction of membrane protein topology utilizing multiple sequence alignments. J Protein Chem 16: 453–457

  79. Jones DT, Taylor WR, Thornton JM (1994) A model recognition approach to the prediction of all-helical membrane protein structure and topology. Biochemistry 33: 3038–3049

  80. Sonnhammer EL, von Heijne G, Krogh A (1998) A hidden Markov model for predicting transmembrane helices in protein sequences. Proc Int Conf Intell Syst Mol Biol 6: 175–182

  81. Cedano J, Aloy P, Perez-Pons JA et al (1997) Relation between amino acid composition and cellular location of proteins. J Mol Biol 266: 594–600

  82. Nakai K, Horton P (1999) PSORT: a program for detecting sorting signals in proteins and predicting their subcellular localization. Trends Biochem Sci 24: 34–36

  83. Nielsen H, Brunak S, von Heijne G (1999) Machine learning approaches for the prediction of signal peptides and other protein sorting signals. Protein Eng 12: 3–9

  84. Felsenstein J (1989) PHYLIP - Phylogeny Inference Package (Version 3.2). Cladistics 5: 164–166; also see http://evolution.genetics.washington.edu/phylip.html

  85. Wills C (1994) Phylogenetic analysis and molecular evolution. In: DW Smith (ed): Biocomputing: Informatics and Genome Projects. Academic Press, San Diego, 175–201

  86. Setubal J, Meidanis J (eds) (1996) Introduction to Computational Molecular Biology. PWS Publishing Co., Boston

  87. Huson DH, Vawter L, Warnow TJ (1999) Solving large scale phylogenetic problems using DCM2. In: Lengauer T, Schneider R (eds): Proceedings of the Seventh International Conference on Intelligent Systems for Molecular Biology. AAAI Press, Menlo Park (CA), 118–129

  88. Strimmer K, von Haeseler A (1997) Likelihood-mapping: a simple method to visualize phylogenetic content of a sequence alignment. Proc Natl Acad Sci USA 94: 6815–6819

  89. Diaconis PW, Holmes SP (1998) Matchings and phylogenetic trees. Proc Natl Acad Sci USA 95: 14600–14602

  90. Karp PD, Riley M, Paley SM et al (1996) EcoCyc: an encyclopedia of Escherichia coli genes and metabolism. Nucleic Acids Res 24: 32–39; see also http://ecocyc.pangeasystems.com/ecocyc/

  91. Bork P, Dandekar T, Diaz-Lazcoz Y et al (1998) Predicting function: from genes to genomes and back. J Mol Biol 283: 707–725

  92. Ehlde M, Zacchi G (1995) MIST: a user-friendly metabolic simulator. Comput Appl Biosci 11: 201–207

  93. Mendes P (1993) GEPASI: a software package for modelling the dynamics, steady states and control of biochemical and other systems. Comput Appl Biosci 9: 563–571

  94. Tomita M, Hashimoto K, Takahashi K et al (1999) E-CELL: Software environment for whole cell simulation. Bioinformatics 15: 72–84; also see E-Cell Project http://www.e-cell.org/

  95. Heidtke KR, Schulze-Kremer S (1998) BioSim – a new qualitative simulation environment for molecular biology. In: J Glasgow, T Littlejohn, F Major, R Lathrop, D Sankoff, C Sensen (eds): Proceedings of Sixth International Conference on Intelligent Systems for Molecular Biology. AAAI Press, Menlo Park (CA), 85–94

  96. D’haeseleer P, Liang S, Somogyi R (1999) Gene expression data analysis and modeling. Tutorial session at Pacific Symposium on Biocomputing, Hawaii, January: 4–9; also see http://www.cgl.ucsf.edu/psb/psb99/genetutorial.pdf

  97. McAdams HH, Shapiro L (1995) Circuit simulation of genetic networks. Science 269: 650–656

  98. Kanehisa M, Goto S (2000) KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res 28: 27–30

  99. Scharf M, Schneider R, Casari G et al (1994) GeneQuiz: a workbench for sequence analysis. Proc Int Conf Intell Syst Mol Biol 2: 348–353

  100. Frishman D, Mewes H-W (1997) PEDANTic genome analysis. Trends Genet 13: 415–416

  101. Gaasterland T, Sensen CW (1996) MAGPIE: automated genome interpretation. Trends Genet 12: 76–78

  102. Kabsch W, Sander C (1983) How good are predictions of protein secondary structure? FEBS Lett 155: 179–182

  103. Levin JM, Pascarella S, Argos P et al (1993) Quantification of secondary structure prediction improvement using multiple alignment. Protein Eng 6: 849–854

  104. Rost B, Sander C (1994) Combining evolutionary information and neural networks to predict protein secondary structure. Proteins 19: 55–72

  105. Lim VI (1974) Structural Principles of the Globular Organization of Protein Chains. A Stereochemical Theory of Globular Protein Secondary Structure. J Mol Biol 88: 857–872

  106. Schneider R (1989) Sekundärstrukturvorhersage von Proteinen unter Berücksichtigung von Tertiärstrukturaspekten. Diploma thesis, Universität Heidelberg, Germany

  107. Ptitsyn OB, Finkelstein AV (1983) Theory of protein secondary structure and algorithm of its prediction. Biopolymers 22: 15–25

  108. Gibrat J-F, Garnier J, Robson B (1987) Further developments of protein secondary structure prediction using information theory. New parameters and consideration of residue pairs. J Mol Biol 198: 425–443

  109. Garnier J, Gibrat J-F, Robson B (1996) GOR method for predicting protein secondary structure from amino acid sequence. Methods Enzymol 266: 540–553

  110. Kabsch W, Sander C (1983) Segment83. Unpublished

  111. Brenner SE, Barken D, Levitt M (1999) The PRESAGE database for structural genomics. Nucleic Acids Res 27: 251–253

Copyright information

© 2003 Springer Basel AG

About this chapter

Cite this chapter

Jackson, D.B., Minch, E., Munro, R.E. (2003). Bioinformatics. In: Hillisch, A., Hilgenfeld, R. (eds) Modern Methods of Drug Discovery. EXS, vol 93. Birkhäuser, Basel. https://doi.org/10.1007/978-3-0348-7997-2_3

  • DOI: https://doi.org/10.1007/978-3-0348-7997-2_3

  • Publisher Name: Birkhäuser, Basel

  • Print ISBN: 978-3-0348-9397-8

  • Online ISBN: 978-3-0348-7997-2

  • eBook Packages: Springer Book Archive