New Challenges and Strategies for Multiple Sequence Alignment in the Proteomics Era

  • Julie D. Thompson
  • Olivier Poch
Part of the Springer Protocols Handbooks book series (SPH)


The postgenomic era is presenting new challenges for bioinformatics. High-throughput genome sequencing and assembly techniques, together with new information resources, such as structural proteomics, transcriptome data from microarray analyses, or light microscopy images of living cells, have led to a rapid increase in the amount of data available, ranging from complete genome sequences to cellular, structure, phenotype, and other types of biologically relevant information. Thus, genomic and proteomic research has transformed molecular biology from a “data poor” to a “data rich” science. The question now is whether or not we can make sense of all these data using bioinformatics approaches in such areas as genome annotation, comparative genomics, and gene expression analysis. In the face of this ever-increasing volume of complex and constantly evolving data, the integration of experimental data with bioinformatic comparative and predictive analyses will be crucial to the complete description of protein function, not only at the molecular level but also at the higher levels of the pathways, macromolecular complexes, cells, or organs a protein belongs to.


Keywords: Multiple Sequence Alignment, Multiple Alignment, Dynamic Programming Algorithm, Pairwise Alignment, Global Alignment



Copyright information

© Humana Press Inc., Totowa, NJ 2005

Authors and Affiliations

  • Julie D. Thompson (1)
  • Olivier Poch (1)
  1. Laboratoire de Biologie et Genomique Structurales, Institut de Genetique et de Biologie Moleculaire et Cellulaire, France
