Improving Pairwise Sequence Alignment between Distantly Related Proteins

Feng, Jin-an

doi:10.1007/978-1-59745-514-5_16

Part of the book series: Methods in Molecular Biology™ ((MIMB,volume 395))

Summary

Sequence alignment between remotely related proteins has been one of the more difficult problems in structural biology. Improvements have been achieved by incorporating information that enhances the diversity of the substitution matrices. NdPASA is a web-based server that optimizes sequence alignments between proteins sharing low percentages of sequence identity. The program integrates structure information of the template sequence into a global alignment algorithm by employing amino acids’ neighbor-dependent propensities for secondary structure as unique parameters for alignment. NdPASA optimizes alignment by evaluating the likelihood of a residue pair in the query sequence matching against a corresponding residue pair adopting a particular secondary structure in the template sequence. The server is designed to aid homologous protein structure modeling. It is most effective when the structure of the template sequence is known. NdPASA can be accessed online at http://www.fenglab.org/bioserver.html.
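The idea of adding a neighbor-dependent secondary structure term to a global alignment can be illustrated with a small sketch. The code below is not NdPASA's implementation: it is a plain Needleman-Wunsch alignment in which the match score gains a bonus when the query residue pair (i, i+1) has a propensity for the secondary structure state of the template residue it is aligned to. The substitution scores, gap penalty, weight, propensity table, and all function names are hypothetical toy values chosen for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical toy parameters, NOT NdPASA's published values.
GAP = -2.0      # linear gap penalty
WEIGHT = 1.0    # weight of the secondary structure term

# Hypothetical propensities: (residue, next residue, template SS state) -> score.
# H = helix, E = strand, C = coil.
PAIR_PROPENSITY: Dict[Tuple[str, str, str], float] = {
    ("A", "L", "H"): 1.0,
    ("V", "I", "E"): 1.0,
    ("G", "P", "C"): 1.0,
}


def substitution(a: str, b: str) -> float:
    """Toy identity matrix standing in for a real substitution matrix."""
    return 2.0 if a == b else -1.0


def pair_score(query: str, i: int, template_ss: str, j: int) -> float:
    """Propensity of query pair (i, i+1) for the template's SS state at j."""
    if i + 1 >= len(query):
        return 0.0  # the last residue has no following neighbor
    return PAIR_PROPENSITY.get((query[i], query[i + 1], template_ss[j]), 0.0)


def align(query: str, template: str, template_ss: str) -> Tuple[float, str, str]:
    """Global alignment with an added neighbor-dependent SS bonus."""
    if len(template) != len(template_ss):
        raise ValueError("template and template_ss must have equal length")
    n, m = len(query), len(template)
    score: List[List[float]] = [[0.0] * (m + 1) for _ in range(n + 1)]
    move: List[List[str]] = [[""] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0], move[i][0] = i * GAP, "U"
    for j in range(1, m + 1):
        score[0][j], move[0][j] = j * GAP, "L"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = (score[i - 1][j - 1]
                    + substitution(query[i - 1], template[j - 1])
                    + WEIGHT * pair_score(query, i - 1, template_ss, j - 1))
            up = score[i - 1][j] + GAP      # gap in the template
            left = score[i][j - 1] + GAP    # gap in the query
            best = max(diag, up, left)
            score[i][j] = best
            move[i][j] = "D" if best == diag else ("U" if best == up else "L")
    # Trace back from the bottom-right corner to recover the alignment.
    q_aln: List[str] = []
    t_aln: List[str] = []
    i, j = n, m
    while i > 0 or j > 0:
        if move[i][j] == "D":
            q_aln.append(query[i - 1]); t_aln.append(template[j - 1])
            i, j = i - 1, j - 1
        elif move[i][j] == "U":
            q_aln.append(query[i - 1]); t_aln.append("-")
            i -= 1
        else:
            q_aln.append("-"); t_aln.append(template[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(q_aln)), "".join(reversed(t_aln))
```

Calling `align("ALVI", "ALVI", "HHEE")` returns the ungapped alignment together with a score that includes the two propensity bonuses in the toy table. In a realistic setting the secondary structure string would come from the known template structure, which is consistent with the summary's remark that the method is most effective when the template structure is known.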

Acknowledgments

The author would like to thank Wei Li and Junwen Wang for their contributions to the development of NdPASA. The author also gratefully acknowledges financial support from the National Institutes of Health (GM54630), the American Cancer Society (PRG9926301GMC), and an appropriation from the Commonwealth of Pennsylvania.

Author information

Authors and Affiliations

Department of Chemistry, Center for Biotechnology, Temple University, USA
Jin-an Feng

Editor information

Editors and Affiliations

Bioinformatics Program and Department of Microbiology and Immunology, University of Michigan Medical School, Ann Arbor, MI
Nicholas H. Bergman

About this protocol

Cite this protocol

Feng, J.-a. (2007). Improving Pairwise Sequence Alignment between Distantly Related Proteins. In: Bergman, N.H. (ed.) Comparative Genomics. Methods in Molecular Biology™, vol 395. Humana Press. https://doi.org/10.1007/978-1-59745-514-5_16

DOI: https://doi.org/10.1007/978-1-59745-514-5_16
Publisher Name: Humana Press
Print ISBN: 978-1-58829-693-1
Online ISBN: 978-1-59745-514-5
eBook Packages: Springer Protocols
