FASTA Servers for Sequence Similarity Search

Issac, Biju; Raghava, Gajendra P. S.

doi:10.1385/1-59259-890-0:503

Protocol

Part of the book series: Springer Protocols Handbooks (SPH)

Abstract

In the last few years, many eukaryotic (including human and mouse) and prokaryotic genomes have either been completely sequenced or are being sequenced (1–3). In the coming 5–10 yr, most of the known organisms will have been sequenced. This has led, and will continue to lead, to exponential growth in nucleotide and protein databases; for example, the International Nucleotide Sequence Databases (INSD), composed of DDBJ (http://www.ddbj.nig.ac.jp/), EMBL Bank (http://www.ebi.ac.uk/embl/), and GenBank (http://www.ncbi.nlm.nih.gov/), had released more than 30 million entries by the end of 2003 (4). The availability of these ever-expanding databases poses a major challenge to bioinformatics experts: developing effective programs or Web servers that extract maximum information from them. A database similarity search is perhaps the fastest, cheapest, and most powerful experiment a biologist can conduct to this end. As the databases become more complete, a sequence similarity search is more likely to reveal database sequences with statistically significant similarity, and thus inferred homology, to a query sequence. Although significant sequence similarity is no guarantee of shared function, the availability of similar sequences is proving useful in discovering relationships between newly sequenced proteins or genes and various classes in the databases (5–7).

References

1. Venter, J. C., Adams, M. D., Myers, E. W., et al. (2001) The sequence of the human genome. Science 291, 1304–1351.
2. Lander, E. S., Linton, L. M., Birren, B., et al. (2001) Initial sequencing and analysis of the human genome. Nature 409, 860–921.
3. Waterston, R. H., Lindblad-Toh, K., Birney, E., et al. (2002) Initial sequencing and comparative analysis of the mouse genome. Nature 420, 520–562.
4. Miyazaki, A., Sugawara, H., Gojobori, T., and Tateno, Y. (2003) DNA DataBank of Japan (DDBJ) in XML. Nucleic Acids Res. 31, 13–16.
5. Manuel, A., Beaupain, D., Romeo, P. H., and Raich, N. (2000) Molecular characterization of a novel gene family (PHTF) conserved from Drosophila to mammals. Genomics 64, 216–220.
6. Soliveri, J. A., Gomez, J., Bishai, W. R., and Chater, K. F. (2000) Multiple paralogous genes related to the Streptomyces coelicolor developmental regulatory gene whiB are present in Streptomyces and other actinomycetes. Microbiology 146, 333–343.
7. Komeda, H. and Asano, Y. (2003) Genes for an alkaline D-stereospecific endopeptidase and its homolog are located in tandem on Bacillus cereus genome. FEMS Microbiol. Lett. 228, 1–9.
8. Gibbs, A. J. and McIntyre, G. A. (1970) The diagram, a method for comparing sequences. Its use with amino acid and nucleotide sequences. Eur. J. Biochem. 16, 1–11.
9. Needleman, S. and Wunsch, C. (1970) A general method applicable to search for similarities in the amino acid sequences of two proteins. J. Mol. Biol. 48, 443–453.
10. Smith, T. and Waterman, M. (1981) Identification of common molecular subsequences. J. Mol. Biol. 147, 195–197.
11. Pearson, W. R. and Lipman, D. J. (1988) Improved tools for biological sequence comparison. Proc. Natl. Acad. Sci. USA 85, 2444–2448.
12. Lipman, D. J. and Pearson, W. R. (1985) Rapid and sensitive protein similarity searches. Science 227, 1435–1441.
13. Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990) Basic local alignment search tool. J. Mol. Biol. 215, 403–410.
14. Altschul, S. F., Madden, T. L., Schaffer, A. A., et al. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402.
15. Pearson, W. (2000) Flexible sequence similarity searching with FASTA3 program package, in Bioinformatics Methods and Protocols (Misener, S. and Krawetz, S. A., eds.), Humana Press, Inc., Totowa, NJ, pp. 185–219.
16. Wilbur, W. J. and Lipman, D. J. (1983) Rapid similarity searches of nucleic acid and protein data banks. Proc. Natl. Acad. Sci. USA 80, 726–730.
17. Pearson, W. R. (1990) Rapid and sensitive sequence comparison with FASTP and FASTA. Methods Enzymol. 183, 63–98.
18. Anderson, I. and Brass, A. (1998) Searching DNA databases for similarities to DNA sequences: when is a match significant? Bioinformatics 14, 349–356.
19. Pearson, W. R. (1995) Comparison of methods for searching protein sequence databases. Protein Sci. 4, 1150–1160.
20. Pearson, W. R. (1996) Effective protein sequence comparison. Methods Enzymol. 266, 227–258.
21. Pearson, W. R. (1998) Empirical statistical estimates for sequence similarity searches. J. Mol. Biol. 276, 71–84.
22. Miller, W. (2001) Comparison of genomic DNA sequences: solved and unsolved problems. Bioinformatics 17, 391–397.
23. Issac, B. and Raghava, G. P. S. (2002) GWFASTA: a server for FASTA search in eukaryotic and microbial genomes. BioTechniques 33, 548–556.
24. Thompson, J. D., Higgins, D. G., and Gibson, T. J. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673–4680.
25. Brown, N. P., Leroy, C., and Sander, C. (1998) MView: a Web-compatible database search or multiple alignment viewer. Bioinformatics 14, 380–381.
26. Gogarten, J. P. and Olendzenski, L. (1999) Orthologs, paralogs and genome composition. Curr. Opin. Genet. Dev. 9, 630–636.
27. Raghava, G. P. S. (2001) A graphical Web server for the analysis of protein sequences and alignment. Biotech. Software and Internet Report 2, 255–258.
28. Livingstone, C. D. and Barton, G. J. (1993) Protein sequence alignments: a strategy for the hierarchical analysis of residue conservation. Comput. Appl. Biosci. 9, 745–756.
29. Barton, G. J. (1993) ALSCRIPT: a tool to format multiple sequence alignments. Prot. Eng. 6, 37–40.

Author information

Authors and Affiliations

Institute of Microbial Technology, Chandigarh, India
Biju Issac & Gajendra P. S. Raghava

Editor information

Editors and Affiliations

University of Hertfordshire, Hatfield, UK
John M. Walker

About this protocol

Cite this protocol

Issac, B., Raghava, G.P.S. (2005). FASTA Servers for Sequence Similarity Search. In: Walker, J.M. (eds) The Proteomics Protocols Handbook. Springer Protocols Handbooks. Humana Press. https://doi.org/10.1385/1-59259-890-0:503

DOI: https://doi.org/10.1385/1-59259-890-0:503
Publisher Name: Humana Press
Print ISBN: 978-1-58829-343-5
Online ISBN: 978-1-59259-890-8
eBook Packages: Springer Protocols

Publish with us

Policies and ethics