FASTA Servers for Sequence Similarity Search

  • Protocol

Part of the book series: Springer Protocols Handbooks ((SPH))

Abstract

In the last few years, many eukaryotic (including human and mouse) and prokaryotic genomes have been either completely sequenced or are being sequenced (1–3). Over the coming 5–10 yr, most known organisms are expected to be sequenced. This has led, and will continue to lead, to exponential growth in nucleotide and protein databases; for example, the International Nucleotide Sequence Databases (INSD), composed of DDBJ (http://www.ddbj.nig.ac.jp/), EMBL Bank (http://www.ebi.ac.uk/embl/), and GenBank (http://www.ncbi.nlm.nih.gov/), had released more than 30 million entries by the end of 2003 (4). The availability of these ever-expanding databases challenges bioinformatics experts to develop effective programs or Web servers that extract maximum information from them. A database similarity search is perhaps the fastest, cheapest, and most powerful experiment of this kind that a biologist can conduct. As the databases become more complete, a sequence similarity search is more likely to reveal database sequences with statistically significant similarity, and thus inferred homology, to a query sequence. Although significant sequence similarity is no guarantee of shared function, the availability of similar sequences is proving useful in discovering relationships between newly sequenced proteins or genes and various classes in the databases (5–7).

References

  1. Venter, J. C., Adams, M. D., Myers, E. W., et al. (2001) The sequence of the human genome. Science 291, 1304–1351.

  2. Lander, E. S., Linton, L. M., Birren, B., et al. (2001) Initial sequencing and analysis of the human genome. Nature 409, 860–921.

  3. Waterston, R. H., Lindblad-Toh, K., Birney, E., et al. (2002) Initial sequencing and comparative analysis of the mouse genome. Nature 420, 520–562.

  4. Miyazaki, A., Sugawara, H., Gojobori, T., and Tateno, Y. (2003) DNA DataBank of Japan (DDBJ) in XML. Nucleic Acids Res. 31, 13–16.

  5. Manuel, A., Beaupain, D., Romeo, P. H., and Raich, N. (2000) Molecular characterization of a novel gene family (PHTF) conserved from Drosophila to mammals. Genomics 64, 216–220.

  6. Soliveri, J. A., Gomez, J., Bishai, W. R., and Chater, K. F. (2000) Multiple paralogous genes related to the Streptomyces coelicolor developmental regulatory gene whiB are present in Streptomyces and other actinomycetes. Microbiology 146, 333–343.

  7. Komeda, H. and Asano, Y. (2003) Genes for an alkaline D-stereospecific endopeptidase and its homolog are located in tandem on Bacillus cereus genome. FEMS Microbiol. Lett. 228, 1–9.

  8. Gibbs, A. J. and McIntyre, G. A. (1970) The diagram, a method for comparing sequences. Its use with amino acid and nucleotide sequences. Eur. J. Biochem. 16, 1–11.

  9. Needleman, S. and Wunsch, C. (1970) A general method applicable to search for similarities in the amino acid sequences of two proteins. J. Mol. Biol. 48, 444–453.

  10. Smith, T. and Waterman, M. (1981) Identification of common molecular subsequences. J. Mol. Biol. 147, 195–197.

  11. Pearson, W. R. and Lipman, D. J. (1988) Improved tools for biological sequence comparison. Proc. Natl. Acad. Sci. USA 85, 2444–2448.

  12. Lipman, D. J. and Pearson, W. R. (1985) Rapid and sensitive protein similarity searches. Science 227, 1435–1441.

  13. Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990) Basic local alignment search tool. J. Mol. Biol. 215, 403–410.

  14. Altschul, S. F., Madden, T. L., Schaffer, A. A., et al. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402.

  15. Pearson, W. R. (2000) Flexible sequence similarity searching with the FASTA3 program package, in Bioinformatics Methods and Protocols (Misener, S. and Krawetz, S. A., eds.), Humana Press, Inc., Totowa, NJ, pp. 185–219.

  16. Wilbur, W. J. and Lipman, D. J. (1983) Rapid similarity searches of nucleic acid and protein data banks. Proc. Natl. Acad. Sci. USA 80, 726–730.

  17. Pearson, W. R. (1990) Rapid and sensitive sequence comparison with FASTP and FASTA. Methods Enzymol. 183, 63–98.

  18. Anderson, I. and Brass, A. (1998) Searching DNA databases for similarities to DNA sequences: when is a match significant? Bioinformatics 14, 349–356.

  19. Pearson, W. R. (1995) Comparison of methods for searching protein sequence databases. Protein Sci. 4, 1150–1160.

  20. Pearson, W. R. (1996) Effective protein sequence comparison. Methods Enzymol. 266, 227–258.

  21. Pearson, W. R. (1998) Empirical statistical estimates for sequence similarity searches. J. Mol. Biol. 276, 71–84.

  22. Miller, W. (2000) Comparison of genomic DNA sequences: solved and unsolved problems. Bioinformatics 17, 391–397.

  23. Issac, B. and Raghava, G. P. S. (2002) GWFASTA: a server for FASTA search in eukaryotic and microbial genomes. BioTechniques 33, 548–556.

  24. Thompson, J. D., Higgins, D. G., and Gibson, T. J. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673–4680.

  25. Brown, N. P., Leroy, C., and Sander, C. (1998) MView: a Web-compatible database search or multiple alignment viewer. Bioinformatics 14, 380–381.

  26. Gogarten, J. P. and Olendzenski, L. (1999) Orthologs, paralogs and genome composition. Curr. Opin. Genet. Dev. 9, 630–636.

  27. Raghava, G. P. S. (2001) A graphical Web server for the analysis of protein sequences and alignment. Biotech. Software and Internet Report 2, 255–258.

  28. Livingstone, C. D. and Barton, G. J. (1993) Protein sequence alignments: a strategy for the hierarchical analysis of residue conservation. Comput. Appl. Biosci. 9, 745–756.

  29. Barton, G. J. (1993) ALSCRIPT: a tool to format multiple sequence alignments. Prot. Eng. 6, 37–40.

Copyright information

© 2005 Humana Press Inc., Totowa, NJ

About this protocol

Cite this protocol

Issac, B., Raghava, G.P.S. (2005). FASTA Servers for Sequence Similarity Search. In: Walker, J.M. (ed.) The Proteomics Protocols Handbook. Springer Protocols Handbooks. Humana Press. https://doi.org/10.1385/1-59259-890-0:503

  • DOI: https://doi.org/10.1385/1-59259-890-0:503

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-58829-343-5

  • Online ISBN: 978-1-59259-890-8

  • eBook Packages: Springer Protocols
