Converting Between Sequence Formats

  • Cary O’Donnell
Part of the Methods in Molecular Biology book series (MIMB, volume 24)


A “sequence format” is a punctuation style, or defined layout of text, within a computer file that separates a sequence from everything else in the file. It allows computer programs that “understand” the format to distinguish the sequence from any reference documentation stored alongside it. Some format definitions, including most database formats, extend to the documentation itself, so that software can locate specific reference information (e.g., authors, journals, species classification, coding regions).
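As a rough illustration of how a format lets a program tell the sequence apart from its documentation, the sketch below reads a FASTA-style layout, in which a line beginning with ">" carries the description and the lines that follow carry the sequence. The choice of this layout, the function name `read_fasta`, and the example records are assumptions made for illustration only; they are not taken from the chapter.

```python
import io
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (description, sequence) pairs from FASTA-style text lines.

    Hypothetical helper: a line starting with '>' is treated as
    documentation, every other non-blank line as sequence data.
    """
    description = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if description is not None:
                yield description, "".join(chunks)
            description = line[1:].strip()
            chunks = []
        elif description is not None:
            chunks.append(line)  # sequence line belonging to the current record
    if description is not None:
        yield description, "".join(chunks)


# Made-up records, used only to exercise the parser.
example = """>seq1 hypothetical example record
ACGTACGT
TTGA
>seq2 another made-up record
GGCC
"""

records = list(read_fasta(io.StringIO(example)))
for description, sequence in records:
    print(description, len(sequence))
```

Database formats that also structure the documentation would need a richer parser, for instance one that recognises labelled fields for authors or coding regions. The same principle applies: the layout tells the program which text is sequence and which is annotation.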


Keywords: Output File, Sequence Format, Sequence File, Index File, Genetic Computer Group



Copyright information

© Humana Press Inc., Totowa, NJ 1994

Authors and Affiliations

  • Cary O’Donnell, AFRC Computing Division, Harpenden, UK
