Abstract
We present a simple algorithm for reconstructing a DNA sequence from a set of fragments generated in a shotgun sequencing project. The algorithm is based on rigorous detection of overlaps among fragments. We report assembly results for the algorithm on two genomic data sets.
Copyright information
© 1996 Springer Science+Business Media New York
About this paper
Cite this paper
Huang, X. (1996). Assembly of Shotgun Sequencing Data. In: Speed, T., Waterman, M.S. (eds) Genetic Mapping and DNA Sequencing. The IMA Volumes in Mathematics and its Applications, vol 81. Springer, New York, NY. https://doi.org/10.1007/978-1-4612-0751-1_11
DOI: https://doi.org/10.1007/978-1-4612-0751-1_11
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4612-6890-1
Online ISBN: 978-1-4612-0751-1
eBook Packages: Springer Book Archive