Abstract
Given a finite collection of strings of letters from a fixed alphabet, it is of interest, in the contexts of data compression and DNA sequencing, to find the length of the shortest string which contains each of the given strings as a consecutive substring. In order to analyze the average behavior of the optimal superstring length, substrings with a specified collection of lengths are considered with the letters selected independently at random. An asymptotic expression, as the collection of lengths becomes large, is obtained for the savings from compression, that is, the difference between the uncompressed (concatenated) length and the optimal superstring length.
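The quantity studied here, the savings from compression, can be illustrated on a tiny instance. The following is a minimal brute-force sketch in Python, not the method of the paper: it assumes only a handful of short strings (it tries every ordering, so it is exponential in the number of strings), and the function names `overlap`, `shortest_superstring_length`, and `savings` are hypothetical names chosen for this illustration.

```python
import random
from itertools import permutations


def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def shortest_superstring_length(strings):
    """Brute-force optimal superstring length (illustration only)."""
    # Drop duplicates, then drop any string already contained in another,
    # since such strings never lengthen the optimal superstring.
    uniq = list(dict.fromkeys(strings))
    core = [s for s in uniq if not any(s != t and s in t for t in uniq)]
    if not core:
        return 0
    best = None
    # Try every ordering, merging consecutive strings at their maximal overlap.
    for order in permutations(core):
        total = len(order[0]) + sum(
            len(b) - overlap(a, b) for a, b in zip(order, order[1:])
        )
        if best is None or total < best:
            best = total
    return best


def savings(strings):
    """Concatenated length minus optimal superstring length."""
    return sum(len(s) for s in strings) - shortest_superstring_length(strings)


if __name__ == "__main__":
    # Letters selected independently at random; alphabet and sizes are
    # arbitrary choices for this sketch.
    rng = random.Random(0)
    sample = ["".join(rng.choices("ACGT", k=6)) for _ in range(5)]
    print(sample, savings(sample))
```

For example, the strings `abc`, `bcd`, and `cde` have concatenated length 9 and fit inside the superstring `abcde` of length 5, so the savings is 4.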
Research supported by NSF grant DMS-9206139.
The author would like to thank M. Waterman and G. Benson for helpful discussions; M. Waterman in particular suggested considering problems of this type.
© 1994 Springer-Verlag Berlin Heidelberg
Cite this paper
Alexander, K.S. (1994). Shortest common superstrings for strings of random letters. In: Crochemore, M., Gusfield, D. (eds) Combinatorial Pattern Matching. CPM 1994. Lecture Notes in Computer Science, vol 807. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-58094-8_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-58094-2
Online ISBN: 978-3-540-48450-9
eBook Packages: Springer Book Archive