Abstract
The shortest superstring problem (SSP) is an \(NP\)-hard combinatorial optimization problem that has attracted considerable research interest because of its applications in computational molecular biology, such as DNA sequencing, and in computer science, such as string compression. This paper presents a new heuristic algorithm for solving large-scale instances of the SSP, which outperforms the natural greedy algorithm on the majority of the tested instances. The proposed method can provide multiple near-optimum solutions and admits a natural parallel implementation. Extensive computational experiments on a set of SSP instances with known optimum solutions indicate that the new method finds the optimum solution in most cases and that its average error relative to the optimum is close to zero.
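For orientation, the natural greedy algorithm used as the baseline above is commonly described as follows: repeatedly merge the two strings with the largest suffix-prefix overlap until one string remains. The sketch below is a minimal illustration of that baseline only, not the heuristic proposed in the paper; the function names `overlap` and `greedy_superstring`, the tie-breaking rule (first pair found wins), and the preprocessing step that discards contained strings are assumptions made for this example.

```python
from itertools import permutations


def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_superstring(strings: list[str]) -> str:
    """Greedy baseline: merge the pair with maximum overlap until one string is left."""
    # Assumed preprocessing: drop duplicates and strings contained in another string.
    pool = list(dict.fromkeys(strings))
    pool = [s for s in pool if not any(s != t and s in t for t in pool)]

    while len(pool) > 1:
        # Find the ordered pair (i, j) with the largest overlap; ties keep the first pair found.
        best_k, best_i, best_j = -1, 0, 1
        for i, j in permutations(range(len(pool)), 2):
            k = overlap(pool[i], pool[j])
            if k > best_k:
                best_k, best_i, best_j = k, i, j
        # Merge the pair by appending the non-overlapping tail of the second string.
        merged = pool[best_i] + pool[best_j][best_k:]
        pool = [s for idx, s in enumerate(pool) if idx not in (best_i, best_j)]
        pool.append(merged)

    return pool[0] if pool else ""


if __name__ == "__main__":
    fragments = ["abc", "bcd", "cde"]
    result = greedy_superstring(fragments)
    print(result, all(f in result for f in fragments))
```

This quadratic pair search is written for clarity rather than speed; it is far too slow for the large-scale instances the abstract refers to and serves only to make the baseline concrete.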
Acknowledgments
This research has been co-financed by the European Union (European Social Fund - ESF) and Greek national funds through the Operational Program “Education and Lifelong Learning” of the National Strategic Reference Framework (NSRF)—Research Funding Program: Heraclitus II. Investing in knowledge society through the European Social Fund.
Cite this article
Gevezes, T., Pitsoulis, L. A greedy randomized adaptive search procedure with path relinking for the shortest superstring problem. J Comb Optim 29, 859–883 (2015). https://doi.org/10.1007/s10878-013-9622-z
DOI: https://doi.org/10.1007/s10878-013-9622-z