A greedy randomized adaptive search procedure with path relinking for the shortest superstring problem

Journal of Combinatorial Optimization

Abstract

The shortest superstring problem (SSP) is an \(NP\)-hard combinatorial optimization problem that has attracted the interest of many researchers because of its applications in computational molecular biology, such as DNA sequencing, and in computer science, such as string compression. This paper presents a new heuristic algorithm for solving large-scale instances of the SSP, which outperforms the natural greedy algorithm in the majority of the tested instances. The proposed method can provide multiple near-optimum solutions and admits a natural parallel implementation. Extensive computational experiments on a set of SSP instances with known optimum solutions indicate that the new method finds the optimum solution in most cases, and that its average error relative to the optimum is close to zero.
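For orientation, the natural greedy algorithm used as the baseline above is commonly described as repeatedly merging the two strings with the largest overlap. The following is a minimal Python sketch of that baseline only, not of the method proposed in the paper; the function names `overlap` and `greedy_superstring`, the preprocessing step, and the tie-breaking rule (first pair found) are assumptions made for illustration.

```python
def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_superstring(strings):
    """Greedy baseline: merge the pair with maximum overlap until one string is left."""
    # Assumed preprocessing: drop duplicates and strings contained in another string.
    items = list(dict.fromkeys(strings))
    items = [s for s in items if not any(s != t and s in t for t in items)]

    while len(items) > 1:
        best_k, best_i, best_j = -1, 0, 1
        # Scan all ordered pairs; ties are broken by the first pair found (an assumption).
        for i, a in enumerate(items):
            for j, b in enumerate(items):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        # Merge the chosen pair, writing the overlapping part only once.
        merged = items[best_i] + items[best_j][best_k:]
        items = [s for idx, s in enumerate(items) if idx not in (best_i, best_j)]
        items.append(merged)

    return items[0] if items else ""


if __name__ == "__main__":
    print(greedy_superstring(["abc", "bcd", "cde"]))
```

This sketch always takes the single best pair. A greedy randomized construction, as in GRASP generally, would instead choose at random among several high-overlap candidates, which is one way such methods can produce multiple different near-optimum solutions.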

Acknowledgments

This research has been co-financed by the European Union (European Social Fund - ESF) and Greek national funds through the Operational Program “Education and Lifelong Learning” of the National Strategic Reference Framework (NSRF)—Research Funding Program: Heraclitus II. Investing in knowledge society through the European Social Fund.

Author information

Corresponding author

Correspondence to Theodoros Gevezes.

About this article

Cite this article

Gevezes, T., Pitsoulis, L. A greedy randomized adaptive search procedure with path relinking for the shortest superstring problem. J Comb Optim 29, 859–883 (2015). https://doi.org/10.1007/s10878-013-9622-z

  • DOI: https://doi.org/10.1007/s10878-013-9622-z
