Popping Superbubbles and Discovering Clumps: Recent Developments in Biological Sequence Analysis

  • Conference paper
WALCOM: Algorithms and Computation (WALCOM 2016)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9627)

Abstract

The information that can be inferred or predicted from knowing the genomic sequence of an organism is astonishing. String algorithms are critical to this process. This paper provides an overview of two particular problems that arise during computational molecular biology research, and recent algorithmic developments in solving them.

Author information

Corresponding author

Correspondence to Costas S. Iliopoulos.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Iliopoulos, C.S., Kundu, R., Mohamed, M., Vayani, F. (2016). Popping Superbubbles and Discovering Clumps: Recent Developments in Biological Sequence Analysis. In: Kaykobad, M., Petreschi, R. (eds) WALCOM: Algorithms and Computation. WALCOM 2016. Lecture Notes in Computer Science, vol 9627. Springer, Cham. https://doi.org/10.1007/978-3-319-30139-6_1

  • DOI: https://doi.org/10.1007/978-3-319-30139-6_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-30138-9

  • Online ISBN: 978-3-319-30139-6

  • eBook Packages: Computer Science, Computer Science (R0)
