Skip to main content

Detecting Superbubbles in Assembly Graphs

  • Conference paper
Book cover Algorithms in Bioinformatics (WABI 2013)

Part of the book series: Lecture Notes in Computer Science ((LNBI,volume 8126))

Included in the following conference series:

Abstract

We introduce a new concept of a subgraph class called a superbubble for analyzing assembly graphs, and propose an efficient algorithm for detecting it. Most assembly algorithms utilize assembly graphs like the de Bruijn graph or the overlap graph constructed from reads. From these graphs, many assembly algorithms first detect simple local graph structures (motifs), such as tips and bubbles, mainly to find sequencing errors. These motifs are easy to detect, but they are sometimes too simple to deal with more complex errors. The superbubble is an extension of the bubble, which is also important for analyzing assembly graphs. Though superbubbles are much more complex than ordinary bubbles, we show that they can be efficiently enumerated. We propose an average-case linear time algorithm (i.e., O(n + m) for a graph with n vertices and m edges) for graphs with a reasonable model, though the worst-case time complexity of our algorithm is quadratic (i.e., O(n(n + m))). Moreover, the algorithm is practically very fast: Our experiments show that our algorithm runs in reasonable time with a single CPU core even against a very large graph of a whole human genome.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Batzoglou, S., Jaffe, D.B., Stanley, K., Butler, J., Gnerre, S., Mauceli, E., Berger, B., Mesirov, J.P., Lander, E.S.: Arachne: a whole-genome shotgun assembler. Genome Research 12, 177–189 (2002)

    Article  Google Scholar 

  2. Bowe, A., Onodera, T., Sadakane, K., Shibuya, T.: Succinct de bruijn graphs. In: Raphael, B., Tang, J. (eds.) WABI 2012. LNCS, vol. 7534, pp. 225–235. Springer, Heidelberg (2012)

    Chapter  Google Scholar 

  3. Huang, X., Yang, S.P.: Generating a genome assembly with pcap. Current Protocols in Bioinformatics, Unit 11.3 (2005)

    Google Scholar 

  4. Jackson, B., Regennitter, M., Yang, X., Schnable, P.S., Aluru, S.: Parallel de novo assembly of large genomes from high-throughput short reads. In: Proc. 24th International Parallel and Distributed Processing Symposium (IPDPS), pp. 1–10 (2010)

    Google Scholar 

  5. Kasahara, M., Morishita, S.: Large-Scale Genome Sequence Processing. Imperial College Press (2006)

    Google Scholar 

  6. Li, R., Zhu, H., Ruan, J., Qjan, W., Fang, X., Shi, Z., Li, Y., Li, S., Shan, G., Kristiansen, K., Yang, H., Wang, J.: De novo assembly of human genomes with massively parallel short read sequencing. Genome Research 20, 265–272 (2010)

    Article  Google Scholar 

  7. Lyons, R., Peres, Y.: Probability on Trees and Networks. Cambridge University Press (2012) (in preparation), Current version available at http://mypage.iu.edu/string~rdlyons/

  8. MacCallum, I., Przybylski, D., Gnerre, S., Burton, J., Shlyakhter, I., Gnirke, A., Malek, J., McKernan, K., Ranade, S., Shea, T.P., Williams, L., Young, S., Nusbaum, C., Jaffe, D.B.: Allpaths 2: small genomes assembled accurately and with high continuity from short paired reads. Genome Biology 10(R103) (2009)

    Google Scholar 

  9. Miller, J.R., Koren, S., Sutton, G.: Assembly algorithms for next-generation sequencing data. Genomics 95, 315–327 (2010)

    Article  Google Scholar 

  10. Myers, E.W.: Toward simplifying and accurately formulating fragment assembly. Journal of Comutational Biology 2, 275–290 (1995)

    Article  Google Scholar 

  11. Myers, E.W., Sutton, G.G., Delcher, A.L., Dew, I.M., Fasulo, D.P., Flanigan, M.J., Kravitz, S.A., Mobarry, C.M., Reinert, K.H.J., Remington, K.A., Anson, E.L., Bolanos, R.A., Chou, H., Jordan, C.M., Halpern, A.L., Lonardi, S., Beasley, E.M., Brandon, R.C., Chen, L., Dunn, P.J., Lai, Z., Liang, Y., Nusskern, D.R., Zhan, M., Zhang, Q., Zheng, X., Rubin, G.M., Adams, M.D., Venter, J.C.: A whole-genome assembly of drosophila. Science 287, 2196–2204 (2000)

    Article  Google Scholar 

  12. Nurk, S., et al.: Assembling genomes and mini-metagenomes from highly chimeric reads. In: Deng, M., Jiang, R., Sun, F., Zhang, X. (eds.) RECOMB 2013. LNCS, vol. 7821, pp. 158–170. Springer, Heidelberg (2013)

    Chapter  Google Scholar 

  13. Pevzner, P.A., Tang, H., Waterman, M.S.: An eulerian path approach to dna fragment assembly. Proceedings of the National Academy of Sciences 98, 9748–9753 (2001)

    Article  MathSciNet  MATH  Google Scholar 

  14. Pop, M.: Genome assembly reborn: recent computational challenges. Briefings in Bioinformatics 10(4), 354–366 (2009)

    Article  Google Scholar 

  15. Sahli, M., Shibuya, T.: Arapan-s: a fast and highly accurate whole-genome assembly software for viruses and small genomes. BMC Research Notes 5(243) (2012)

    Google Scholar 

  16. Simpson, J.T., Wong, K., Jackman, S.D., Schein, J.E., Jones, S.J.: Abyss: a parallel assembler for short read sequence data. Genome Research 19, 1117–1123 (2009)

    Article  Google Scholar 

  17. Zerbino, D.R., Birney, E.: Velvet: algorithms for de novo short read assembly using de bruijn graphs. Genome Research 18, 821–829 (2008)

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Onodera, T., Sadakane, K., Shibuya, T. (2013). Detecting Superbubbles in Assembly Graphs. In: Darling, A., Stoye, J. (eds) Algorithms in Bioinformatics. WABI 2013. Lecture Notes in Computer Science(), vol 8126. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40453-5_26

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-40453-5_26

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-40452-8

  • Online ISBN: 978-3-642-40453-5

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics