Abstract
Solutions to genome scaffolding problems can be represented as paths and cycles in a “solution graph”. However, when repetitions are present, such a solution graph may contain branchings and may not be uniquely convertible into sequences. Having introduced, in previous work, various ways of extracting the unique parts of such solutions, we extend previously known NP-hardness results to the case where the solution graph is planar, bipartite, and subcubic, and show APX-completeness in this case. We also provide some practical tests.
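To make the notion of a branching concrete, here is a minimal Python sketch. It assumes a simplification that the abstract does not state: the solution graph is given as a plain undirected edge list, and a branching is any vertex with more than two incident edges. The function names and the toy instances are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def branching_vertices(edges):
    """Return, in sorted order, the vertices with more than two incident edges.

    Assumption: the solution graph is an undirected edge list of (u, v) pairs.
    """
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(v for v, d in degree.items() if d > 2)


def is_paths_and_cycles(edges):
    """True if every vertex has degree at most two, so that each connected
    component is a simple path or a simple cycle."""
    return not branching_vertices(edges)


# Hypothetical toy instances (not data from the paper).
# Without repeats, the solution is a single path a-b-c-d.
unique = [("a", "b"), ("b", "c"), ("c", "d")]
# A repeated vertex "r" is shared by two walks, a-r-b and c-r-d.
repeated = [("a", "r"), ("r", "b"), ("c", "r"), ("r", "d")]

print(branching_vertices(unique))
print(branching_vertices(repeated))
```

In the second instance the graph alone does not say whether the walk entering "r" from "a" continues to "b" or to "d", which is the kind of ambiguity that prevents a unique conversion into sequences.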
Acknowledgments
This work was supported by the Institut de Biologie Computationnelle (http://www.ibc-montpellier.fr/) (ANR Projet Investissements d’Avenir en bioinformatique IBC).
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Davot, T., Chateau, A., Giroudeau, R., Weller, M. (2018). On the Hardness of Approximating Linearization of Scaffolds Sharing Repeated Contigs. In: Blanchette, M., Ouangraoua, A. (eds) Comparative Genomics. RECOMB-CG 2018. Lecture Notes in Computer Science, vol 11183. Springer, Cham. https://doi.org/10.1007/978-3-030-00834-5_5
DOI: https://doi.org/10.1007/978-3-030-00834-5_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00833-8
Online ISBN: 978-3-030-00834-5
eBook Packages: Computer Science (R0)