Skip to main content

On a Fixed Haplotype Variant of the Minimum Error Correction Problem

  • Conference paper
  • First Online:
  • 1364 Accesses

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 10976))

Abstract

Haplotype assembly is the problem of reconstructing the two parental chromosomes of an individual from a set of sampled DNA-sequences. A combinatorial optimization problem that models haplotype assembly is the Minimum Error Correction problem (MEC). This problem has been intensively studied in the computational biology literature and is also known in the clustering literature: essentially we are required to find two cluster centres such that the sum of distances to the nearest centre, is minimized. We introduce here the problem Fixed haplotype-Minimum Error Correction (FH-MEC), a new variant of MEC which corresponds to instances where one of the haplotypes/centres is already given. We provide hardness results for the problem on various restricted instances. We also propose a new and very simple 2-approximation algorithm for MEC on binary input matrices.

This is a preview of subscription content, log in via an institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD   39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD   54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Notes

  1. 1.

    A binary matrix M is conflict-free if the rows of M can be bipartitioned into two sets such that the corresponding entries on each set are identical.

References

  1. Alimonti, P., Kann, V.: Hardness of approximating problems on cubic graphs. In: Bongiovanni, G., Bovet, D.P., Di Battista, G. (eds.) CIAC 1997. LNCS, vol. 1203, pp. 288–298. Springer, Heidelberg (1997). https://doi.org/10.1007/3-540-62592-5_80

    Chapter  Google Scholar 

  2. Bansal, V., Bafna, V.: HapCUT: an efficient and accurate algorithm for the haplotype assembly problem. Bioinformatics 24(16), i153–i159 (2008)

    Article  Google Scholar 

  3. Bonizzoni, P., Dondi, R., Klau, G.W., Pirola, Y., Pisanti, N., Zaccaria, S.: On the minimum error correction problem for haplotype assembly in diploid and polyploid genomes. J. Comput. Biol. 23(9), 718–736 (2016)

    Article  MathSciNet  Google Scholar 

  4. Cilibrasi, R., Van Iersel, L., Kelk, S., Tromp, J.: The complexity of the single individual SNP haplotyping problem. Algorithmica 49(1), 13–36 (2007)

    Article  MathSciNet  Google Scholar 

  5. International HapMap Consortium, et al.: A haplotype map of the human genome. Nature 437(7063), 1299 (2005)

    Google Scholar 

  6. Downey, R.G., Fellows, M.R.: Fundamentals of Parameterized Complexity, vol. 201. Springer, London (2016). https://doi.org/10.1007/978-1-4471-5559-1

    Book  MATH  Google Scholar 

  7. Etemadi, M., Bagherian, M., Chen, Z.-Z., Wang, L.: Better ILP models for haplotype assembly. BMC Bioinform. 19(1), 52 (2018)

    Article  Google Scholar 

  8. Feige, U.: NP-hardness of hypercube 2-segmentation (2014). arXiv preprint arXiv:1411.0821

  9. Jiao, Y., Xu, J., Li, M.: On the k-closest substring and k-consensus pattern problems. In: Sahinalp, S.C., Muthukrishnan, S., Dogrusoz, U. (eds.) CPM 2004. LNCS, vol. 3109, pp. 130–144. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-27801-6_10

    Chapter  Google Scholar 

  10. Lancia, G., Bafna, V., Istrail, S., Lippert, R., Schwartz, R.: SNPs problems, complexity, and algorithms. In: auf der Heide, F.M. (ed.) ESA 2001. LNCS, vol. 2161, pp. 182–193. Springer, Heidelberg (2001). https://doi.org/10.1007/3-540-44676-1_15

    Chapter  MATH  Google Scholar 

  11. Lippert, R., Schwartz, R., Lancia, G., Istrail, S.: Algorithmic strategies for the single nucleotide polymorphism haplotype assembly problem. Brief. Bioinform. 3(1), 23–31 (2002)

    Article  Google Scholar 

  12. Ostrovsky, R., Rabani, Y.: Polynomial-time approximation schemes for geometric min-sum median clustering. J. ACM 49(2), 139–156 (2002)

    Article  MathSciNet  Google Scholar 

  13. Papadimitriou, C.H., Yannakakis, M.: Optimization, approximation, and complexity classes. J. Comput. Syst. Sci. 43(3), 425–440 (1991)

    Article  MathSciNet  Google Scholar 

  14. Phelps, K.T., Rifa, J., Villanueva, M.: Rank and kernel of binary hadamard codes. IEEE Trans. Inf. Theory 51(11), 3931–3937 (2005)

    Article  MathSciNet  Google Scholar 

Download references

Acknowledgements

The last author acknowledges the support of an NWO TOP 2 grant.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Georgios Stamoulis .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Goblet, A., Kelk, S., Mihalák, M., Stamoulis, G. (2018). On a Fixed Haplotype Variant of the Minimum Error Correction Problem. In: Wang, L., Zhu, D. (eds) Computing and Combinatorics. COCOON 2018. Lecture Notes in Computer Science(), vol 10976. Springer, Cham. https://doi.org/10.1007/978-3-319-94776-1_46

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-94776-1_46

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-94775-4

  • Online ISBN: 978-3-319-94776-1

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics