A Practical Parameterized Algorithm for the Individual Haplotyping Problem MLF

  • Conference paper
Theory and Applications of Models of Computation (TAMC 2008)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4978)

Abstract

The individual haplotyping problem Minimum Letter Flip (MLF) is a computational problem that, given a set of aligned DNA sequence fragments from an individual, infers the corresponding haplotypes by flipping the minimum number of SNP values. No practical exact algorithm for the problem has been available. In DNA sequencing experiments, due to technical limits, the maximum length of a fragment sequenced directly is about 1 kb. Consequently, with a genome-average SNP density of 1.84 SNPs per 1 kb of DNA sequence, the maximum number \(k_1\) of SNP sites that a fragment covers is usually small. Moreover, to save time and money, the maximum number \(k_2\) of fragments that cover a SNP site is usually no more than 19. Based on these properties of fragment data, this paper introduces a new parameterized algorithm with running time \(O(nk_2 2^{k_2} + m\log m + mk_1)\), where \(m\) is the number of fragments and \(n\) is the number of SNP sites. The algorithm solves the MLF problem efficiently even when \(m\) and \(n\) are large, which makes it more practical in real biological applications.
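To make the problem statement concrete, the following is a minimal sketch of MLF on a toy instance. It is an exhaustive search over all haplotype pairs, not the parameterized algorithm of the paper, and it is only usable for a handful of SNP sites. The encoding (`"0"`/`"1"` for the two alleles, `"-"` for an uncalled site), the function names, and the reading of "covers" as the span from a fragment's first to its last called site are assumptions made for illustration.

```python
from itertools import product

HOLE = "-"  # assumed marker for a site the fragment does not call


def mismatches(fragment, haplotype):
    """Count called sites where the fragment disagrees with the haplotype."""
    return sum(1 for f, h in zip(fragment, haplotype) if f != HOLE and f != h)


def mlf_brute_force(fragments):
    """Try every pair of haplotypes and return (min_flips, h1, h2).

    Each fragment is charged the flips needed to match the closer haplotype.
    Exponential in the number of sites; for illustration only.
    """
    n = len(fragments[0])
    best = None
    for h1 in product("01", repeat=n):
        for h2 in product("01", repeat=n):
            cost = sum(min(mismatches(f, h1), mismatches(f, h2))
                       for f in fragments)
            if best is None or cost < best[0]:
                best = (cost, "".join(h1), "".join(h2))
    return best


def k1_k2(fragments):
    """Return (k1, k2) under the span-based reading of 'covers' (assumption)."""
    n = len(fragments[0])
    spans = []
    for f in fragments:
        called = [j for j, c in enumerate(f) if c != HOLE]
        if called:
            spans.append((called[0], called[-1]))
    k1 = max(end - start + 1 for start, end in spans)
    k2 = max(sum(1 for start, end in spans if start <= j <= end)
             for j in range(n))
    return k1, k2


if __name__ == "__main__":
    fragments = ["0011", "00-1", "1100", "-100", "0111"]
    print(mlf_brute_force(fragments))  # (1, '0011', '1100')
    print(k1_k2(fragments))            # (4, 5)
```

In this toy instance a single flip (site 2 of the last fragment) makes every fragment consistent with one of the haplotypes `0011` and `1100`. For scale, at the abstract's bound \(k_2 = 19\) the factor \(k_2 2^{k_2}\) equals \(19 \cdot 524288 = 9961472\), roughly \(10^7\) per SNP site and independent of \(m\) and \(n\).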

This research was supported in part by the National Natural Science Foundation of China under Grant Nos. 60433020 and 60773111, the Program for New Century Excellent Talents in University No. NCET-05-0683, the Program for Changjiang Scholars and Innovative Research Team in University No. IRT0661, and the Scientific Research Fund of Hunan Provincial Education Department under Grant No. 06C526.

Editor information

Manindra Agrawal Dingzhu Du Zhenhua Duan Angsheng Li

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Xie, M., Wang, J., Chen, J. (2008). A Practical Parameterized Algorithm for the Individual Haplotyping Problem MLF. In: Agrawal, M., Du, D., Duan, Z., Li, A. (eds) Theory and Applications of Models of Computation. TAMC 2008. Lecture Notes in Computer Science, vol 4978. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-79228-4_38

  • DOI: https://doi.org/10.1007/978-3-540-79228-4_38

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-79227-7

  • Online ISBN: 978-3-540-79228-4

  • eBook Packages: Computer Science, Computer Science (R0)
