Insights on Haplotype Inference on Large Genotype Datasets

  • Conference paper
Advances in Bioinformatics and Computational Biology (BSB 2010)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 6268)

Abstract

In this paper we present insights on the problem of haplotype inference for large genotype datasets. Our observations are drawn from an extensive comparison of three haplotype inference methods on several datasets taken from HapMap. The methods chosen, PTG, Haplorec, and fastPHASE, are among the best known; they are based on different approaches and can handle large amounts of data. Our analysis considers execution time and the accuracy of results, measured by the Error Rate and the Switch Error, as well as sequence conservation patterns. The results show that (1) fastPHASE and Haplorec are both more accurate than PTG; (2) fastPHASE is computationally the most expensive of the three methods, while Haplorec may fail to resolve long sequences; and (3) all approaches do better with more conserved sequences and tend to fail at distinct sequence sites.
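
The abstract names the Error Rate and the Switch Error without defining them. The sketch below illustrates definitions commonly used in the haplotype inference literature, which may differ from the ones used in the paper: the error rate as the fraction of individuals whose inferred haplotype pair is not exactly the true pair, and the switch error as the fraction of consecutive heterozygous sites whose relative phase is wrong. All function names are hypothetical, haplotypes are assumed to be equal-length strings over {0, 1}, and each inferred pair is assumed to be consistent with the individual's genotype.

```python
from typing import List, Tuple

Pair = Tuple[str, str]  # two haplotypes of one individual, e.g. ("0101", "1010")


def switch_errors(true_pair: Pair, inferred_pair: Pair) -> Tuple[int, int]:
    """Return (switches, opportunities) for one individual.

    Assumes the inferred pair is consistent with the true genotype,
    so both pairs are heterozygous at the same sites.
    """
    t1, t2 = true_pair
    i1, _ = inferred_pair
    # Heterozygous sites: positions where the two true haplotypes differ.
    het = [k for k in range(len(t1)) if t1[k] != t2[k]]
    # At each heterozygous site, does inferred haplotype 1 follow true haplotype 1?
    phase = [i1[k] == t1[k] for k in het]
    # A switch happens whenever that answer changes between neighbouring sites.
    switches = sum(1 for a, b in zip(phase, phase[1:]) if a != b)
    opportunities = max(len(het) - 1, 0)
    return switches, opportunities


def switch_error_rate(true_pairs: List[Pair], inferred_pairs: List[Pair]) -> float:
    """Total switches divided by total switch opportunities over all individuals."""
    total_sw = total_op = 0
    for t, i in zip(true_pairs, inferred_pairs):
        s, o = switch_errors(t, i)
        total_sw += s
        total_op += o
    return total_sw / total_op if total_op else 0.0


def individual_error_rate(true_pairs: List[Pair], inferred_pairs: List[Pair]) -> float:
    """Fraction of individuals whose inferred pair differs from the true pair.

    Pairs are compared as unordered: ("a", "b") equals ("b", "a").
    """
    wrong = sum(1 for t, i in zip(true_pairs, inferred_pairs) if sorted(t) != sorted(i))
    return wrong / len(true_pairs)


if __name__ == "__main__":
    # Toy data, invented for illustration only (not taken from HapMap).
    truth = [("0101", "1010"), ("0011", "0011")]
    guess = [("0110", "1001"), ("0011", "0011")]
    print(individual_error_rate(truth, guess))
    print(switch_error_rate(truth, guess))
```

Under these assumed definitions, a single phase switch in a long sequence makes the whole individual count as an error for the error rate while contributing only one switch to the switch error, which is one reason the two measures are often reported together.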

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Rosa, R.S., Guimarães, K.S. (2010). Insights on Haplotype Inference on Large Genotype Datasets. In: Ferreira, C.E., Miyano, S., Stadler, P.F. (eds) Advances in Bioinformatics and Computational Biology. BSB 2010. Lecture Notes in Computer Science, vol 6268. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15060-9_5

  • DOI: https://doi.org/10.1007/978-3-642-15060-9_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15059-3

  • Online ISBN: 978-3-642-15060-9

  • eBook Packages: Computer Science, Computer Science (R0)
