Estimating Population Size via Line Graph Reconstruction

  • Conference paper
Algorithms in Bioinformatics (WABI 2012)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 7534)

Abstract

We propose a novel graph-theoretic method to estimate haplotype population size from genotype data. The method considers only the potential sharing of haplotypes between individuals and is based on transforming the graph of potential haplotype sharing into a line graph using a minimum number of edge and vertex deletions. We show that the corresponding deletion problems are NP-complete and provide exact integer programming solutions for them. We test our approach using extensive simulations of multiple population evolution and genotype sampling scenarios. Our computational experiments show that when most of the sharings are true sharings, the problem can be solved very fast and the estimated size is very close to the true size; when many of the potential sharings do not stem from true haplotype sharing, our method gives reasonable lower bounds on the underlying number of haplotypes. In comparison, a naive approach of phasing the input genotypes provides trivial upper bounds of twice the number of genotypes.
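
To make the line graph model concrete, the following is a minimal brute-force sketch, not the integer programming formulation of the paper. It assumes that every genotype is heterozygous and corresponds to a distinct pair of haplotypes, so that a sharing graph is consistent exactly when it is the line graph of a simple graph whose vertices are haplotypes. The function names `root_labeling` and `min_vertex_deletions` are hypothetical, only the vertex-deletion variant is shown, and the exhaustive search is practical only for a handful of genotypes.

```python
from itertools import combinations


def root_labeling(n, edges):
    """Return one haplotype pair per genotype, or None if no labeling exists.

    Genotypes are vertices 0..n-1 of the sharing graph. Each genotype gets a
    pair (a, b) of distinct, hypothetical haplotype labels such that two
    genotypes are adjacent iff their pairs intersect. Pairs must be distinct
    (assumption: the root graph is simple), so a labeling exists exactly
    when the sharing graph is a line graph.
    """
    adj = [[False] * n for _ in range(n)]
    for u, v in edges:
        adj[u][v] = adj[v][u] = True
    pairs = []

    def extend(i, used):
        # Labels 0..used-1 are already in use; fresh labels are introduced
        # in increasing order to avoid trying equivalent relabelings.
        if i == n:
            return True
        for a in range(used + 1):
            for b in range(a + 1, max(used, a + 1) + 1):
                if (a, b) in pairs:
                    continue
                cand = {a, b}
                if all(bool(cand & set(pairs[j])) == adj[i][j] for j in range(i)):
                    pairs.append((a, b))
                    if extend(i + 1, max(used, b + 1)):
                        return True
                    pairs.pop()
        return False

    return list(pairs) if extend(0, 0) else None


def min_vertex_deletions(n, edges):
    """Smallest set of genotypes whose removal leaves a line graph.

    Tries deletion sets in order of increasing size, so the first hit is
    minimum. Returns (removed_vertices, pairs_for_remaining_vertices).
    """
    for k in range(n + 1):
        for removed in combinations(range(n), k):
            keep = [v for v in range(n) if v not in removed]
            index = {v: i for i, v in enumerate(keep)}
            sub = [(index[u], index[v]) for u, v in edges
                   if u in index and v in index]
            pairs = root_labeling(len(keep), sub)
            if pairs is not None:
                return removed, pairs


# The claw: genotype 0 potentially shares with 1, 2 and 3, which share
# nothing with each other. Genotype 0 has only two haplotypes, so this
# sharing pattern cannot be realized.
claw = [(0, 1), (0, 2), (0, 3)]
removed, pairs = min_vertex_deletions(4, claw)
haplotypes = {h for pair in pairs for h in pair}
```

For the claw no labeling exists, and deleting a single genotype suffices. The number of distinct labels in the returned pairs is the haplotype count of one root graph consistent with the remaining sharings.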

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Halldórsson, B.V., Blokh, D., Sharan, R. (2012). Estimating Population Size via Line Graph Reconstruction. In: Raphael, B., Tang, J. (eds) Algorithms in Bioinformatics. WABI 2012. Lecture Notes in Computer Science, vol 7534. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33122-0_6

  • DOI: https://doi.org/10.1007/978-3-642-33122-0_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-33121-3

  • Online ISBN: 978-3-642-33122-0

  • eBook Packages: Computer Science, Computer Science (R0)
