Efficient Haplotype Inference with Pseudo-boolean Optimization

Graça, Ana; Marques-Silva, João; Lynce, Inês; Oliveira, Arlindo L.

doi:10.1007/978-3-540-73433-8_10


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4545)

Included in the following conference series:

International Conference on Algebraic Biology


Abstract

Haplotype inference from genotype data is a key computational problem in bioinformatics, since retrieving haplotype information directly from DNA samples is not feasible with existing technology. One method for solving this problem uses the pure parsimony criterion, an approach known as Haplotype Inference by Pure Parsimony (HIPP). Initial work in this area was based on a number of different Integer Linear Programming (ILP) models and branch-and-bound algorithms. Recent work has shown that a Boolean Satisfiability (SAT) formulation, combined with state-of-the-art SAT solvers, represents the most efficient approach for solving the HIPP problem.
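
To make the pure parsimony criterion concrete, the following is a minimal brute-force sketch, not the method of the paper. It assumes the common encoding in which each genotype site is 0 or 1 (homozygous) or 2 (heterozygous), and each haplotype site is 0 or 1; the abstract does not fix an encoding, so this is an assumption. The function names `explains` and `hipp_brute_force` are hypothetical. The search is exponential and only suitable for toy inputs.

```python
from itertools import combinations, product


def explains(h1, h2, genotype):
    """Return True if the haplotype pair (h1, h2) explains the genotype.

    Assumed encoding: genotype site 0 or 1 means both haplotypes carry
    that value; genotype site 2 means the two haplotypes differ there.
    """
    for a, b, g in zip(h1, h2, genotype):
        if g == 2:
            if a == b:
                return False
        elif not (a == b == g):
            return False
    return True


def hipp_brute_force(genotypes):
    """Return one smallest set of haplotypes explaining every genotype.

    Tries haplotype sets in order of increasing size, so the first set
    found is minimal. Exponential in the number of sites.
    """
    if not genotypes:
        return set()
    n_sites = len(genotypes[0])
    candidates = list(product((0, 1), repeat=n_sites))
    for k in range(1, len(candidates) + 1):
        for subset in combinations(candidates, k):
            # A genotype may be explained by two copies of the same
            # haplotype, so h1 and h2 range over the whole subset.
            if all(
                any(explains(h1, h2, g) for h1 in subset for h2 in subset)
                for g in genotypes
            ):
                return set(subset)
    return set()


if __name__ == "__main__":
    genotypes = [(2, 0, 1), (2, 2, 1), (0, 0, 1)]
    solution = hipp_brute_force(genotypes)
    print(len(solution), sorted(solution))
```

For the three toy genotypes above, the first genotype forces the haplotypes (0, 0, 1) and (1, 0, 1), and the second needs one additional haplotype, so any minimal solution has three haplotypes. Exact methods such as the ILP, SAT, and PBO formulations mentioned in the abstract replace this enumeration with a solver.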

Motivated by the promising results obtained using SAT techniques, this paper investigates the use of modern Pseudo-Boolean Optimization (PBO) algorithms for solving the HIPP problem. The paper starts by applying PBO to existing ILP models. The results are promising and motivate the development of a new PBO model (RPoly) for the HIPP problem, which has a compact representation and eliminates key symmetries. Experimental results indicate that RPoly outperforms the SAT-based approach on most problem instances and is, in general, significantly more efficient.
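
For orientation, a PBO problem in its usual textbook form minimizes a linear objective over 0/1 variables subject to linear inequalities over literals. The abstract does not spell this form out, and the block below is not the RPoly model; the symbols $c_j$, $a_{ij}$, $b_i$, $x_j$ and $l_j$ are notation assumed here for illustration, with integer coefficients and $\bar{x}_j = 1 - x_j$.

```latex
\begin{aligned}
\text{minimize}\quad   & \sum_{j=1}^{n} c_j\, x_j \\
\text{subject to}\quad & \sum_{j=1}^{n} a_{ij}\, l_j \ge b_i, \qquad i = 1, \dots, m, \\
                       & x_j \in \{0, 1\}, \qquad l_j \in \{x_j,\ \bar{x}_j\}, \qquad j = 1, \dots, n.
\end{aligned}
```

Because the variables are Boolean, an ILP model whose variables are all 0/1 can be read directly as a PBO instance, which is consistent with the abstract's first step of applying PBO to existing ILP models.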



Author information

Authors and Affiliations

IST/INESC-ID, Technical University of Lisbon, Portugal
Ana Graça, Inês Lynce & Arlindo L. Oliveira
School of Electronics and Computer Science, University of Southampton, UK
João Marques-Silva


Editor information

Hirokazu Anai, Katsuhisa Horimoto, Temur Kutsia


About this paper

Cite this paper

Graça, A., Marques-Silva, J., Lynce, I., Oliveira, A.L. (2007). Efficient Haplotype Inference with Pseudo-boolean Optimization. In: Anai, H., Horimoto, K., Kutsia, T. (eds) Algebraic Biology. AB 2007. Lecture Notes in Computer Science, vol 4545. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73433-8_10


DOI: https://doi.org/10.1007/978-3-540-73433-8_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73432-1
Online ISBN: 978-3-540-73433-8
eBook Packages: Computer Science, Computer Science (R0)
