A New Integer Programming Formulation for the Pure Parsimony Problem in Haplotype Analysis

  • Daniel G. Brown
  • Ian M. Harrower
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3240)


We present a new integer programming formulation for the haplotype inference by pure parsimony (HIPP) problem. Unlike a previous approach to this problem [2], we create an integer program whose size is polynomial in the size of the input. This IP is substantially smaller for moderate-sized instances of the HIPP problem. We also show several additional constraints, based on the input, that can be added to the IP to aid in finding a solution, and show how to determine efficiently which of these constraints are active for a given instance. We present experimental results that show our IP has comparable success to the formulation of Gusfield [2] on moderate-sized problems, though it is much slower. However, our formulation can sometimes solve substantially larger problems than are practical with Gusfield’s formulation.
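To make the HIPP objective concrete, the following is a minimal sketch that solves a tiny instance by exhaustive search. It is not the integer program described in the abstract, nor the formulation of [2]. It assumes a common encoding that the abstract does not state: each genotype is a tuple over {0, 1, 2}, where 0 or 1 means both haplotypes carry that value at the site and 2 means they differ. The function names `explaining_pairs` and `min_haplotypes` are hypothetical.

```python
from itertools import product


def explaining_pairs(genotype):
    """Return every unordered haplotype pair that explains one genotype.

    Assumed encoding: 0 or 1 = both haplotypes share that value,
    2 = the two haplotypes differ at this site.
    """
    het = [i for i, g in enumerate(genotype) if g == 2]
    base = [0 if g == 2 else g for g in genotype]
    pairs = set()
    for bits in product((0, 1), repeat=len(het)):
        h1, h2 = list(base), list(base)
        for site, b in zip(het, bits):
            h1[site], h2[site] = b, 1 - b
        # A frozenset collapses (h1, h2) and (h2, h1) into one pair.
        pairs.add(frozenset((tuple(h1), tuple(h2))))
    return list(pairs)


def min_haplotypes(genotypes):
    """Exhaustively find a smallest haplotype set explaining all genotypes.

    Running time is exponential in the number of heterozygous sites,
    so this is only usable on toy inputs.
    """
    options = [explaining_pairs(g) for g in genotypes]
    best = None
    for choice in product(*options):
        used = set().union(*choice)
        if best is None or len(used) < len(best):
            best = used
    return best


if __name__ == "__main__":
    sample = [(2, 2), (0, 0), (1, 1)]
    print(sorted(min_haplotypes(sample)))
```

In this sample the genotype `(2, 2)` can be explained by either `{(0, 0), (1, 1)}` or `{(0, 1), (1, 0)}`. Pure parsimony prefers the first, because those two haplotypes are already needed for the other genotypes.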


Integer Program, Problem Instance, Linear Program Relaxation, Unique Haplotype, Fractional Solution
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




  1. Clark, G.: Inference of haplotypes from PCR-amplified samples of diploid populations. Molecular Biology and Evolution 7(2), 111–112 (1990)
  2. Gusfield, D.: Haplotype inference by pure parsimony. In: Baeza-Yates, R., Chávez, E., Crochemore, M. (eds.) CPM 2003. LNCS, vol. 2676, pp. 144–155. Springer, Heidelberg (2003)
  3. Gusfield, D.: Personal communication (June 2004)
  4. Halldórsson, B.V., Bafna, V., Edwards, N., Lippert, R., Yooseph, S., Istrail, S.: A survey of computational methods for determining haplotypes. In: Istrail, S., Waterman, M.S., Clark, A. (eds.) DIMACS/RECOMB Satellite Workshop 2002. LNCS (LNBI), vol. 2983, pp. 26–47. Springer, Heidelberg (2004)
  5. Hudson, R.R.: Generating samples under a Wright-Fisher neutral model of genetic variation. Bioinformatics 18(2), 337–338 (2002)
  6. Lancia, G., Pinotti, C.M., Rizzi, R.: Haplotyping populations: Complexity and approximations. Technical report DIT-02-0080, University of Trento (October 2002)
  7. Lancia, G., Pinotti, C.M., Rizzi, R.: Haplotyping populations by pure parsimony: Complexity, exact, and approximation algorithms. INFORMS Journal of Computing (2004) (to appear)
  8. Niu, T., Qin, Z.S., Xu, X., Liu, J.S.: Bayesian haplotype inference for multiple linked single-nucleotide polymorphisms. American Journal of Human Genetics 70(1), 157–159 (2002)
  9. Stephens, M., Smith, N.J., Donnelly, P.: A new statistical method for haplotype reconstruction from population data. American Journal of Human Genetics 68(4), 978–979 (2001)
  10. Wang, L., Xu, Y.: Haplotype inference by maximum parsimony. Bioinformatics 19(14), 1773–1780 (2003)

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Daniel G. Brown (1)
  • Ian M. Harrower (1)
  1. School of Computer Science, University of Waterloo, Waterloo, Canada
