Methods for Unobservable Phase

  • Andrea S. Foulkes
Part of the Use R book series (USE R)


One of the primary analytic challenges in population-based genetic investigations of unrelated individuals is the unobservable nature of allelic phase. We introduced this concept briefly in Section 2.3.2, and here we elaborate on the statistical challenges and analytic techniques for characterizing haplotype associations in the context of unknown phase. Recall that haplotypic phase refers to the specific alignment of alleles on a single homologous chromosome and is generally not observable in population-based investigations of unrelated individuals. Since the SNPs under study are often markers for the true disease-causing variant, haplotypes may capture more variability in the disease trait than genotype alone. Several statistical approaches to inferring haplotypic phase have been proposed. The goals of these methods are generally twofold. First, interest lies in estimating population-level haplotype frequencies; that is, the prevalence of specific haplotypes in the general population. Second, investigators are interested in making inference about the association between haplotypes and a trait. This chapter addresses both aims and is divided into two sections. The first (Section 5.1) focuses on methods for estimation of haplotype frequencies that do not involve knowledge about a trait or disease phenotype. The second (Section 5.2) focuses on methods that involve both estimation of haplotype frequencies and testing for association between these haplotypes and a measured trait.
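To make the phase ambiguity concrete, the following minimal sketch lists every haplotype pair consistent with an unphased multilocus genotype. It is an illustration only, not code from the chapter: it assumes biallelic SNPs coded as the number of copies of one allele (0, 1, or 2), and the function name `compatible_haplotype_pairs` is hypothetical.

```python
from itertools import product


def compatible_haplotype_pairs(genotype):
    """List the unordered haplotype pairs consistent with an unphased genotype.

    Assumption for this sketch: `genotype` is a tuple with one entry per
    biallelic SNP, giving the count (0, 1, or 2) of the allele coded as 1.
    Each haplotype is returned as a tuple of 0/1 alleles.
    """
    per_site = []
    for count in genotype:
        if count == 0:
            per_site.append([(0, 0)])          # homozygous for allele 0
        elif count == 2:
            per_site.append([(1, 1)])          # homozygous for allele 1
        elif count == 1:
            per_site.append([(0, 1), (1, 0)])  # heterozygous: phase unknown
        else:
            raise ValueError("genotype entries must be 0, 1, or 2")

    pairs = set()
    for choice in product(*per_site):
        hap1 = tuple(a for a, _ in choice)
        hap2 = tuple(b for _, b in choice)
        # Sort within the pair so (h1, h2) and (h2, h1) count once.
        pairs.add(tuple(sorted((hap1, hap2))))
    return sorted(pairs)


# An individual heterozygous at both of two SNPs has two possible phasings.
print(compatible_haplotype_pairs((1, 1)))
# An individual heterozygous at only one SNP is unambiguous.
print(compatible_haplotype_pairs((1, 0)))
```

An individual who is heterozygous at two or more sites is the ambiguous case: with k heterozygous sites there are 2^(k-1) compatible haplotype pairs, which is why phase must be inferred rather than read off the genotype.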

In the first section, two methods are described: (1) an expectation-maximization (EM) algorithm and (2) a Bayesian haplotype reconstruction approach. Both approaches draw solely on genotype information to arrive at haplotype estimates and do not incorporate knowledge about the trait under investigation. Further details on these approaches can be found in Excoffier and Slatkin (1995), Hawley and Kidd (1995), Long et al. (1995), Stephens and Donnelly (2000), Stephens et al. (2001) and Stephens and Donnelly (2003). Both methods can be used to infer individual-level haplotypes and in turn make inference about haplotype-trait associations; however, this must proceed with careful consideration of the additional variability introduced by the uncertainty in the estimated haplotypes. This is described in Section 5.2.
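The EM idea can be sketched in a few lines. The example below is a simplified illustration under stated assumptions, not the chapter's implementation: it assumes biallelic SNPs coded as allele counts (0, 1, or 2), Hardy-Weinberg equilibrium (so the probability of a haplotype pair is the product of the two haplotype frequencies, doubled when the haplotypes differ), a small number of SNPs so that all haplotypes can be enumerated, and a uniform starting value. The names `haplotype_pairs` and `em_haplotype_frequencies` are hypothetical.

```python
from itertools import product


def haplotype_pairs(genotype):
    """Unordered haplotype pairs consistent with an unphased genotype (0/1/2 codes)."""
    options = {0: [(0, 0)], 1: [(0, 1), (1, 0)], 2: [(1, 1)]}
    pairs = set()
    for choice in product(*(options[g] for g in genotype)):
        hap1 = tuple(a for a, _ in choice)
        hap2 = tuple(b for _, b in choice)
        pairs.add(tuple(sorted((hap1, hap2))))
    return sorted(pairs)


def em_haplotype_frequencies(genotypes, max_iter=100, tol=1e-10):
    """Estimate haplotype frequencies from unphased genotypes by EM.

    Sketch assumptions: Hardy-Weinberg equilibrium, biallelic SNPs,
    uniform initial frequencies over all 2**n_snps haplotypes.
    """
    n_snps = len(genotypes[0])
    haplotypes = list(product((0, 1), repeat=n_snps))
    freq = {h: 1.0 / len(haplotypes) for h in haplotypes}
    pair_sets = [haplotype_pairs(g) for g in genotypes]

    for _ in range(max_iter):
        # E-step: expected haplotype counts given the current frequencies.
        counts = {h: 0.0 for h in haplotypes}
        for pairs in pair_sets:
            weights = [
                freq[h1] * freq[h2] * (1 if h1 == h2 else 2)
                for h1, h2 in pairs
            ]
            total = sum(weights)
            for (h1, h2), w in zip(pairs, weights):
                posterior = w / total  # P(pair | genotype, current freq)
                counts[h1] += posterior
                counts[h2] += posterior

        # M-step: each individual contributes two haplotypes.
        new_freq = {h: c / (2 * len(genotypes)) for h, c in counts.items()}
        change = max(abs(new_freq[h] - freq[h]) for h in haplotypes)
        freq = new_freq
        if change < tol:
            break
    return freq


# Toy data: two unambiguous individuals and one double heterozygote.
toy_genotypes = [(0, 0), (2, 2), (1, 1)]
estimates = em_haplotype_frequencies(toy_genotypes)
for hap, f in sorted(estimates.items()):
    print(hap, round(f, 4))
```

In the toy data the two unambiguous individuals carry only the haplotypes (0, 0) and (1, 1), so the algorithm assigns the double heterozygote almost entirely to that pair. The posterior pair probabilities computed in the E-step are the quantities that reflect the uncertainty in individual-level haplotype assignments.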


Keywords: Haplotype Frequency; Trait Association; Haplotype Pair; Haplotype Estimation; Complete Data Likelihood


Copyright information

© Springer-Verlag New York 2009

Authors and Affiliations

  1. University of Massachusetts, School of Public Health & Health Sciences, Amherst, USA
