An Accurate Method for Inferring Relatedness in Large Datasets of Unphased Genotypes via an Embedded Likelihood-Ratio Test

Rodriguez, Jesse M.; Batzoglou, Serafim; Bercovici, Sivan

doi:10.1007/978-3-642-37195-0_18

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 7821)

Included in the following conference series:

Annual International Conference on Research in Computational Molecular Biology

Abstract

Studies that map disease genes rely on accurate annotations that indicate whether individuals in the studied cohorts are related to each other. For example, in genome-wide association studies, the cohort members are assumed to be unrelated to one another. Investigators can correct for individuals in a cohort with previously unknown shared familial descent by detecting genomic segments that are shared between them, which are considered to be identical by descent (IBD). Alternatively, elevated frequencies of IBD segments near a particular locus among affected individuals can indicate a disease-associated gene. As genotyping studies grow to use increasingly large sample sizes and meta-analyses begin to include many data sets, accurate and efficient detection of hidden relatedness becomes a challenge. To enable disease-mapping studies of increasingly large cohorts, a fast and accurate method to detect IBD segments is required.

We present PARENTE, a novel method for detecting related pairs of individuals and shared haplotypic segments within these pairs. PARENTE is a computationally efficient method based on an embedded likelihood-ratio test. As demonstrated by the results of our simulations, our method exhibits better accuracy than the current state of the art, and can be used for the analysis of large genotyped cohorts. PARENTE's accuracy advantage becomes even more pronounced in more challenging scenarios, such as detecting shorter IBD segments or when an extremely low false-positive rate is required. PARENTE is publicly and freely available at http://parente.stanford.edu/.
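
The abstract does not describe how the embedded likelihood-ratio test is built, so the following is only a generic illustration of a likelihood-ratio score for IBD sharing on unphased genotypes, not PARENTE's actual model. It assumes biallelic SNPs in Hardy-Weinberg equilibrium, independence between SNPs (no linkage disequilibrium), and a uniform genotyping-error mixture. All function names and the `error_rate` parameter are hypothetical.

```python
import math
from itertools import product


def allele_prob(allele, p):
    """Probability of allele 1 (frequency p) or allele 0 (frequency 1 - p)."""
    return p if allele == 1 else 1.0 - p


def ibd0_joint_prob(g1, g2, p):
    """P(g1, g2) when the pair shares no haplotype: two independent HWE draws."""
    q = 1.0 - p
    hwe = (q * q, 2.0 * p * q, p * p)  # P(genotype = 0, 1, 2 copies of allele 1)
    return hwe[g1] * hwe[g2]


def ibd1_joint_prob(g1, g2, p):
    """P(g1, g2) when the pair shares exactly one haplotype at this SNP."""
    total = 0.0
    # Enumerate the shared allele and each individual's non-shared allele.
    for shared, own1, own2 in product((0, 1), repeat=3):
        if shared + own1 == g1 and shared + own2 == g2:
            total += (allele_prob(shared, p)
                      * allele_prob(own1, p)
                      * allele_prob(own2, p))
    return total


def window_llr(genos1, genos2, freqs, error_rate=0.01):
    """Sum of per-SNP log-likelihood ratios, IBD1 versus IBD0, over a window.

    Assumption: SNPs are treated as independent. The uniform error mixture
    (error_rate spread over the 9 genotype pairs) keeps both likelihoods
    strictly positive, so opposite homozygotes do not give log(0).
    """
    noise = error_rate / 9.0
    score = 0.0
    for g1, g2, p in zip(genos1, genos2, freqs):
        ibd1 = (1.0 - error_rate) * ibd1_joint_prob(g1, g2, p) + noise
        ibd0 = (1.0 - error_rate) * ibd0_joint_prob(g1, g2, p) + noise
        score += math.log(ibd1 / ibd0)
    return score
```

Under these assumptions, a window in which both individuals carry the same rare alleles scores above zero, while opposite homozygotes (genotypes 0 and 2) pull the score below zero. A pair would be called IBD in a window when the score exceeds a threshold chosen for the desired false-positive rate.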

Author information

Authors and Affiliations

Department of Computer Science, Stanford University, USA
Jesse M. Rodriguez, Serafim Batzoglou & Sivan Bercovici
Biomedical Informatics Program, Stanford University, USA
Jesse M. Rodriguez

Editor information

Editors and Affiliations

School of Mathematics, Peking University, Beijing, P.R. China
Minghua Deng
Bioinformatics Division, TNLIST/Department of Automation, Tsinghua University, 100084, Beijing, P.R. China
Rui Jiang
Molecular and Computational Biology Program, University of Southern California, Los Angeles, California, USA
Fengzhu Sun
Department of Automation, Tsinghua University, P.O. Box, 100084, Beijing, China
Xuegong Zhang

Cite this paper

Rodriguez, J.M., Batzoglou, S., Bercovici, S. (2013). An Accurate Method for Inferring Relatedness in Large Datasets of Unphased Genotypes via an Embedded Likelihood-Ratio Test. In: Deng, M., Jiang, R., Sun, F., Zhang, X. (eds) Research in Computational Molecular Biology. RECOMB 2013. Lecture Notes in Computer Science, vol 7821. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37195-0_18

DOI: https://doi.org/10.1007/978-3-642-37195-0_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37194-3
Online ISBN: 978-3-642-37195-0
eBook Packages: Computer Science, Computer Science (R0)