Invited Keynote Talk: Set-Level Analyses for Genome-Wide Association Data

Nicolae, Dan L.; De la Cruz, Omar; Wen, William; Ke, Baoguan; Song, Minsun

doi:10.1007/978-3-540-79450-9_1

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 4983)

Included in the following conference series:

International Symposium on Bioinformatics Research and Applications

Abstract

High-throughput genotyping platforms allow the investigation of hundreds of thousands of markers at a time, and this has led to a growing number of genome-wide association studies in which the entire human genome is mined for genes involved in the etiology of complex traits. This approach to the discovery of genetic risk factors has yielded promising results, but most of the analyses have focused on single-marker tests. In general, a method of analysis that treats the markers as if they were biologically unrelated discards the information contained in the structure of the genome.

In this paper, we propose a method for incorporating structural genomic information by grouping the markers into relevant units and assigning a measure of significance to these predefined sets of markers. The sets can be genes, conserved regions, or groups of genes such as pathways. Using the proposed methods and algorithms, evidence for association between a particular functional unit and disease status can be obtained not just from the presence of a strong signal from a SNP within it, but also from the combination of several simultaneous weaker signals that are uncorrelated. Note that the method combines evidence for association from both the genotyped and the untyped markers. The untyped markers are tested using haplotype predictors for their alleles, with the prediction training done in reference databases such as HapMap.
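
To make the idea of a set-level measure of significance concrete, here is a minimal sketch in Python. It is not the authors' method, which the abstract does not specify; it assumes a generic alternative in which per-SNP allelic chi-square statistics are summed over a predefined set and the set's p-value is obtained by permuting case/control labels. The function names (allelic_chi2, set_statistic, set_p_value) and the input layout are hypothetical, chosen only for illustration.

```python
import random


def allelic_chi2(genotypes, status):
    """1-df chi-square statistic comparing allele counts in cases vs. controls.

    genotypes: minor-allele counts (0, 1 or 2), one per individual.
    status:    0 for control, 1 for case, in the same order.
    """
    case_alt = sum(g for g, s in zip(genotypes, status) if s == 1)
    ctrl_alt = sum(g for g, s in zip(genotypes, status) if s == 0)
    n_case = 2 * sum(status)                    # alleles carried by cases
    n_ctrl = 2 * (len(status) - sum(status))    # alleles carried by controls
    total = n_case + n_ctrl
    table = [[case_alt, n_case - case_alt], [ctrl_alt, n_ctrl - ctrl_alt]]
    rows = [n_case, n_ctrl]
    cols = [case_alt + ctrl_alt, total - case_alt - ctrl_alt]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / total
            if expected > 0:                    # skip empty margins
                chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2


def set_statistic(snp_matrix, status):
    """Assumed set-level statistic: sum of per-SNP statistics in the set."""
    return sum(allelic_chi2(snp, status) for snp in snp_matrix)


def set_p_value(snp_matrix, status, n_perm=1000, seed=0):
    """Permutation p-value for the whole set.

    Shuffling the labels keeps the correlation between SNPs intact,
    so no independence assumption is needed at this step.
    """
    rng = random.Random(seed)
    observed = set_statistic(snp_matrix, status)
    shuffled = list(status)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if set_statistic(snp_matrix, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)            # add-one correction


if __name__ == "__main__":
    status = [1] * 10 + [0] * 10
    gene = [[2] * 10 + [0] * 10, [1] * 20]      # one associated SNP, one uninformative SNP
    print(set_p_value(gene, status, n_perm=200))
```

Each row of snp_matrix holds one SNP's genotypes across individuals, so a gene or pathway is simply the list of rows for the SNPs assigned to it. Untyped markers could in principle enter the same way once their alleles have been predicted, but that step is not sketched here.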

There are several advantages to this approach. First, the power to detect genes associated with disease increases because moderately strong signals within a gene are combined to obtain a much stronger signal for the gene as a functional unit. Second, the results are easily combined across platforms that use different sets of SNPs. Lastly, the results are easy to interpret since they refer to functional regions, and they also provide targets for biological validation.
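
The power argument can be illustrated with a small worked example. The sketch below uses Fisher's combination method, which is a standard technique and not necessarily what the paper uses; it is valid only when the combined p-values are independent, matching the "uncorrelated signals" case described in the abstract. The function name fisher_combined_p is hypothetical.

```python
import math


def fisher_combined_p(p_values):
    """Combine k independent p-values with Fisher's method.

    Under the null, X = -2 * sum(ln p_i) follows a chi-square
    distribution with 2k degrees of freedom. For even degrees of
    freedom the upper tail has the closed form
    exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!.
    """
    k = len(p_values)
    half = -sum(math.log(p) for p in p_values)   # this is X / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


if __name__ == "__main__":
    # Three moderate, independent signals inside one gene.
    single = [0.01, 0.01, 0.01]
    print(fisher_combined_p(single))             # roughly 1.1e-4
    print(min(single) * len(single))             # Bonferroni on the best SNP: 0.03
```

Three independent markers with p = 0.01 each give a combined p-value of about 1.1e-4, whereas the best single marker after Bonferroni correction over the three tests gives 0.03. Correlated markers would need a different null distribution, for example one obtained by permutation.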

Author information

Authors and Affiliations

Departments of Medicine and Statistics, The University of Chicago
Dan L. Nicolae
Department of Statistics, The University of Chicago, 5734 S. University Ave., Chicago, IL 60637
Omar De la Cruz, William Wen, Baoguan Ke & Minsun Song

Editor information

Ion Măndoiu, Raj Sunderraman, Alexander Zelikovsky

Cite this paper

Nicolae, D.L., De la Cruz, O., Wen, W., Ke, B., Song, M. (2008). Invited Keynote Talk: Set-Level Analyses for Genome-Wide Association Data. In: Măndoiu, I., Sunderraman, R., Zelikovsky, A. (eds) Bioinformatics Research and Applications. ISBRA 2008. Lecture Notes in Computer Science, vol 4983. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-79450-9_1

DOI: https://doi.org/10.1007/978-3-540-79450-9_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-79449-3
Online ISBN: 978-3-540-79450-9
eBook Packages: Computer Science, Computer Science (R0)
