Abstract
Genetic differences between individuals that are associated with quantitative phenotypic traits, including disease states, are usually found in noncoding genomic regions. These genetic variants are often also associated with differences in the expression levels of nearby genes (that is, they are “expression quantitative trait loci”, or eQTLs for short) and presumably play a gene regulatory role, affecting the status of molecular networks of interacting genes, proteins, and metabolites. Computational systems biology approaches that reconstruct causal gene networks from large-scale omics data have therefore become essential for understanding the structure of the networks controlled by eQTLs together with other regulatory genes, and for generating detailed hypotheses about the molecular mechanisms that lead from genotype to phenotype. Here we review the main analytical methods and software to identify eQTLs and their associated genes, to reconstruct coexpression networks and modules, to reconstruct causal Bayesian gene and module networks, and to validate predicted networks in silico.
Notes
- 1.
There are only matrix multiplications, because the data standardization implies that .
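The expression that the note refers to is not shown here. As an illustration of the general idea only, the sketch below assumes that "standardization" means centering each row to zero mean and scaling it to unit Euclidean norm; under that assumption, the Pearson correlation between every marker and every gene is a single matrix product. The function names, array shapes, and simulated data are hypothetical and are not taken from the chapter.

```python
import numpy as np

def standardize_rows(a):
    """Center each row to mean 0 and scale it to unit Euclidean norm.

    Assumption: rows are variables (markers or genes), columns are samples.
    Constant rows have zero norm and should be filtered out beforehand.
    """
    a = np.asarray(a, dtype=float)
    centered = a - a.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(centered, axis=1, keepdims=True)
    return centered / norms

def correlation_by_matmul(genotypes, expression):
    """Pearson correlation of every marker row with every gene row."""
    g = standardize_rows(genotypes)
    x = standardize_rows(expression)
    # After standardization, a dot product of two rows is their correlation,
    # so the full marker-by-gene table is one matrix multiplication.
    return g @ x.T

# Simulated data for illustration only.
rng = np.random.default_rng(0)
genotypes = rng.integers(0, 3, size=(5, 40))   # 5 hypothetical markers, 40 samples
expression = rng.normal(size=(7, 40))          # 7 hypothetical genes, 40 samples

corr = correlation_by_matmul(genotypes, expression)  # shape (5, 7)
```

Computing all marker-gene statistics in one product, rather than looping over pairs, is what makes this formulation attractive for large data sets; the result can be checked against `np.corrcoef` on small inputs.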
Acknowledgments
The authors’ work is supported by the BBSRC (BB/M020053/1) and Roslin Institute Strategic Grant funding from the BBSRC (BB/J004235/1).
Copyright information
© 2016 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Wang, L., Michoel, T. (2016). Detection of Regulator Genes and eQTLs in Gene Networks. In: Kadarmideen, H. (eds) Systems Biology in Animal Production and Health, Vol. 1. Springer, Cham. https://doi.org/10.1007/978-3-319-43335-6_1
DOI: https://doi.org/10.1007/978-3-319-43335-6_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-43333-2
Online ISBN: 978-3-319-43335-6
eBook Packages: Biomedical and Life Sciences (R0)