Detection of Regulator Genes and eQTLs in Gene Networks

Systems Biology in Animal Production and Health, Vol. 1

Abstract

Genetic differences between individuals that are associated with quantitative phenotypic traits, including disease states, are usually found in noncoding genomic regions. These genetic variants are often also associated with differences in the expression levels of nearby genes (they are “expression quantitative trait loci”, or eQTLs for short) and presumably play a gene regulatory role, affecting the status of molecular networks of interacting genes, proteins, and metabolites. Computational systems biology approaches that reconstruct causal gene networks from large-scale omics data have therefore become essential for understanding the structure of networks controlled by eQTLs together with other regulatory genes, and for generating detailed hypotheses about the molecular mechanisms that lead from genotype to phenotype. Here we review the main analytical methods and software to identify eQTLs and their associated genes, to reconstruct coexpression networks and modules, to reconstruct causal Bayesian gene and module networks, and to validate predicted networks in silico.

Notes

  1. There are only $l - 1$ matrix multiplications, because the data standardization implies that $X I^{(0)} = 1 - \sum_{m=1}^{l-1} X I^{(m)}$.

Acknowledgments

The authors’ work is supported by the BBSRC (BB/M020053/1) and Roslin Institute Strategic Grant funding from the BBSRC (BB/J004235/1).

Corresponding author

Correspondence to Tom Michoel.

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Wang, L., Michoel, T. (2016). Detection of Regulator Genes and eQTLs in Gene Networks. In: Kadarmideen, H. (eds) Systems Biology in Animal Production and Health, Vol. 1. Springer, Cham. https://doi.org/10.1007/978-3-319-43335-6_1
