A Bioinformatic Platform for a Bayesian, Multiphased, Multilevel Analysis in Immunogenomics

Bioinformatics for Immunomics

Part of the book series: Immunomics Reviews (IMMUN, volume 3)

Abstract

The accumulation of electronically accessible data and knowledge poses theoretical and practical challenges for study design and statistical data analysis. Meeting these challenges calls for the fusion of data and knowledge: the use of the results of earlier high-throughput measurements of genetic variations, microRNA, and gene expression levels, together with the biological knowledge bases. We investigate this fusion in the phases of study design, data analysis, and interpretation; specifically, we present methodologies and bioinformatic tools in the Bayesian framework to deepen, lengthen, and broaden it. First, we overview Bayesian decision support for the design of partial genetic association studies (GASs), incorporating domain literature, knowledge bases, and the results of analyses of earlier studies. Second, we present a Bayesian multilevel analysis (BMLA) for GASs, which performs an integrated analysis at the univariate and multivariate levels and at the level of interactions. Third, we present a Bayesian logic to support interpretation, which integrates the results of data analysis with factual domain knowledge. Finally, we discuss the advantages of the Bayesian framework in coping with small sample size, fusion of data and knowledge, challenges of multiple testing, meta-analysis, and positive results bias (i.e., the communication of scientific uncertainty). The genomics of asthma serves as the application domain.

Acknowledgments

We thank Yves Moreau for his insightful suggestion to apply the SNP study design system for prior generation in our Bayesian data analysis. This work was supported by grants from the OTKA National Scientific Research Fund (PD-76348), NKTH TECH_08-A1/2-2008-0120 (Genagrid), and the János Bolyai Research Scholarship of the Hungarian Academy of Sciences (P. Antal).

Copyright information

© 2009 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Antal, P., Millinghoffer, A., Hullám, G., Hajós, G., Szalai, C., Falus, A. (2009). A Bioinformatic Platform for a Bayesian, Multiphased, Multilevel Analysis in Immunogenomics. In: Flower, D., Davies, M., Ranganathan, S. (eds) Bioinformatics for Immunomics. Immunomics Reviews, vol 3. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-0540-6_11
