Data Fusion Approach for Learning Transcriptional Bayesian Networks

  • Elisabetta Sauta
  • Andrea Demartini
  • Francesca Vitali
  • Alberto Riva
  • Riccardo Bellazzi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10259)


The complexity of gene expression regulation stems from the synergistic molecular interplay among its principal actors, the transcription factors (TFs). By exerting spatiotemporal control over their target genes, TFs define transcriptional programs across the genome, and these programs are strongly perturbed in disease. To gain a more comprehensive picture of these complex dynamics, a data fusion approach that integrates heterogeneous -omics data is fundamental.

Bayesian networks provide a natural framework for integrating different sources of data and knowledge through the use of priors. In this work, we developed a hybrid structure-learning algorithm that exploits TF ChIP-seq and gene expression (GE) data to investigate disease-specific transcriptional regulation from a genome-wide perspective. TF ChIP-seq profiles were first used for structure learning and then integrated into the model as a prior probability. GE panels were employed to learn the model parameters, in a heuristic search for the best transcriptional network. We applied our approach to a specific pathological case, chronic myeloid leukemia (CML), a myeloproliferative disorder whose transcriptional mechanisms have not yet been fully elucidated.
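
As an illustration of the general idea only, the following minimal Python sketch shows one way a binding-derived edge prior and expression data could be combined in a score-based search. It is not the authors' algorithm: the abstract does not specify the scoring function, the form of the prior, or the search strategy. The sketch assumes a linear-Gaussian BIC score for the expression data, an independent Bernoulli prior per edge (standing in for ChIP-seq evidence), a list of candidate TF-to-target edges that restricts the search, and greedy hill climbing. All function names (`gaussian_bic`, `log_edge_prior`, `creates_cycle`, `hill_climb`) are hypothetical.

```python
import numpy as np


def gaussian_bic(data, child, parents):
    """BIC of a linear-Gaussian node: child regressed on its parents (assumed score)."""
    n = data.shape[0]
    y = data[:, child]
    X = np.column_stack([np.ones(n)] + [data[:, p] for p in parents])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = max(float(resid @ resid) / n, 1e-12)  # guard against log(0)
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    return loglik - 0.5 * X.shape[1] * np.log(n)


def log_edge_prior(parents, child, prior):
    """Log prior of a parent set, assuming independent Bernoulli edges.

    prior[u, v] is the assumed probability of edge u -> v (e.g. from binding evidence).
    """
    q = np.clip(prior[:, child], 1e-6, 1 - 1e-6)
    lp = 0.0
    for u in range(prior.shape[0]):
        if u == child:
            continue
        lp += np.log(q[u]) if u in parents else np.log(1 - q[u])
    return lp


def creates_cycle(parents, u, v):
    """True if adding u -> v would close a directed cycle (v is already an ancestor of u)."""
    stack, seen = [u], set()
    while stack:
        node = stack.pop()
        if node == v:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(parents[node])
    return False


def local_score(data, child, parents, prior):
    """Log-posterior contribution of one node: data score plus structure prior."""
    return gaussian_bic(data, child, sorted(parents)) + log_edge_prior(parents, child, prior)


def hill_climb(data, prior, candidates, max_iter=100):
    """Greedy add/remove search restricted to candidate (TF, target) edges."""
    d = data.shape[1]
    parents = {v: set() for v in range(d)}
    scores = {v: local_score(data, v, parents[v], prior) for v in range(d)}
    for _ in range(max_iter):
        best_gain, best_move = 1e-9, None
        for u, v in candidates:
            if u in parents[v]:
                new = parents[v] - {u}          # try removing an existing edge
            elif not creates_cycle(parents, u, v):
                new = parents[v] | {u}          # try adding an acyclic edge
            else:
                continue
            gain = local_score(data, v, new, prior) - scores[v]
            if gain > best_gain:
                best_gain, best_move = gain, (v, new)
        if best_move is None:                   # no single-edge change improves the score
            break
        v, new = best_move
        parents[v] = new
        scores[v] += best_gain
    return parents
```

Calling `hill_climb(data, prior, candidates)` with a samples-by-genes expression matrix, a genes-by-genes prior matrix, and a list of `(tf, target)` index pairs returns a dictionary mapping each gene to its selected parent set. Restricting the moves to candidate edges keeps the search tractable at genome scale, while the prior term lets strong binding evidence tip the balance when the expression data alone are ambiguous.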

The proposed data-driven method makes it possible to investigate transcriptional signatures. The resulting probabilistic network highlights a three-layered hierarchy, reflecting the different influence that TFs exert on cellular gene expression programs.


Keywords: Bayesian networks · Transcriptional regulations · -omics data integration



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Elisabetta Sauta (1), corresponding author
  • Andrea Demartini (1)
  • Francesca Vitali (2)
  • Alberto Riva (3)
  • Riccardo Bellazzi (1)

  1. Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
  2. Center of Biomedical Informatics and Biostatistics, University of Arizona, Tucson, USA
  3. Interdisciplinary Center for Biotechnology Research, University of Florida, Gainesville, USA
