Computational Methods for Integration of Biological Data

Personalized Medicine

Part of the book series: Europeanization and Globalization (EAG, volume 2)

Abstract

As large sets of diverse biological data continue to accumulate, there is a growing need to extract biological knowledge from them efficiently. Developing computational tools that integrate biological data obtained from diverse experiments has recently gained attention. Existing computational tools are mainly designed to analyze only one particular type of biological data. Data of a single type provide an incomplete and often obscure picture of cellular functioning, and analyzing each data type separately can only partially address important biological questions, such as the emergence of diseases and the development of novel diagnostic and therapeutic approaches. Data integration is therefore a key strategy for gaining a deeper understanding of how a cell functions and of the molecular bases of human diseases. Here, we classify current integrative approaches and review their applications in addressing fundamental biological questions that can increase our understanding of a biological system.

Vladimir Gligorijević, Ph.D. Department of Computing, Imperial College London, UK.

Professor Nataša Pržulj, Ph.D., Department of Computing, University College London, UK.

Notes

  1. Ito et al. (2000), Uetz et al. (2000), Giot et al. (2003), Li et al. (2004), Stelzl et al. (2005), Simonis et al. (2009), and Consortium AIM (2011).
  2. Gavin et al. (2006) and Krogan et al. (2006).
  3. Dahlquist et al. (2002) and Quackenbush (2001).
  4. Marioni et al. (2008), Mortazavi et al. (2008), and Wang et al. (2009).
  5. Wang et al. (2003) and Smith et al. (2006).
  6. Vidal et al. (2011).
  7. Pržulj (2011), Sarajlić and Pržulj (2014), and Aittokallio and Schwikowski (2006).
  8. McCarthy et al. (2008), Sladek et al. (2007), and The Wellcome Trust Case Control Consortium (2007).
  9. Ashburner et al. (2000).
  10. Schriml et al. (2012).
  11. Lu et al. (2005).
  12. Wang et al. (2013).
  13. Ma et al. (2013).
  14. Zhang et al. (2011).
  15. Hamburg and Collins (2010).
  16. Dudley and Karczewski (2013).
  17. Evers et al. (2012).
  18. Gomez-Cabrero et al. (2014).
  19. Ashburner et al. (2000).
  20. Whisstock and Lesk (2003).
  21. Ashburn and Thor (2004).
  22. Hurle et al. (2013).
  23. Wu et al. (2013).
  24. Quinn et al. (2013).
  25. Trenkwalder et al. (2004).
  26. Hurle et al. (2013).
  27. Ding et al. (2013) and Wang et al. (2013).
  28. Wang et al. (2013), Yamanishi et al. (2008), Napolitano et al. (2013), and Huang et al. (2013).
  29. Schriml et al. (2012) and Osborne et al. (2009).
  30. Gatza et al. (2010).
  31. Lee et al. (2008).
  32. Žitnik et al. (2013).
  33. Bromberg (2013).
  34. Goh et al. (2007).
  35. Köhler et al. (2008).
  36. Vanunu et al. (2010).
  37. Bebek et al. (2012).
  38. Chen et al. (2013).
  39. Linghu et al. (2009).
  40. West (2000).
  41. Albert (2005).
  42. Barabasi and Oltvai (2004), Higham et al. (2008), and Pržulj et al. (2004).
  43. West (2000).
  44. Newman (2010) and Pržulj et al. (2004).
  45. Barabasi and Oltvai (2004), Pržulj (2011), and Stelzl et al. (2005).
  46. Consortium AIM (2011), Dreze et al. (2010), Giot et al. (2003), Ito et al. (2000), Li et al. (2004), Stelzl et al. (2005), and Uetz et al. (2000).
  47. Gavin et al. (2006) and Krogan et al. (2006).
  48. Chatr-Aryamontri et al. (2013).
  49. Franceschini et al. (2013).
  50. Keshava Prasad et al. (2009).
  51. Schuster et al. (2000) and Vidal et al. (2011).
  52. Kanehisa et al. (2012) and Zhou (2013).
  53. Luo et al. (2007).
  54. Prieto et al. (2008).
  55. Ziv et al. (2003).
  56. De Smet et al. (2002).
  57. Luo et al. (2007).
  58. Barrett et al. (2007).
  59. Parkinson et al. (2005).
  60. Hubble et al. (2009).
  61. Costanzo et al. (2010), Mani et al. (2008), and Vidal et al. (2011).
  62. Mani et al. (2008).
  63. Vidal et al. (2011).
  64. Chatr-Aryamontri et al. (2013).
  65. Eungdamrong and Iyengar (2004).
  66. Eungdamrong and Iyengar (2004) and Soyer et al. (2006).
  67. Žitnik et al. (2013).
  68. Schacherer et al. (2001).
  69. Hamosh et al. (2005).
  70. Davis et al. (2013).
  71. Osborne et al. (2009).
  72. Yildirim et al. (2007).
  73. Daminelli et al. (2012), Wu et al. (2013), and Yamanishi et al. (2008).
  74. Wishart et al. (2008).
  75. Kanehisa et al. (2006).
  76. Gunther et al. (2008).
  77. Schomburg et al. (2013).
  78. Yildirim et al. (2007).
  79. Ding et al. (2013).
  80. Wishart et al. (2008).
  81. Kanehisa et al. (2006).
  82. Kuhn et al. (2010).
  83. Zhang et al. (2013).
  84. Lamb et al. (2006).
  85. Ashburner et al. (2000).
  86. Schriml et al. (2012).
  87. Ashburner et al. (2000).
  88. Nelson et al. (2004).
  89. Ayme et al. (2010).
  90. Sioutos et al. (2007).
  91. Cornet and de Keizer (2008).
  92. Hamosh et al. (2005).
  93. |V| stands for the number of elements in set V.
  94. West (2000).
  95. Semi-supervised learning is a class of machine learning techniques that uses a small amount of labeled data (prior information about the data), together with unlabeled data, to train an algorithm to perform a specific task. For example, traditional clustering algorithms use only unlabeled data to generate clusters, whereas semi-supervised clustering algorithms use prior information about the data to improve the clustering results. When clustering genes, the supervision is generally given as pairwise constraints that guide the clustering process; such constraints arise naturally from molecular networks, that is, two genes connected in a molecular network are considered to belong to the same cluster.
  96. Zhu et al. (2005).
  97. Gligorijević et al. (2014).
  98. Milenković and Pržulj (2008).
  99. Ge et al. (2003).
  100. Mani et al. (2008).
  101. Žitnik et al. (2013).
  102. Gligorijević et al. (2014).
  103. de Silva et al. (2006) and Wodak et al. (2009).
  104. Yu et al. (2011).
  105. Ben-Gal (2007).
  106. Yu et al. (2011).
  107. Cooper and Herskovits (1992).
  108. Dempster et al. (1977).
  109. Ben-Gal (2007).
  110. Fawcett (2006).
  111. Nariai et al. (2007).
  112. Linghu et al. (2009).
  113. Schadt et al. (2012).
  114. A cis eQTL is an eQTL that is located near the expressed gene.
  115. Yu et al. (2011).
  116. Boyd and Vandenberghe (2004).
  117. Scholkopf and Smola (2001).
  118. Lanckriet et al. (2004a, b).
  119. Žitnik et al. (2013).
  120. Wang et al. (2011).
  121. Lee and Seung (1999).
  122. Ding et al. (2006).
  123. Wang et al. (2008).
  124. Žitnik et al. (2013).
  125. Koren et al. (2009).
  126. Gligorijević et al. (2014).
  127. Hwang et al. (2012).
  128. Natarajan and Dhillon (2014).

References

  • Aerts S, Lambrechts D, Maity S, Loo PV, Coessens B, Smet FD, Tranchevent LC, Moor BD, Marynen P, Hassan B, Carmeliet P, Moreau Y (2006) Gene prioritization through genomic data fusion. Nat Biotechnol 24(5):537–544. doi:10.1038/nbt1203

  • Aittokallio T, Schwikowski B (2006) Graph-based methods for analysing networks in cell biology. Brief Bioinform 7(3). doi:10.1093/bib/bbl022

  • Albert R (2005) Scale-free networks in cell biology. J Cell Sci 118(21):4947–4957. doi:10.1242/jcs.02714

  • Ashburn TT, Thor KB (2004) Drug repositioning: identifying and developing new uses for existing drugs. Nat Rev Drug Discov 3(8):673–683. doi:10.1038/nrd1468

  • Ashburner M, Ball CA, Blake JA et al (2000) Gene ontology: tool for the unification of biology. Nat Genet 25:25–29

  • Ayme S, Rath A, Bellet B (2010) WHO International Classification of Diseases (ICD) revision process: incorporating rare diseases into the classification scheme: state of art. Orphanet J Rare Dis 5(Suppl 1):P1. doi:10.1186/1750-1172-5-S1-P1

  • Barabasi AL, Oltvai ZN (2004) Network biology: understanding the cell’s functional organization. Nat Rev Genet 5(2):101–113. doi:10.1038/nrg1272

  • Barrett T, Troup DB, Wilhite SE, Ledoux P, Rudnev D, Evangelista C, Kim IF, Soboleva A, Tomashevsky M, Edgar R (2007) NCBI GEO: mining tens of millions of expression profiles – database and tools update. Nucleic Acids Res 35(Suppl 1):D760–D765. doi:10.1093/nar/gkl887

  • Bebek G, Koyutuerk M, Price ND, Chance MR (2012) Network biology methods integrating biological data for translational science. Brief Bioinform 13:446–459. doi:10.1093/bib/bbr075

  • Ben-Gal I (2007) Bayesian networks. Wiley

  • Boyd S, Vandenberghe L (2004) Convex optimization. Cambridge University Press, New York

  • Bromberg Y (2013) Chapter 15: Disease gene prioritization. PLoS Comput Biol 9(4):e1002902. doi:10.1371/journal.pcbi.1002902

  • Chatr-Aryamontri A, Breitkreutz BJ, Heinicke S et al (2013) The BioGRID interaction database. Nucleic Acids Res 41(D1):D816–D823. doi:10.1093/nar/gks1158

  • Chen Y, Wu X, Jiang R (2013) Integrating human OMICS data to prioritize candidate genes. BMC Med Genomics 6(1):57. doi:10.1186/1755-8794-6-57

  • Consortium AIM (2011) Evidence for network evolution in an Arabidopsis interactome map. Science 333(6042):601–607. doi:10.1126/science.1203877

  • Cooper GF, Herskovits E (1992) A Bayesian method for the induction of probabilistic networks from data. Mach Learn 9(4):309–347. doi:10.1023/A:1022649401552

  • Cornet R, de Keizer N (2008) Forty years of SNOMED: a literature review. BMC Med Inform Decis Mak 8(Suppl 1):S2. doi:10.1186/1472-6947-8-S1-S2

  • Costanzo M, Baryshnikova A, Bellay J et al (2010) The genetic landscape of a cell. Science 327(5964):425–431. doi:10.1126/science.1180823

  • Dahlquist KD, Salomonis N, Vranizan K, Lawlor SC, Conklin BR (2002) GenMAPP, a new tool for viewing and analyzing microarray data on biological pathways. Nat Genet 31(1):19–20. doi:10.1038/ng0502-19

  • Daminelli S, Haupt VJ, Reimann M, Schroeder M (2012) Drug repositioning through incomplete bi-cliques in an integrated drug-target-disease network. Integr Biol 4:778–788. doi:10.1039/C2IB00154C

  • Davis AP, Murphy CG, Johnson R, Lay JM, Lennon-Hopkins K, Saraceni-Richards C, Sciaky D, King BL, Rosenstein MC, Wiegers TC, Mattingly CJ (2013) The comparative toxicogenomics database: update 2013. Nucleic Acids Res 41(D1):D1104–D1114. doi:10.1093/nar/gks994

  • de Silva E, Thorne T, Ingram P, Agrafioti I, Swire J, Wiuf C, Stumpf M (2006) The effects of incomplete protein interaction data on structural and evolutionary inferences. BMC Biol 4(1):39. doi:10.1186/1741-7007-4-39

  • De Smet F, Mathys J, Marchal K, Thijs G, De Moor B, Moreau Y (2002) Adaptive quality-based clustering of gene expression profiles. Bioinformatics 18(5):735–746. doi:10.1093/bioinformatics/18.5.735

  • Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B 39(1):1–38

  • Ding C, Li T, Peng W et al (2006) Orthogonal nonnegative matrix t-factorizations for clustering. In: Proceedings of the 12th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, New York, NY, USA, KDD’06, pp 126–135. doi:10.1145/1150402.1150420

  • Ding H, Takigawa I, Mamitsuka H, Zhu S (2013) Similarity-based machine learning methods for predicting drug-target interactions: a brief review. Brief Bioinform. doi:10.1093/bib/bbt056

  • Dreze M, Monachello D, Lurin C, Cusick ME, Hill DE, Vidal M, Braun P (2010) Chapter 12 – high-quality binary interactome mapping. In: Guthrie JWC, Fink GR (eds) Guide to yeast genetics: functional genomics, proteomics, and other systems analysis, methods in enzymology, vol 470. Academic, pp 281–315. doi:10.1016/S0076-6879(10)70012-4

  • Dudley J, Karczewski K (2013) Exploring personal genomics. OUP, Oxford

  • Dudley JT, Deshpande T, Butte AJ (2011) Exploiting drug-disease relationships for computational drug repositioning. Brief Bioinform. doi:10.1093/bib/bbr013

  • Eungdamrong NJ, Iyengar R (2004) Modeling cell signaling networks. Biol Cell 96(5):355–362. doi:10.1016/j.biolcel.2004.03.004

  • Evers AWM, Rovers MM, Kremer JAM, Veltman JA, Schalken JA, Bloem BR, van Gool AJ (2012) An integrated framework of personalized medicine: from individual genomes to participatory health care. Croat Med J 53(4):301–303

  • Fawcett T (2006) An introduction to ROC analysis. Pattern Recognit Lett 27(8):861–874. doi:10.1016/j.patrec.2005.10.010

  • Franceschini A, Szklarczyk D, Frankild S, Kuhn M, Simonovic M, Roth A, Lin J, Minguez P, Bork P, von Mering C, Jensen LJ (2013) STRING v9.1: protein-protein interaction networks, with increased coverage and integration. Nucleic Acids Res 41(D1):D808–D815. doi:10.1093/nar/gks1094

  • Gatza ML, Lucas JE, Barry WT, Kim JW, Wang Q, Crawford MD, Datto MB, Kelley M, Mathey-Prevot B, Potti A, Nevins JR (2010) A pathway-based classification of human breast cancer. Proc Natl Acad Sci U S A 107(15):6994–6999. doi:10.1073/pnas.0912708107

  • Gavin A, Aloy P, Grandi P et al (2006) Proteome survey reveals modularity of the yeast cell machinery. Nature 440(7084):631–636. doi:10.1038/nature04532

  • Ge H, Walhout AJ, Vidal M (2003) Integrating OMIC information: a bridge between genomics and systems biology. Trends Genet 19(10):551–560. doi:10.1016/j.tig.2003.08.009

  • Gevaert O, Smet FD, Timmerman D, Moreau Y, Moor BD (2006) Predicting the prognosis of breast cancer by integrating clinical and microarray data with Bayesian networks. Bioinformatics 22(14):e184–e190. doi:10.1093/bioinformatics/btl230

  • Giot L, Bader JS, Brouwer C, Chaudhuri A et al (2003) A protein interaction map of Drosophila melanogaster. Science 302(5651):1727–1736. doi:10.1126/science.1090289

  • Gligorijević V, Janjić V, Pržulj N (2014) Integration of molecular network data reconstructs Gene Ontology. Bioinformatics 30(17):i594–i600. doi:10.1093/bioinformatics/btu470

  • Goh KI, Cusick ME, Valle D, Childs B, Vidal M, Barabási AL (2007) The human disease network. Proc Natl Acad Sci 104(21):8685–8690. doi:10.1073/pnas.0701361104

  • Gomez-Cabrero D, Abugessaisa I, Maier D, Teschendorff A, Merkenschlager M, Gisel A, Ballestar E, Bongcam-Rudloff E, Conesa A, Tegnér J (2014) Data integration in the era of OMICS: current and future challenges. BMC Syst Biol 8(Suppl 2):I1. doi:10.1186/1752-0509-8-S2-I1

  • Gunther S, Kuhn M, Dunkel M, Campillos M, Senger C, Petsalaki E, Ahmed J, Urdiales EG, Gewiess A, Jensen LJ, Schneider R, Skoblo R, Russell RB, Bourne PE, Bork P, Preissner R (2008) SuperTarget and Matador: resources for exploring drug-target relationships. Nucleic Acids Res 36(Suppl 1):D919–D922. doi:10.1093/nar/gkm862

  • Hamburg MA, Collins FS (2010) The path to personalized medicine. N Engl J Med 363(4):301–304. doi:10.1056/NEJMp1006304

  • Hamosh A, Scott AF, Amberger JS, Bocchini CA, McKusick VA (2005) Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders. Nucleic Acids Res 33(Suppl 1):D514–D517. doi:10.1093/nar/gki033

  • Higham DJ, Rašajski M, Pržulj N (2008) Fitting a geometric graph to a protein–protein interaction network. Bioinformatics 24(8):1093–1099. doi:10.1093/bioinformatics/btn079

  • Huang YF, Yeh HY, Soo VW (2013) Inferring drug-disease associations from integration of chemical, genomic and phenotype data using network propagation. BMC Med Genomics 6(3):1–14. doi:10.1186/1755-8794-6-S3-S4

  • Hubble J, Demeter J, Jin H et al (2009) Implementation of GenePattern within the Stanford Microarray Database. Nucleic Acids Res 37(1):D898–D901. doi:10.1093/nar/gkn786

  • Hurle MR, Yang L, Xie Q, Rajpal DK, Sanseau P, Agarwal P (2013) Computational drug repositioning: from data to therapeutics. Clin Pharmacol Ther 93(4):335–341. doi:10.1038/clpt.2013.1

  • Hwang T, Atluri G, Xie M, Dey S, Hong C, Kumar V, Kuang R (2012) Co-clustering phenome–genome for phenotype classification and disease gene discovery. Nucleic Acids Res 40(19):e146. doi:10.1093/nar/gks615

  • Ito T, Tashiro K, Muta S et al (2000) Toward a protein-protein interaction map of the budding yeast: a comprehensive system to examine two-hybrid interactions in all possible combinations between the yeast proteins. Proc Natl Acad Sci 97(3):1143–1147. doi:10.1073/pnas.97.3.1143

  • Kanehisa M, Goto S, Hattori M, Aoki-Kinoshita KF, Itoh M, Kawashima S, Katayama T, Araki M, Hirakawa M (2006) From genomics to chemical genomics: new developments in KEGG. Nucleic Acids Res 34(Suppl 1):D354–D357. doi:10.1093/nar/gkj102

  • Kanehisa M, Goto S, Sato Y, Furumichi M, Tanabe M (2012) KEGG for integration and interpretation of large-scale molecular data sets. Nucleic Acids Res 40(D1):D109–D114. doi:10.1093/nar/gkr988

  • Keshava Prasad TS, Goel R, Kandasamy K, Keerthikumar S, Kumar S, Mathivanan S, Telikicherla D, Raju R, Shafreen B, Venugopal A, Balakrishnan L, Marimuthu A, Banerjee S, Somanathan DS, Sebastian A, Rani S, Ray S, Harrys Kishore CJ, Kanth S, Ahmed M, Kashyap MK, Mohmood R, Ramachandra YL, Krishna V, Rahiman BA, Mohan S, Ranganathan P, Ramabadran S, Chaerkady R, Pandey A (2009) Human protein reference database 2009 update. Nucleic Acids Res 37(Suppl 1):D767–D772. doi:10.1093/nar/gkn892

  • Köhler S, Bauer S, Horn D, Robinson PN (2008) Walking the interactome for prioritization of candidate disease genes. Am J Hum Genet 82(4):949–958. doi:10.1016/j.ajhg.2008.02.013

  • Koren Y, Bell R, Volinsky C (2009) Matrix factorization techniques for recommender systems. Computer 42(8):30–37. doi:10.1109/MC.2009.263

  • Krogan N, Cagney G, Yu H, Zhong G et al (2006) Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature 440:637–643. doi:10.1038/nature04670

  • Kuhn M, Campillos M, Letunic I, Jensen LJ, Bork P (2010) A side effect resource to capture phenotypic effects of drugs. Mol Syst Biol 6(1). doi:10.1038/msb.2009.98

  • Lamb J, Crawford ED, Peck D, Modell JW, Blat IC, Wrobel MJ, Lerner J, Brunet JP, Subramanian A, Ross KN, Reich M, Hieronymus H, Wei G, Armstrong SA, Haggarty SJ, Clemons PA, Wei R, Carr SA, Lander ES, Golub TR (2006) The connectivity map: using gene-expression signatures to connect small molecules, genes, and disease. Science 313(5795):1929–1935. doi:10.1126/science.1132939

  • Lanckriet G, Deng M, Cristianini N, Jordan M, Noble W (2004a) Kernel-based data fusion and its application to protein function prediction in yeast. In: Biocomputing 2004, Proceedings of the Pacific Symposium, Hawaii, USA. World Scientific, pp 300–311

  • Lanckriet GRG, De Bie T, Cristianini N et al (2004b) A statistical framework for genomic data fusion. Bioinformatics 20(16):2626–2635. doi:10.1093/bioinformatics/bth294

  • Lee DD, Seung HS (1999) Learning the parts of objects by non-negative matrix factorization. Nature 401(6755):788–791

  • Lee I, Date SV, Adai AT, Marcotte EM (2004) A probabilistic functional network of yeast genes. Science 306(5701):1555–1558. doi:10.1126/science.1099511

  • Lee DS, Park J, Kay KA, Christakis NA, Oltvai ZN, Barabási AL (2008) The implications of human metabolic network topology for disease comorbidity. Proc Natl Acad Sci 105(29):9880–9885. doi:10.1073/pnas.0802208105

  • Li S, Armstrong C, Bertin N et al (2004) A map of the interactome network of the metazoan C. elegans. Science 303(5657):540–543. doi:10.1126/science.1091403

  • Linghu B, Snitkin E, Hu Z, Xia Y, DeLisi C (2009) Genome-wide prioritization of disease genes and identification of disease-disease associations from an integrated human functional linkage network. Genome Biol 10(9):R91. doi:10.1186/gb-2009-10-9-r91

  • Lu LJ, Xia Y, Paccanaro A, Yu H, Gerstein M (2005) Assessing the limits of genomic data integration for predicting protein networks. Genome Res 15(7):945–953. doi:10.1101/gr.3610305

  • Luo F, Yang Y, Zhong J, Gao H, Khan L, Thompson D, Zhou J (2007) Constructing gene co-expression networks and predicting functions of unknown genes by random matrix theory. BMC Bioinform 8(1):299. doi:10.1186/1471-2105-8-299

  • Ma X, Chen T, Sun F (2013) Integrative approaches for predicting protein function and prioritizing genes for complex phenotypes using protein interaction networks. Brief Bioinform. doi:10.1093/bib/bbt041

  • Mani R, St Onge RP, Hartman JL, Giaever G, Roth FP (2008) Defining genetic interaction. Proc Natl Acad Sci 105(9):3461–3466. doi:10.1073/pnas.0712255105

  • Marioni JC, Mason CE, Mane SM, Stephens M, Gilad Y (2008) RNA-seq: an assessment of technical reproducibility and comparison with gene expression arrays. Genome Res 18:1509–1517

  • McCarthy MI, Abecasis GR, Cardon LR, Goldstein DB, Little J, Ioannidis JPA, Hirschhorn JN (2008) Genome-wide association studies for complex traits: consensus, uncertainty and challenges. Nat Rev Genet 9(5):356–369. doi:10.1038/nrg2344

  • Milenković T, Pržulj N (2008) Uncovering Biological Network Function via Graphlet Degree Signatures. Cancer Inform 6:00. doi:10.4137/CIN.S680

  • Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-seq. Nat Methods 5(7):621–628. doi:10.1038/nmeth.1226

  • Napolitano F, Zhao Y, Moreira VM, Tagliaferri R, Kere J, D’Amato M, Greco D (2013) Drug repositioning: a machine-learning approach through data integration. J Cheminform 5:30

  • Nariai N, Kolaczyk ED, Kasif S (2007) Probabilistic protein function prediction from heterogeneous genome-wide data. PLoS One 2(3):e337. doi:10.1371/journal.pone.0000337

  • Natarajan N, Dhillon IS (2014) Inductive matrix completion for predicting gene-disease associations. Bioinformatics 30(12):i60–i68. doi:10.1093/bioinformatics/btu269

  • Nelson S, Schopen M, Savage A, Schulman J, Arluk N (2004) The MeSH translation maintenance system: structure, interface design, and implementation. In: Proceedings of the 11th World Congress on Medical Informatics, IOS Press, pp 67–69

  • Newman M (2010) Networks: an introduction. Oxford University Press, Inc., New York

  • Ooi SL, Shoemaker DD, Boeke JD (2003) DNA helicase gene interaction network defined using synthetic lethality analyzed by microarray. Nat Genet 35(3):277–286

  • Osborne J, Flatow J, Holko M, Lin S, Kibbe W, Zhu L, Danila M, Feng G, Chisholm R (2009) Annotating the human genome with disease ontology. BMC Genomics 10(Suppl 1):S6. doi:10.1186/1471-2164-10-S1-S6

  • Parkinson H, Sarkans U, Shojatalab M, Abeygunawardena N, Contrino S, Coulson R, Farne A, Garcia Lara G, Holloway E, Kapushesky M, Lilja P, Mukherjee G, Oezcimen A, Rayner T, Rocca-Serra P, Sharma A, Sansone S, Brazma A (2005) ArrayExpress – a public repository for microarray gene expression data at the EBI. Nucleic Acids Res 33(Suppl 1):D553–D555. doi:10.1093/nar/gki056

  • Prieto C, Risueño A, Fontanillo C, De Las Rivas J (2008) Human gene coexpression landscape: confident network derived from tissue transcriptomic profiles. PLoS One 3(12):e3911. doi:10.1371/journal.pone.0003911

  • Pržulj N (2011) Protein-protein interactions: making sense of networks via graph-theoretic modeling. Bioessays 33(2):115–123. doi:10.1002/bies.201000044

  • Pržulj N, Corneil DG, Jurisica I (2004) Modeling interactome: scale-free or geometric? Bioinformatics 20(18):3508–3515. doi:10.1093/bioinformatics/bth436

  • Quackenbush J (2001) Computational analysis of microarray data. Nat Rev Genet 2(6):418–427. doi:10.1038/35076576

  • Quinn BJ, Kitagawa H, Memmott RM, Gills JJ, Dennis PA (2013) Repositioning metformin for cancer prevention and treatment. Trends Endocrinol Metab 24(9):469–480

  • Sarajlić A, Pržulj N (2014) Survey of network-based approaches to research of cardiovascular diseases. Biomed Res Int 2014:527029

  • Schacherer F, Choi C, Götze U, Krull M, Pistor S, Wingender E (2001) The TRANSPATH signal transduction database: a knowledge base on signal transduction networks. Bioinformatics 17(11):1053–1057. doi:10.1093/bioinformatics/17.11.1053

  • Schadt EE, Woo S, Hao K (2012) Bayesian method to predict individual SNP genotypes from gene expression data. Nat Genet 44(5):603–608. doi:10.1038/ng.2248

  • Scholkopf B, Smola AJ (2001) Learning with kernels: support vector machines, regularization, optimization, and beyond. MIT Press, Cambridge

  • Schomburg I, Chang A, Placzek S, Söhngen C, Rother M, Lang M, Munaretto C, Ulas S, Stelzer M, Grote A, Scheer M, Schomburg D (2013) BRENDA in 2013: integrated reactions, kinetic data, enzyme function data, improved disease classification: new options and contents in BRENDA. Nucleic Acids Res 41(D1):D764–D772. doi:10.1093/nar/gks1049

  • Schriml LM, Arze C, Nadendla S, Chang YWW, Mazaitis M, Felix V, Feng G, Kibbe WA (2012) Disease ontology: a backbone for disease semantic integration. Nucleic Acids Res 40(D1):D940–D946. doi:10.1093/nar/gkr972

  • Schuster S, Fell DA, Dandekar T (2000) A general definition of metabolic pathways useful for systematic organization and analysis of complex metabolic networks. Nat Biotechnol 18(3):326–332. doi:10.1038/73786

  • Sharan R, Ulitsky I, Shamir R (2007) Network-based prediction of protein function. Mol Syst Biol 3(1). doi:10.1038/msb4100129

  • Simonis N, Rual JFF, Carvunis ARR et al (2009) Empirically controlled mapping of the Caenorhabditis elegans protein-protein interactome network. Nat Methods 6(1):47–54

  • Sioutos N, de Coronado S, Haber MW, Hartel FW, Shaiu WL, Wright LW (2007) NCI thesaurus: a semantic model integrating cancer-related clinical and molecular information. J Biomed Inform 40(1):30–43

  • Sladek R, Rocheleau G, Rung J, Dina C, Shen L, Serre D, Boutin P, Vincent D, Belisle A, Hadjadj S (2007) A genome-wide association study identifies novel risk loci for type 2 diabetes. Nature 445(7130):881–885. doi:10.1038/nature05616

  • Smith CA, Want EJ, O’Maille G, Abagyan R, Siuzdak G (2006) XCMS: processing mass spectrometry data for metabolite profiling using nonlinear peak alignment, matching, and identification. Anal Chem 78(3):779–787. doi:10.1021/ac051437y

  • Soyer OS, Salathé M, Bonhoeffer S (2006) Signal transduction networks: topology, response and biochemical processes. J Theor Biol 238(2):416–425. doi:10.1016/j.jtbi.2005.05.030

  • Stelzl U, Worm U, Lalowski M et al (2005) A human protein-protein interaction network: a resource for annotating the proteome. Cell 122(6):957–968. doi:10.1016/j.cell.2005.08.029

  • The Wellcome Trust Case Control Consortium (2007) Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447(7145):661–678. doi:10.1038/nature05911

  • Tong AH, Evangelista M, Parsons AB, Xu H, Bader GD, Pagé N, Robinson M, Raghibizadeh S, Hogue CW, Bussey H, Andrews B, Tyers M, Boone C (2001) Systematic genetic analysis with ordered arrays of yeast deletion mutants. Science 294(5550):2364–2368. doi:10.1126/science.1065810

  • Trenkwalder C, Garcia-Borreguero D, Montagna P, Lainey E, de Weerd AW, Tidswell P, Saletu-Zyhlarz G, Telstad W, Ferini-Strambi L (2004) Ropinirole in the treatment of restless legs syndrome: results from the TREAT RLS 1 study, a 12 week, randomised, placebo controlled study in 10 European countries. J Neurol Neurosurg Psychiatry 75(1):92–97

  • Troyanskaya OG, Dolinski K, Owen AB, Altman RB, Botstein D (2003) A Bayesian framework for combining heterogeneous data sources for gene function prediction (in Saccharomyces cerevisiae). Proc Natl Acad Sci 100(14):8348–8353. doi:10.1073/pnas.0832373100

  • Uetz P, Giot L, Cagney G et al (2000) A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature 403(6770):623–627. doi:10.1038/35001009

  • Vanunu O, Magger O, Ruppin E, Shlomi T, Sharan R (2010) Associating genes and protein complexes with disease via network propagation. PLoS Comput Biol 6(1):e1000641. doi:10.1371/journal.pcbi.1000641

  • Vidal M, Cusick ME, Barabási AL (2011) Interactome networks and human disease. Cell 144(6):986–998

  • Wang W, Zhou H, Lin H, Roy S, Shaler TA, Hill LR, Norton S, Kumar P, Anderle M, Becker CH (2003) Quantification of proteins and metabolites by mass spectrometry without isotopic labeling or spiked standards. Anal Chem 75(18):4818–4826. doi:10.1021/ac026468x

  • Wang F, Li T, Zhang C (2008) Semi-supervised clustering via matrix factorization. In: SDM, SIAM, pp 1–12

  • Wang Z, Gerstein M, Snyder M (2009) RNA-seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10(1):57–63. doi:10.1038/nrg2484

  • Wang H, Huang H, Ding C (2011) Simultaneous clustering of multi-type relational data via symmetric nonnegative matrix tri-factorization. In: Proceedings of the 20th ACM International Conference on Information and Knowledge Management, ACM, New York, NY, USA, CIKM’11, pp 279–284. doi:10.1145/2063576.2063621

  • Wang Y, Chen S, Deng N, Wang Y (2013) Drug repositioning by kernel-based integration of molecular structure, molecular activity, and phenotype data. PLoS One 8(11):e78518. doi:10.1371/journal.pone.0078518

  • West DB (2000) Introduction to graph theory, 2nd edn. Prentice Hall

  • Whisstock JC, Lesk AM (2003) Prediction of protein function from protein sequence and structure. Q Rev Biophys 36:307–340. doi:10.1017/S0033583503003901

  • Wishart DS, Knox C, Guo AC, Cheng D, Shrivastava S, Tzur D, Gautam B, Assanali M (2008) DrugBank: a knowledgebase for drugs, drug actions and drug targets. Nucl Acids Res 36(Suppl 1):D901–D906. doi:10.1093/nar/gkm958

  • Wodak SJ, Pu S, Vlasblom J, Séraphin B (2009) Challenges and rewards of interaction proteomics. Mol Cell Proteomics 8(1):3–18. doi:10.1074/mcp.R800014-MCP200

  • Wu Z, Wang Y, Chen L (2013) Network-based drug repositioning. Mol Biosyst 9:1268–1281. doi:10.1039/C3MB25382A

  • Yamanishi Y, Araki M, Gutteridge A, Honda W, Kanehisa M (2008) Prediction of drug-target interaction networks from the integration of chemical and genomic spaces. Bioinformatics 24(13):i232–i240. doi:10.1093/bioinformatics/btn162

  • Yildirim MA, Goh KI, Cusick ME, Barabási AL, Vidal M (2007) Drug-target network. Nat Biotechnol 25(10):1119–1126

  • Yu S, Tranchevent LC, De Moor B, Moreau Y (2011) Kernel-based data fusion for machine learning: methods and applications in bioinformatics and text mining. Studies in Computational Intelligence, vol 345. Springer

  • Zhang S, Li Q, Liu J et al (2011) A novel computational framework for simultaneous integration of multiple types of genomic data to identify microRNA-gene regulatory modules. Bioinformatics 27(13):i401–i409. doi:10.1093/bioinformatics/btr206

  • Zhang P, Agarwal P, Obradovic Z (2013) Computational drug repositioning by ranking and integrating multiple data sources. In: Blockeel H, Kersting K, Nijssen S, Železný F (eds) Machine learning and knowledge discovery in databases. Lecture Notes in Computer Science, vol 8190. Springer, Heidelberg, pp 579–594. doi:10.1007/978-3-642-40994-3_37

  • Zhou T (2013) Computational reconstruction of metabolic networks from KEGG. In: Reisfeld B, Mayeno AN (eds) Computational toxicology. Methods in Molecular Biology, vol 930. Humana Press, pp 235–249. doi:10.1007/978-1-62703-059-5_10

  • Zhu D, Hero AO, Cheng H et al (2005) Network constrained clustering for gene microarray data. Bioinformatics 21(21):4014–4020. doi:10.1093/bioinformatics/bti655

  • Žitnik M, Janjić V, Larminie C et al (2013) Discovering disease-disease associations by fusing systems-level molecular data. Sci Rep 3:3202. doi:10.1038/srep03202

  • Bar-Joseph Z, Gerber GK, Lee TI, Rinaldi NJ, Yoo JY, Robert F, Gordon DB, Fraenkel E, Jaakkola TS, Young RA, Gifford DK (2003) Computational discovery of gene modules and regulatory networks. Nat Biotechnol 21(11):1337–1342

Acknowledgments

This work was supported by the European Research Council (ERC) Starting Independent Researcher Grant 278212, the National Science Foundation (NSF) Cyber-Enabled Discovery and Innovation (CDI) OIA-1028394, the Serbian Ministry of Education and Science Project III44006, and ARRS project J1-5454.

Author information

Corresponding author

Correspondence to Nataša Pržulj, Ph.D.

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Gligorijević, V., Pržulj, N. (2016). Computational Methods for Integration of Biological Data. In: Bodiroga-Vukobrat, N., Rukavina, D., Pavelić, K., Sander, G. (eds) Personalized Medicine. Europeanization and Globalization, vol 2. Springer, Cham. https://doi.org/10.1007/978-3-319-39349-0_8

  • DOI: https://doi.org/10.1007/978-3-319-39349-0_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-39347-6

  • Online ISBN: 978-3-319-39349-0

  • eBook Packages: Law and Criminology, Law and Criminology (R0)
