Strategic Integration of Multiple Bioinformatics Resources for System Level Analysis of Biological Networks

D’Souza, Mark; Sulakhe, Dinanath; Wang, Sheng; Xie, Bing; Hashemifar, Somaye; Taylor, Andrew; Dubchak, Inna; Conrad Gilliam, T.; Maltsev, Natalia

doi:10.1007/978-1-4939-7027-8_5

Part of the book series: Methods in Molecular Biology (MIMB, volume 1613)

Abstract

Recent technological advances in genomics allow the production of biological data at unprecedented tera- and petabyte scales. Efficient mining of these vast and complex datasets for the needs of biomedical research critically depends on a seamless integration of the clinical, genomic, and experimental information with prior knowledge about genotype-phenotype relationships. Such experimental data accumulated in publicly available databases should be accessible to a variety of algorithms and analytical pipelines that drive computational analysis and data mining.

We present Lynx (Sulakhe et al., Nucleic Acids Res 44:D882–D887, 2016; http://lynx.cri.uchicago.edu), an integrated computational platform that combines a web-based database with a knowledge extraction engine. It provides advanced search capabilities and a variety of algorithms for enrichment analysis and network-based gene prioritization, and it gives public access to the Lynx integrated knowledge base (LynxKB) and its analytical tools via user-friendly web services and interfaces. The Lynx service-oriented architecture supports annotation and analysis of high-throughput experimental data. Lynx tools assist the user in extracting meaningful knowledge from LynxKB and experimental data, and in generating weighted hypotheses regarding the genes and molecular mechanisms contributing to human phenotypes or conditions of interest. The goal of this integrated platform is to support the end-to-end analytical needs of various translational projects.
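For readers unfamiliar with enrichment analysis, the following is a minimal generic sketch of a one-sided hypergeometric over-representation test, a common way to ask whether an annotation term appears in a gene list more often than chance would predict. It is not taken from Lynx and is not claimed to be the algorithm Lynx uses; the function name, parameter names, and the numbers in the usage lines are all hypothetical and chosen only for illustration.

```python
from math import comb


def enrichment_p_value(population_size, annotated_in_population,
                       sample_size, annotated_in_sample):
    """One-sided hypergeometric p-value: P(overlap >= annotated_in_sample).

    Hypothetical helper for illustration only; not part of Lynx.
    population_size         -- number of genes in the background set
    annotated_in_population -- background genes carrying the annotation
    sample_size             -- number of genes in the query list
    annotated_in_sample     -- query genes carrying the annotation
    """
    # The overlap cannot exceed either the query size or the annotated set.
    max_overlap = min(annotated_in_population, sample_size)
    # Number of ways to draw the query list from the background.
    total = comb(population_size, sample_size)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(
        comb(annotated_in_population, k)
        * comb(population_size - annotated_in_population, sample_size - k)
        for k in range(annotated_in_sample, max_overlap + 1)
    )
    return tail / total


# Illustrative numbers only: 20,000 background genes, 150 annotated with a
# term, a 100-gene query list, 8 of which carry the term.
p = enrichment_p_value(20000, 150, 100, 8)
print(f"enrichment p-value: {p:.3e}")
```

In practice such a test is run once per annotation term, so the resulting p-values generally need a multiple-testing correction before they are interpreted.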

References

Chen J et al (2013) Translational biomedical informatics in the cloud: present and future. Biomed Res Int 2013:658925
Payne PR, Embi PJ, Sen CK (2009) Translational informatics: enabling high-throughput research paradigms. Physiol Genomics 39(3):131–140
Ranganathan S et al (2011) Towards big data science in the decade ahead from ten years of InCoB and the 1st ISCB-Asia Joint Conference. BMC Bioinformatics 12(Suppl 13):S1
Boyd LB et al (2011) The caBIG® Life Science Business Architecture model. Bioinformatics 27(10):1429–1435
Hillman-Jackson J et al (2012) Using Galaxy to perform large-scale interactive data analyses. Curr Protoc Bioinformatics Chapter 10:Unit 10.5
Schuler R et al (2012) A flexible, open, decentralized system for digital pathology networks. Stud Health Technol Inform 175:29–38
Ideker T, Krogan NJ (2012) Differential network biology. Mol Syst Biol 8:565
Koyutürk M (2010) Algorithmic and analytical methods in network biology. Wiley Interdiscip Rev Syst Biol Med 2(3):277–292
Bandyopadhyay S et al (2010) Rewiring of genetic networks in response to DNA damage. Science 330(6009):1385–1389
Chikina MD et al (2009) Global prediction of tissue-specific gene expression and context-dependent gene networks in Caenorhabditis elegans. PLoS Comput Biol 5(6):e1000417
Myers CL, Troyanskaya OG (2007) Context-sensitive data integration and prediction of biological networks. Bioinformatics 23(17):2322–2330
Sharan R, Ideker T (2006) Modeling cellular machinery through biological network comparison. Nat Biotechnol 24(4):427–433
Takemoto K, Kihara K (2013) Modular organization of cancer signaling networks is associated with patient survivability. Biosystems 113(3):149–154
Ideker T, Sharan R (2008) Protein networks in disease. Genome Res 18(4):644–652
Kiemer L, Cesareni G (2007) Comparative interactomics: comparing apples and pears? Trends Biotechnol 25(10):448–454
Nibbe RK et al (2011) Protein-protein interaction networks and subnetworks in the biology of disease. Wiley Interdiscip Rev Syst Biol Med 3(3):357–367
Blank MC et al (2011) Multiple developmental programs are altered by loss of Zic1 and Zic4 to cause Dandy-Walker malformation cerebellar pathogenesis. Development 138(6):1207–1216
Beltrao P, Ryan C, Krogan NJ (2012) Comparative interaction networks: bridging genotype to phenotype. Adv Exp Med Biol 751:139–156
Black DL, Grabowski PJ (2003) Alternative pre-mRNA splicing and neuronal function. Prog Mol Subcell Biol 31:187–216
Ellis JD et al (2012) Tissue-specific alternative splicing remodels protein-protein interaction networks. Mol Cell 46(6):884–892
Greene CS et al (2015) Understanding multicellular function and disease with human tissue-specific networks. Nat Genet 47(6):569–576
Yap K, Makeyev EV (2013) Regulation of gene expression in mammalian nervous system through alternative pre-mRNA splicing coupled with RNA quality control mechanisms. Mol Cell Neurosci 56:420–428
Biamonti G et al (2014) The alternative splicing side of cancer. Semin Cell Dev Biol 32:30–36
Kaida D, Schneider-Poetsch T, Yoshida M (2012) Splicing in oncogenesis and tumor suppression. Cancer Sci 103(9):1611–1616
Zhang J, Manley JL (2013) Misregulation of pre-mRNA alternative splicing in cancer. Cancer Discov 3(11):1228–1237
Wells QS et al (2013) Whole exome sequencing identifies a causal RBM20 mutation in a large pedigree with familial dilated cardiomyopathy. Circ Cardiovasc Genet 6(4):317–326
Stallings-Mann M, Radisky D (2007) Matrix metalloproteinase-induced malignancy in mammary epithelial cells. Cells Tissues Organs 185(1–3):104–110
Sumithra B, Saxena U, Das AB (2016) Alternative splicing within the Wnt signaling pathway: role in cancer development. Cell Oncol (Dordr) 39(1):1–13
Yabas M, Elliott H, Hoyne GF (2016) The role of alternative splicing in the control of immune homeostasis and cellular differentiation. Int J Mol Sci 17(1):3
Schaefer MH et al (2013) Adding protein context to the human protein-protein interaction network to reveal meaningful interactions. PLoS Comput Biol 9(1):e1002860
Shao H et al (2013) Systematically studying kinase inhibitor induced signaling network signatures by integrating both therapeutic and side effects. PLoS One 8(12):e80832
Cordero F et al (2012) Large disclosing the nature of computational tools for the analysis of next generation sequencing data. Curr Top Med Chem 12(12):1320–1330
Hong H et al (2013) Critical role of bioinformatics in translating huge amounts of next-generation sequencing data into personalized medicine. Sci China Life Sci 56(2):110–118
Wang S, Xing J (2013) A primer for disease gene prioritization using next-generation sequencing data. Genomics Inform 11(4):191–199
Warde-Farley D et al (2010) The GeneMANIA prediction server: biological network integration for gene prioritization and predicting gene function. Nucleic Acids Res 38(Web Server issue):W214–W220
Franceschini A et al (2013) STRING v9.1: protein-protein interaction networks, with increased coverage and integration. Nucleic Acids Res 41(Database issue):D808–D815
Szklarczyk D et al (2011) The STRING database in 2011: functional interaction networks of proteins, globally integrated and scored. Nucleic Acids Res 39(Database issue):D561–D568
Chen J et al (2009) ToppGene Suite for gene list enrichment analysis and candidate gene prioritization. Nucleic Acids Res 37(Web Server issue):W305–W311
Tranchevent LC et al (2008) ENDEAVOUR update: a web resource for gene prioritization in multiple species. Nucleic Acids Res 36(Web Server issue):W377–W384
Sifrim A et al (2013) eXtasy: variant prioritization by genomic data fusion. Nat Methods 10(11):1083–1084
Wu J, Li Y, Jiang R (2014) Integrating multiple genomic data to predict disease-causing nonsynonymous single nucleotide variants in exome sequencing studies. PLoS Genet 10(3):e1004237
Jäger M et al (2014) Jannovar: a java library for exome annotation. Hum Mutat 35(5):548–555
Li MX et al (2012) A comprehensive framework for prioritizing variants in exome sequencing studies of Mendelian diseases. Nucleic Acids Res 40(7):e53
Calabrese C et al (2014) MToolBox: a highly automated pipeline for heteroplasmy annotation and prioritization analysis of human mitochondrial variants in high-throughput sequencing. Bioinformatics 30(21):3115–3117
Yao J et al (2014) FamAnn: an automated variant annotation pipeline to facilitate target discovery for family-based sequencing studies. Bioinformatics 30(8):1175–1176
Li X, Montgomery SB (2013) Detection and impact of rare regulatory variants in human disease. Front Genet 4:67
Matthews LR et al (2001) Identification of potential interaction networks using sequence-based searches for conserved protein-protein interactions or "interologs". Genome Res 11(12):2120–2126
Yu H et al (2004) Annotation transfer between genomes: protein-protein interologs and protein-DNA regulogs. Genome Res 14(6):1107–1118
Mewes HW et al (2011) MIPS: curated databases and comprehensive secondary data resources in 2010. Nucleic Acids Res 39(Database issue):D220–D224
St Onge RP et al (2007) Systematic pathway analysis using high-resolution fitness profiling of combinatorial gene deletions. Nat Genet 39(2):199–206
Bakal C et al (2008) Phosphorylation networks regulating JNK activity in diverse genetic backgrounds. Science 322(5900):453–456
Lage K et al (2008) A large-scale analysis of tissue-specific pathology and gene expression of human disease genes and complexes. Proc Natl Acad Sci U S A 105(52):20870–20875
Zuberi K et al (2013) GeneMANIA prediction server 2013 update. Nucleic Acids Res 41(Web Server issue):W115–W122
Kamburov A et al (2013) The ConsensusPathDB interaction database: 2013 update. Nucleic Acids Res 41(Database issue):D793–D800
Niu Y, Otasek D, Jurisica I (2010) Evaluation of linguistic features useful in extraction of interactions from PubMed; application to annotating known, high-throughput and predicted interactions in I2D. Bioinformatics 26(1):111–119
Hu Z et al (2013) VisANT 4.0: Integrative network platform to connect genes, drugs, diseases and therapies. Nucleic Acids Res 41(Web Server issue):W225–W231
Elefsinioti A et al (2011) Large-scale de novo prediction of physical protein-protein association. Mol Cell Proteomics 10(11):M111–010629
Patil A, Nakai K, Nakamura H (2011) HitPredict: a database of quality assessed protein-protein interactions in nine species. Nucleic Acids Res 39(Database issue):D744–D749
Balaji S et al (2012) IMID: integrated molecular interaction database. Bioinformatics 28(5):747–749
Wong AK et al (2012) IMP: a multi-species functional genomics portal for integration, visualization and prediction of protein functions and networks. Nucleic Acids Res 40(Web Server issue):W484–W490
Tamames J, de Lorenzo V (2010) EnvMine: a text-mining system for the automatic extraction of contextual information. BMC Bioinformatics 11:294
Gerner M et al (2012) BioContext: an integrated text mining system for large-scale extraction and contextualization of biomolecular events. Bioinformatics 28(16):2154–2161
Kahn AB et al (2007) SpliceMiner: a high-throughput database implementation of the NCBI Evidence Viewer for microarray splice variant analysis. BMC Bioinformatics 8:75
Thanaraj TA et al (2004) ASD: the Alternative Splicing Database. Nucleic Acids Res 32(Database issue):D64–D69
Latham KE (2006) The Primate Embryo Gene Expression Resource in embryology and stem cell biology. Reprod Fertil Dev 18(8):807–810
Sulakhe D et al (2016) Lynx: a knowledge base and an analytical workbench for integrative medicine. Nucleic Acids Res 44(D1):D882–D887
Lukashin I et al (2011) VISTA Region Viewer (RViewer)--a computational system for prioritizing genomic intervals for biomedical studies. Bioinformatics 27(18):2595–2597
Xie B et al (2012) Prediction of candidate genes for neuropsychiatric disorders using feature-based enrichment. Proceedings of the ACM conference on bioinformatics, computational biology and biomedicine, Association for Computing Machinery, pp 564–566
Frazer KA et al (2004) VISTA: computational tools for comparative genomics. Nucleic Acids Res 32(Web Server issue):W273–W279
Nitsch D et al (2011) PINTA: a web server for network-based gene prioritization from expression data. Nucleic Acids Res 39(Web Server issue):W334–W338
Xie B et al (2015) Disease gene prioritization using network and feature. J Comput Biol 22(4):313–323
Xie B et al (2013) Conditional random field for candidate gene prioritization. Proceedings of the international conference on bioinformatics, computational biology and biomedical informatics, Association for Computing Machinery, p 700
Dubchak I et al (2014) An integrative computational approach for prioritization of genomic variants. PLoS One 9(12):e114903
Nitsch D et al (2010) Candidate gene prioritization by network analysis of differential expression using machine learning approaches. BMC Bioinformatics 11:460
Källberg M et al (2012) Template-based protein structure modeling using the RaptorX web server. Nat Protoc 7(8):1511–1522
Rosenbloom KR et al (2015) The UCSC genome browser database: 2015 update. Nucleic Acids Res 43(Database issue):D670–D681
Mirzaa GM et al (2014) The developmental brain disorders database (DBDB): a curated neurogenetics knowledge base with clinical and research applications. Am J Med Genet A 164A(6):1503–1511

Author information

Authors and Affiliations

Department of Human Genetics, University of Chicago, 920 E. 58th Street, Chicago, IL, 60637, USA
Mark D’Souza, Dinanath Sulakhe, Sheng Wang, Bing Xie, Andrew Taylor, T. Conrad Gilliam & Natalia Maltsev
Argonne National Laboratory, Building 221, Room: A142, 9700 South Cass Avenue, Argonne, IL, 60439, USA
Mark D’Souza
Computation Institute, University of Chicago, 5735 S. Ellis Avenue, Chicago, IL, 60637, USA
Dinanath Sulakhe, T. Conrad Gilliam & Natalia Maltsev
Toyota Technological Institute at Chicago, 6045 S. Kenwood Avenue, Chicago, IL, 60637, USA
Sheng Wang & Somaye Hashemifar
Department of Computer Science, Illinois Institute of Technology, Chicago, IL, 60616, USA
Bing Xie
Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA, and Department of Energy Joint Genome Institute, Walnut Creek, CA, USA
Inna Dubchak

Corresponding author

Correspondence to Mark D’Souza.

Editor information

Editors and Affiliations

Keck School of Medicine, University of Southern California, Los Angeles, California, USA
Tatiana V. Tatarinova
Prosapia Genetics, Solana Beach, California, USA
Yuri Nikolsky

About this protocol

Cite this protocol

D’Souza, M. et al. (2017). Strategic Integration of Multiple Bioinformatics Resources for System Level Analysis of Biological Networks. In: Tatarinova, T., Nikolsky, Y. (eds) Biological Networks and Pathway Analysis. Methods in Molecular Biology, vol 1613. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-7027-8_5

DOI: https://doi.org/10.1007/978-1-4939-7027-8_5
Published: 29 August 2017
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-7025-4
Online ISBN: 978-1-4939-7027-8
eBook Packages: Springer Protocols
