Recipes for Translating Big Data Machine Reading to Executable Cellular Signaling Models
Abstract
Biological literature is rich in mechanistic information that can be used to construct executable models of complex systems and so increase our understanding of health and disease. However, the literature is vast and fragmented, so both the extraction of information from papers and the assembly of models from the extracted information must be automated. We describe here our approach for translating machine reading outputs, obtained by reading biological signaling literature, to discrete models of cellular networks. We use outputs from three different reading engines, and demonstrate the translation of different features using examples from cancer literature. We also outline several issues that still arise when assembling cellular network models from the outputs of state-of-the-art reading engines. Finally, we illustrate the details of our approach with a case study in pancreatic cancer.
Keywords
Machine reading; Big data in literature; Text mining; Cell signaling networks; Automated model generation