Abstract
Some terminologies and ontologies, such as SNOMED CT, allow for post-coordinated as well as pre-coordinated expressions. Post-coordinated expressions are, essentially, small segments of the terminology graphs. Compositional expressions add logical and linguistic relations to the standard technique of post-coordination. Indexing medical text requires storing many instances of compositional expressions, and retrieval over that index requires searching for both entire compositional expressions and sub-parts of those expressions. Retrieval thus becomes a small-graph query against a large collection of small graphs, further complicated by the need to also match sub-graphs within that collection. Previous systems that use compositional expressions, such as iNLP, stored the index in a relational database. We compare the retrieval characteristics of relational databases, triplestores, and general graph databases to determine which is most efficient for this task.
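The retrieval problem described above can be illustrated with a minimal in-memory sketch. It is not the method evaluated in the paper and does not reflect any of the compared database systems; the triple representation, the `?`-prefixed variable convention, the concept and relation labels, and the names `match`, `search`, and `INDEX` are all assumptions made for illustration. Each compositional expression is held as a small set of (subject, relation, object) triples, and a query pattern matches an expression when every pattern triple can be found in it under one consistent variable binding.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

Triple = Tuple[str, str, str]   # (subject, relation, object)
Graph = FrozenSet[Triple]       # one small compositional expression


def is_var(term: str) -> bool:
    """Pattern terms starting with '?' are variables (a convention assumed here)."""
    return term.startswith("?")


def match(pattern: List[Triple], graph: Graph,
          binding: Optional[Dict[str, str]] = None) -> Optional[Dict[str, str]]:
    """Return one binding under which every pattern triple occurs in graph, else None.

    Simple backtracking; two different variables may bind to the same node.
    """
    binding = binding or {}
    if not pattern:
        return binding
    (s, r, o), rest = pattern[0], pattern[1:]
    for gs, gr, go in graph:
        if gr != r:
            continue
        trial = dict(binding)
        ok = True
        for p, g in ((s, gs), (o, go)):
            if is_var(p):
                # Bind the variable, or check it against an earlier binding.
                if trial.setdefault(p, g) != g:
                    ok = False
                    break
            elif p != g:
                ok = False
                break
        if ok:
            result = match(rest, graph, trial)
            if result is not None:
                return result
    return None


def search(index: Dict[str, Graph], pattern: List[Triple]) -> List[str]:
    """Return the ids of all stored expressions that contain the pattern."""
    return sorted(i for i, g in index.items() if match(pattern, g) is not None)


# Hypothetical index: labels stand in for real terminology concept identifiers.
INDEX: Dict[str, Graph] = {
    "e1": frozenset({("fracture", "finding_site", "femur"),
                     ("fracture", "laterality", "left")}),
    "e2": frozenset({("fracture", "finding_site", "tibia")}),
    "e3": frozenset({("pain", "finding_site", "femur")}),
}
```

For example, `search(INDEX, [("?x", "finding_site", "femur")])` matches a sub-part shared by two expressions, while passing all triples of one stored expression as the pattern retrieves that entire expression. The linear scan over the whole collection is what a database-backed index is meant to avoid.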
Notes
- 1. Appropriate table indexes were created to speed execution as much as possible.
- 2.
- 3. All tests were run on a laptop with a Core i7 4600U CPU, 16 GB of RAM, and an SSD. Evaluation code was run in a VirtualBox VM running Ubuntu 14.04.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Schlegel, D.R., Bona, J.P., Elkin, P.L. (2016). Comparing Small Graph Retrieval Performance for Ontology Concepts in Medical Texts. In: Wang, F., Luo, G., Weng, C., Khan, A., Mitra, P., Yu, C. (eds) Biomedical Data Management and Graph Online Querying. Big-O(Q) DMAH 2015. Lecture Notes in Computer Science, vol 9579. Springer, Cham. https://doi.org/10.1007/978-3-319-41576-5_3
DOI: https://doi.org/10.1007/978-3-319-41576-5_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41575-8
Online ISBN: 978-3-319-41576-5
eBook Packages: Computer Science (R0)