Comparing Small Graph Retrieval Performance for Ontology Concepts in Medical Texts

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9579)

Abstract

Some terminologies and ontologies, such as SNOMED CT, allow for post-coordinated as well as pre-coordinated expressions. Post-coordinated expressions are, essentially, small segments of the terminology graphs. Compositional expressions add logical and linguistic relations to the standard technique of post-coordination. In indexing medical text, many instances of compositional expressions must be stored, and in performing retrieval on that index, entire compositional expressions and sub-parts of those expressions must be searched. The problem becomes a small graph query against a large collection of small graphs. This is further complicated by the need to also find sub-graphs from a collection of small graphs. In previous systems using compositional expressions, such as iNLP, the index was stored in a relational database. We compare retrieval characteristics of relational databases, triplestores, and general graph databases to determine which is most efficient for the task at hand.
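
The retrieval problem described in the abstract can be illustrated with a minimal in-memory sketch. It is not the paper's implementation and does not model any of the compared database systems. It assumes that each compositional expression is stored as a set of (subject, relation, object) triples and that node labels are unique within an expression, so that sub-graph containment reduces to a subset test on triples. The class name `SmallGraphIndex`, the expression identifiers, and the concept and relation labels are all hypothetical and are not real SNOMED CT codes.

```python
from collections import defaultdict

# A labeled edge: (subject, relation, object). Labels are illustrative only.
Triple = tuple[str, str, str]


class SmallGraphIndex:
    """Toy index over many small graphs, each stored as a set of triples.

    Assumption: node labels are unique within a graph, so a query graph is a
    sub-graph of a stored graph exactly when its triples are a subset.
    """

    def __init__(self) -> None:
        self.graphs: dict[str, frozenset[Triple]] = {}
        # Inverted index: triple -> ids of the graphs that contain it.
        self.by_triple: dict[Triple, set[str]] = defaultdict(set)

    def add(self, graph_id: str, triples: set[Triple]) -> None:
        self.graphs[graph_id] = frozenset(triples)
        for triple in triples:
            self.by_triple[triple].add(graph_id)

    def find_containing(self, query: set[Triple]) -> set[str]:
        """Return ids of stored graphs that contain every triple in the query."""
        if not query:
            return set(self.graphs)
        # Intersect the posting lists, starting with the shortest one.
        postings = sorted(
            (self.by_triple.get(t, set()) for t in query), key=len
        )
        result = set(postings[0])
        for posting in postings[1:]:
            result &= posting
            if not result:
                break
        return result


index = SmallGraphIndex()
index.add("note1:expr1", {("fracture", "finding_site", "femur"),
                          ("fracture", "laterality", "left")})
index.add("note2:expr1", {("fracture", "finding_site", "femur")})
index.add("note3:expr1", {("pneumonia", "finding_site", "lung")})

# Query with an entire expression, then with a sub-part of it.
whole = index.find_containing({("fracture", "finding_site", "femur"),
                               ("fracture", "laterality", "left")})
part = index.find_containing({("fracture", "finding_site", "femur")})
```

Here `whole` matches only the expression that carries both relations, while `part` also matches the expression that contains just the finding-site relation. A relational database, a triplestore, or a graph database would each express the same containment query in its own query language.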

Notes

  1. Appropriate table indexes were created to speed execution as much as possible.

  2. This is a simplified, impure version of a propositional graph, as used in the SNePS family [17] of knowledge representation and reasoning systems, and for which a formal mapping is defined between logical expressions and the graph structure [15, 16].

  3. All tests were run on a laptop with a Core i7 4600U CPU, 16 GB of RAM, and an SSD. Evaluation code was run in a VirtualBox VM running Ubuntu 14.04.

References

  1. Andrš, J.: Metadata repository benchmark: PostgreSQL vs. Neo4j (2014). http://mantatools.com/metadata-repository-benchmark-postgresql-vs-neo4j

  2. Angles, R.: A comparison of current graph database models. In: 2012 IEEE 28th International Conference on Data Engineering Workshops (ICDEW), pp. 171–177. IEEE (2012)

  3. Ciglan, M., Averbuch, A., Hluchy, L.: Benchmarking traversal operations over graph databases. In: 2012 IEEE 28th International Conference on Data Engineering Workshops (ICDEW), pp. 186–189. IEEE (2012)

  4. Dominguez-Sal, D., Urbón-Bayes, P., Giménez-Vañó, A., Gómez-Villamor, S., Martínez-Bazán, N., Larriba-Pey, J.L.: Survey of graph database performance on the HPC scalable graph analysis benchmark. In: Shen, H.T. (ed.) WAIM 2010. LNCS, vol. 6185, pp. 37–48. Springer, Heidelberg (2010)

  5. Elkin, P.L., Brown, S.H., Husser, C.S., Bauer, B.A., Wahner-Roedler, D., Rosenbloom, S.T., Speroff, T.: Evaluation of the content coverage of SNOMED CT: ability of SNOMED clinical terms to represent clinical problem lists. In: Mayo Clinic Proceedings, vol. 81, pp. 741–748. Elsevier (2006)

  6. Elkin, P.L., Froehling, D.A., Wahner-Roedler, D.L., Brown, S.H., Bailey, K.R.: Comparison of natural language processing biosurveillance methods for identifying influenza from encounter notes. Ann. Intern. Med. 156(1 Part 1), 11–18 (2012)

  7. Elkin, P.L., Trusko, B.E., Koppel, R., Speroff, T., Mohrer, D., Sakji, S., Gurewitz, I., Tuttle, M., Brown, S.H.: Secondary use of clinical data. Stud. Health Technol. Inform. 155, 14–29 (2010)

  8. Microsoft: SQL Server 2014 (2015). http://www.microsoft.com/en-us/server-cloud/products/sql-server/

  9. Murff, H.J., FitzHenry, F., Matheny, M.E., Gentry, N., Kotter, K.L., Crimin, K., Dittus, R.S., Rosen, A.K., Elkin, P.L., Brown, S.H., et al.: Automated identification of postoperative complications within an electronic medical record using natural language processing. JAMA 306(8), 848–855 (2011)

  10. Neo Technology Inc: Neo4j, the world’s leading graph database (2015). http://neo4j.com/

  11. Ontotext: Ontotext GraphDB (2015). http://ontotext.com/products/ontotext-graphdb/

  12. Oracle: Database 11g R2 (2015). http://www.oracle.com/technetwork/database/index.html

  13. Partner, J., Vukotic, A., Watt, N., Abedrabbo, T., Fox, D.: Neo4j in Action. Manning Publications Company, Greenwich (2014)

  14. Rodriguez, M.: MySQL vs. Neo4j on a large-scale graph traversal (2011). https://dzone.com/articles/mysql-vs-neo4j-large-scale

  15. Schlegel, D.R.: Concurrent Inference Graphs. Ph.D. thesis, State University of New York at Buffalo (2015)

  16. Schlegel, D.R., Shapiro, S.C.: Visually interacting with a knowledge base using frames, logic, and propositional graphs. In: Croitoru, M., Rudolph, S., Wilson, N., Howse, J., Corby, O. (eds.) GKR 2011. LNCS, vol. 7205, pp. 188–207. Springer, Heidelberg (2012)

  17. Shapiro, S.C., Rapaport, W.J.: The SNePS family. Comput. Math. Appl. 23(2–5), 243–275 (1992)

  18. The International Health Terminology Standards Development Organisation: SNOMED CT technical implementation guide (July 2014)

  19. W3C OWL Working Group: OWL 2 web ontology language document overview (2nd edn.) (2012). http://www.w3.org/TR/owl2-overview/

  20. W3C RDF Working Group: RDF 1.1 semantics (2014). http://www.w3.org/TR/rdf11-mt/

  21. W3C RDF Working Group: RDF schema 1.1 (2014). http://www.w3.org/TR/rdf-schema/

  22. Zhao, F., Tung, A.K.: Large scale cohesive subgraphs discovery for social network visual analysis. Proc. VLDB Endowment 6(2), 85–96 (2012)

Author information

Corresponding author

Correspondence to Daniel R. Schlegel.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Schlegel, D.R., Bona, J.P., Elkin, P.L. (2016). Comparing Small Graph Retrieval Performance for Ontology Concepts in Medical Texts. In: Wang, F., Luo, G., Weng, C., Khan, A., Mitra, P., Yu, C. (eds) Biomedical Data Management and Graph Online Querying. Big-O(Q) 2015, DMAH 2015. Lecture Notes in Computer Science, vol 9579. Springer, Cham. https://doi.org/10.1007/978-3-319-41576-5_3

  • DOI: https://doi.org/10.1007/978-3-319-41576-5_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-41575-8

  • Online ISBN: 978-3-319-41576-5

  • eBook Packages: Computer Science (R0)
