Journal of Intelligent Information Systems, Volume 52, Issue 1, pp 33–55

An enhancement on Clinical Data Analytics Language (CliniDAL) by integration of free text concept search

  • Leila Safari
  • Jon D. Patrick


Much important patient information can be found only in patient narratives or in free text fields of the structural schema of a Clinical Information System (CIS), so integrating free text search facilities will improve question answering on CISs. This paper describes a method for integrating a free text search facility into the proposed Clinical Data Analytics Language (CliniDAL) to improve its ability to answer more common clinical questions. The proposed language constructs in CliniDAL’s grammar enable its parser to recognize the part of a Restricted Natural Language Query (RNLQ), entered through the CliniDAL interface, that needs a free text resolution mechanism. The Natural Language Processing (NLP) approach of the CliniSearch tool then finds the correct matches for that part of the query. The search result is integrated into the translated CliniDAL query, which can be executed to return a more comprehensive answer to the initial text query. In the current work, 160 queries were tested to investigate the improvement in answering more common questions from a CIS. The tests yielded a simple taxonomy of four query categories: unanswerable queries, queries that require more evidence to be answered, queries requiring user interpretation, and queries with suitable answers. The compatibility of query results between the structural schema and patient progress notes was also examined, which showed the usability of the approach for answering queries, confirming results from different sources, and finding inconsistencies in the data stored in the CIS. The proposed solution provides a simple mechanism for extracting knowledge from CISs.
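The flow described in the abstract (isolate the free text part of the query, search the notes for it, then fold the matches into the translated structured query) can be illustrated with a minimal sketch. This is not CliniDAL's grammar or the CliniSearch algorithm: the double-quote convention for marking the free text concept, the toy notes, the table and column names, and the functions `extract_concept`, `search_notes` and `add_patient_filter` are all assumptions made for illustration, and plain substring matching stands in for the NLP step.

```python
import re

# Toy progress notes keyed by patient id (illustrative data only).
NOTES = {
    1: "Patient admitted with pneumonia, started on antibiotics.",
    2: "No evidence of pneumonia on chest x-ray.",
    3: "History of asthma. Stable overnight.",
}

# Assumed convention: the free text concept is written in double quotes.
QUOTED = re.compile(r'"([^"]+)"')


def extract_concept(query):
    """Return the quoted free text concept in the query, or None."""
    match = QUOTED.search(query)
    return match.group(1) if match else None


def search_notes(concept, notes=NOTES):
    """Naive case-insensitive substring match; returns sorted patient ids."""
    needle = concept.lower()
    return sorted(pid for pid, text in notes.items() if needle in text.lower())


def add_patient_filter(base_sql, patient_ids):
    """Restrict an already translated query to the matching patients."""
    if not patient_ids:
        return base_sql + " AND 1 = 0"  # no matches: return no rows
    id_list = ", ".join(str(pid) for pid in patient_ids)
    return f"{base_sql} AND patient_id IN ({id_list})"


query = 'patients older than 65 whose notes mention "pneumonia"'
base_sql = "SELECT patient_id FROM admission WHERE age > 65"  # assumed translation
concept = extract_concept(query)
final_sql = add_patient_filter(base_sql, search_notes(concept))
print(final_sql)
```

In this toy data the substring match also returns patient 2, whose note negates the finding. That is one reason a simple keyword match is only a stand-in here for an NLP-based concept search.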


Knowledge discovery and reuse; Question answering; Free text search; Clinical information systems



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. School of Computer Engineering, Faculty of Engineering, The University of Zanjan, Zanjan, Iran
  2. Health Language Analytics, Sydney, Australia
