Processing Text in Medical Databases

  • Morris F. Collen
Part of the Health Informatics book series (HI)


In the 1950s, the clinical data in the medical records of patients in the United States were mostly recorded as natural-language English text. Physicians wrote notes on paper sheets for a patient’s medical history and physical examination, reported their interpretations of x-ray images and electrocardiograms, and dictated descriptions of medical and surgical procedures. Health-care professionals generally recorded such data as handwritten notes or as dictated reports that were transcribed and typed on paper sheets; all of these were collated in paper-based charts, which were stored on shelves in the medical record room. Manually retrieving data from patients’ paper-based charts was always cumbersome and time-consuming. A frequent additional problem arose when a patient saw more than one physician on the same day in the same medical facility: the paper-based chart was often left in the first doctor’s office and so was not available to the other physicians, who then had to see the patient without access to any previously recorded information. Pratt (1974) observed that the data a medical professional recorded and collected during the care of a patient were largely non-numeric and, in the United States, were formulated almost exclusively in the English language.
He noted that a word, a phrase, or a sentence in this language was generally understood when spoken or read, and that the punctuation marks and the order of words in a sentence represented quasi-formal structures that could be analyzed for content according to common rules for: (a) the recognition and validation of the string of language data, a matter of morphology and syntax; (b) the recognition and registration of each datum and of its meaning, a matter of semantics; and (c) the mapping of the recognized, defined, syntactic and semantic elements into a data structure that reflected the informational content of the original language data string. He added (d) that these processes required definition and interpretation of the information by the user.
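
The steps (a) to (d) that Pratt described can be pictured with a toy sketch. This is an illustration only, not Pratt's method or any historical system: the lexicon, the category names, the `Record` structure, and the function names are all hypothetical, and real medical text needs far richer morphology, syntax, and semantics than a word lookup.

```python
import re
from dataclasses import dataclass, field

# Hypothetical user-supplied lexicon (term -> semantic category).
# It stands in for point (d): the user defines and interprets the terms.
LEXICON = {
    "chronic": "modifier",
    "acute": "modifier",
    "inflammation": "finding",
    "ulcer": "finding",
    "stomach": "body_site",
    "liver": "body_site",
}


@dataclass
class Record:
    """Hypothetical target data structure for step (c)."""
    findings: list = field(default_factory=list)
    body_sites: list = field(default_factory=list)
    modifiers: list = field(default_factory=list)
    unrecognized: list = field(default_factory=list)


def tokenize(text):
    """Step (a): recognize the word tokens in the language string."""
    return re.findall(r"[a-z]+", text.lower())


def annotate(tokens):
    """Step (b): register each token with its category, or None if unknown."""
    return [(tok, LEXICON.get(tok)) for tok in tokens]


def to_record(annotated):
    """Step (c): map the recognized elements into the data structure."""
    slots = {
        "finding": "findings",
        "body_site": "body_sites",
        "modifier": "modifiers",
    }
    rec = Record()
    for tok, category in annotated:
        # Tokens without a known category are kept, not silently dropped.
        getattr(rec, slots.get(category, "unrecognized")).append(tok)
    return rec


record = to_record(annotate(tokenize("Chronic inflammation of the stomach.")))
print(record)
```

Keeping unrecognized words in their own slot is a deliberate choice in this sketch: it shows where the user would need to extend the lexicon before the text could be fully interpreted.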


Keywords: Natural Language Processing; Textual Data; Discharge Summary; Unified Medical Language System; Structured Query Language


  1. Adams LB. Three surveillance and query languages. MD Comput. 1986;3:11–9.
  2. Addison CH, Blackwell PW, Smith WE, et al. GYPSY: General information processing system remote terminal users guide. Information science series, Monograph No. 3, Norman: University of Oklahoma; 1969.
  3. Anderson MF, Moazamipour H, Hudson DL, Cohen ME. The role of the Internet in medical decision-making. Int J Med Inform. 1997;47:43–9.
  4. Bakken S, Hyun S, Friedman C, Johnson S. A comparison of semantic categories of the ISO reference terminology models for nursing and the MedLEE natural language processing system. Proc MEDINFO. 2004:472–6.
  5. Barnett GO, Hoffman PB. Computer technology and patient care; experiences of a hospital research effort. Inquiry. 1968;5:51–7.
  6. Barnett GO, Greenes RA, Grossman JM. Computer processing of medical text information. Methods Inf Med. 1969;8:177–82.
  7. Barrows RC, Busuioc M, Friedman C. Limited parsing of notational text visit notes: Ad-hoc vs. NLP approaches. Proc AMIA. 2000:51–5.
  8. Bishop CW. A name is not enough. MD Comput. 1989;6:200–6.
  9. Blois MS. Medical records and clinical data bases: what is the difference. Proc AMIA. 1982:86–9.
  10. Blois MS. Information and medicine: the nature of medical descriptions. Berkeley: University of California Press; 1984.
  11. Blois MS, Tuttle MS, Sherertz D. RECONSIDER: a program for generating differential diagnoses. Proc SCAMC. 1981:263–8.
  12. Borlawsky TB, Li J, Shagina L, et al. Evaluation of an ontology-anchored natural language-based approach for asserting multi-scale biomolecular networks for systems medicine. Proc AMIA CRI. 2010:6–10.
  13. Broering NC, Potter J, Mistry P. Linking bibliographic and information databases: an IAIMS prototype. Proc AAMSI. 1987:169–73.
  14. Broering NC, Bagdoyan H, Hylton J, Strickler J. BioSYNTHESIS: integrating multiple databases into a virtual database. Proc SCAMC. 1989:360–4.
  15. Buck ER, Reese GR, Lindberg DAB. A general technique for computer processing of coded patient diagnoses. Mo Med. 1966;68:276–9, 285.
  16. Campbell KE, Cohn SP, Chute CG, et al. Galapagos: computer-based support for evolution of a convergent medical terminology. Symp AMIA. 1996:26–273.
  17. Campbell KE, Cohn SP, Chute CG, et al. Scalable methodologies for distributed development of logic-based convergent medical terminology. Methods Inf Med. 1998;37:426–39.
  18. Campion TR, Weinberg ST, Lorenzi NM, Waltman LR. Evaluation of computerized free-text sign-out notes. Appl Clin Inform. 2010;1:304–17.
  19. Cao H, Chiang MF, Cimino J, Friedman C, Hripcsak G. Automatic summarization of patient discharge summaries to create problem lists using medical language processing. Proc MEDINFO. 2004:1540.
  20. Chamberlin DD, Boyce RF. SEQUEL: a structured English query language. Proc ACM SIGFIDET workshop on data description, access and control. 1974:249–64.
  21. Chen ES, Hripcsak G, Friedman C. Disseminating natural language processed clinical narratives. Proc AMIA Annu Symp. 2006:126–30.
  22. Chen ES, Hripcsak G, Xu H, et al. Automated acquisition of disease-drug knowledge from biomedical and clinical documents: an initial study. J Am Med Inform Assoc. 2008;15:87–98.
  23. Childs LC, Enelow R, Simonsen L, et al. Description of a rule-based system for the i2b2 challenge in natural language processing for clinical data. J Am Med Inform Assoc. 2009;16:571–5.
  24. Chueh H, Murphy S. The i2b2 (informatics for integrating biology and the bedside) hive and the clinical research chart. 2006:1–58.
  25. Chute CG. The Copernican era of healthcare terminology: a re-centering of health information systems. Proc AMIA. 1998:68–73.
  26. Chute CG. The journey of meaningful use. In Interoperability Reviews, AMIA The Standards Standard 2010;1:3–4.
  27. Chute CG, Crowson DL, Buntrock JD. Medical information retrieval and WWW browsers at Mayo. Proc AMIA. 1995:903–7.
  28. Chute CG, Elkin PL, Sherertz DD, Tuttle MS. Desiderata for a clinical terminology server. Proc AMIA. 1999:42–6.
  29. Cimino JJ. Linking patient information systems to bibliographic resources. Methods Inf Med. 1996;35:122–6.
  30. Cimino JJ. Desiderata for controlled medical vocabularies in the twenty-first century. Methods Inf Med. 1998;37:394–403.
  31. Cimino JJ. From data to knowledge through concept-oriented terminologies. J Am Med Inform Assoc. 2000;7:288–97.
  32. Cimino JJ, Barnett GO. Automated translation between medical terminologies using semantic definitions. MD Comput. 1990;7:104–9.
  33. Cimino JJ, Aguirre A, Johnson SB, Peng P. Generic queries for meeting clinical information needs. Bull Med Libr Assoc. 1993;81:195–205.
  34. Cimino JJ, Clayton PD, Hripcsak G, Johnson SB. Knowledge-based approaches to the maintenance of a large controlled medical terminology. J Am Med Inform Assoc. 1994;1:35–50.
  35. Cimino JJ, Socratous SA, Grewal R. The informatics superhighway: prototyping on the World Wide Web. Proc SCAMC. 1995:111–5.
  36. Codd EF. A relational model of data for large shared data banks. Commun ACM. 1970;13:377–87.
  37. Codd EF, Codd SB, Salley CT. Providing OLAP (On-line analytical processing) to user-analysts: an IT Mandate. San Jose: Codd & Date, Inc.; 1993.
  38. Connolly TM, Begg CE. Database management systems: a practical approach to design, implementation, and management. 2nd ed. New York: Addison-Wesley; 1999.
  39. Cote RA. The SNOP-SNOMED concept: evolution towards common medical nomenclature and classification. Pathologist. 1977;31:383–9.
  40. Cote RA. Architecture of SNOMED, its contribution to medical language processing. Proc SCAMC. 1986:74–84.
  41. Cousins SB, Silverstein JC, Frisse ME. Query networks for medical information retrieval – assigning probabilistic relationships. Proc SCAMC. 1990:800–4.
  42. Das AK, Musen MA. A comparison of the temporal expressiveness of three database query methods. Proc AMIA. 1995:331–7.
  43. Demuth AI. Automated ICD-9-CM coding: an inevitable trend to expert systems. Health Care Commun. 1985;2:62–5.
  44. Denny JC, Ritchie MD, Basford MA, et al. PheWAS: Demonstrating the feasibility of a phenome-wide scan to discover gene-disease associations. Bioinformatics. 2010;26:1205–10.
  45. Dolin RH, Spackman K, Abilla A, et al. The SNOMED RT procedure model. Proc AMIA. 2001:139–43.
  46. Dolin RH, Mattison JE, Cohn S, et al. Kaiser Permanente’s convergent medical terminology. Proc MEDINFO. 2004:346–50.
  47. Doszkocs TE. CITE NLM: natural-language searching in an online catalog. Inf Technol Libr. 1983;2:364–80.
  48. Dozier JA, Hammond WE, Stead WW. Creating a link between medical and analytical databases. Proc SCAMC. 1985:478–82.
  49. Eden M. Storage and retrieval of the results of clinical research. Proc IRE Trans Med Electronics (ME-7). 1960:265–8.
  50. Enlander D. Computer data processing of medical diagnoses in pathology. Am J Clin Pathol. 1975;63:538–44.
  51. Farrington JF. CPT-4: a computerized system of terminology and coding. In: Emlet HE, editor. Challenges and prospects for advanced medical systems. Miami: Symposia Specialists; 1978. p. 147–50.
  52. Feinstein AR. Unsolved scientific problems in the nosology of clinical medicine. Arch Int Med. 1988;148:2269–74.
  53. Forman BH, Cimino JJ, Johnson SB, et al. Applying a controlled terminology to a distributed, production clinical information system. Proc AMIA. 1995:421–5.
  54. Friedman C. Towards a comprehensive medical language processing system: methods and issues. Proc AMIA. 1997:595–9.
  55. Friedman C. A broad-coverage natural language processing system. Proc AMIA. 2000:270–4.
  56. Friedman C, Hripcsak G. Evaluating natural language processors in the clinical domain. Methods Inf Med. 1998;37:334–44.
  57. Friedman C, Hripcsak G. Natural language processing and its future in medicine: can computers make sense out of natural language text. Acad Med. 1999;74:890–5.
  58. Friedman C, Johnson SB. Medical text processing: past achievements, future directions. Chap 13. In: Ball MJ, Collen MF, editors. Aspects of the computer-based patient record. New York: Springer; 1992. p. 212–28.
  59. Friedman C, Alderson PO, Austin JHM, et al. A general natural-language text processor for clinical radiology. J Am Med Inform Assoc. 1994;1:161–74.
  60. Friedman C, Hripcsak G, DuMouchel W, et al. Natural language processing in an operational clinical information system. Nat Lang Eng. 1995a;1:83–108.
  61. Friedman C, Johnson SB, Forman B, Starren J. Architectural requirements for a multipurpose natural language processor in the clinical environment. Proc AMIA. 1995b:347–51.
  62. Friedman C, Shagina L, Socratous S, Zeng X. A WEB-based version of MedLEE: a medical language extraction and encoding system. Proc AMIA. 1996:938.
  63. Friedman C, Hripcsak G, Shablinsky I. An evaluation of natural language processing methodologies. Proc AMIA. 1998b:855–9.
  64. Friedman C, Shagina L, Lussier Y, Hripcsak G. Automated encoding of clinical documents based on natural language processing. J Am Med Inform Assoc. 2004;11:392–402.
  65. Frisse ME. Digital libraries and information retrieval. Proc AMIA. 1996:320–2.
  66. Frisse ME, Cousins SB. Query by browsing: an alternative hypertext information retrieval method. Proc SCAMC. 1989:388–91.
  67. Fusaro VA, Kos PJ, Tector M, et al. Electronic medical record analysis using cloud computing. Proc AMIA CRI. 2010:90.
  68. Gabrieli ER. The medicine-compatible computer: a challenge for medical informatics. Methods Inf Med. 1984;9:233–50.
  69. Gabrieli ER. Computerizing text from office records. MD Comput. 1987;4:444–9.
  70. Gainer V, Goryachev S, Zeng Q, et al. Using derived concepts from electronic medical records for discovery research in informatics for integrating biology and the bedside (i2b2). Proc AMIA TBI. 2010:91.
  71. Gantner GE. SNOMED: the Systematized Nomenclature of Medicine as an ideal standard language for computer applications in medical care. Proc SCAMC. 1980:1224–6.
  72. Goldstein L. MEDUS/A: a high-level database management system. Proc SCAMC. 1980:1653–60.
  73. Gordon BL. Standard medical terminology. JAMA. 1965;191:311–3.
  74. Gordon BL. Biomedical language and format for manual and computer applications. Dis Chest. 1968;53:38–42.
  75. Gordon BL. Terminology and content of the medical record. Comput Biomed Res. 1970;3:436–44.
  76. Gordon BL. Linguistics for medical records. In: Driggs MF, editor. Problem-directed and medical information systems. New York: Intercontinental Medical Book Co; 1973. p. 5–13.
  77. Graepel PH. Manual and automatic indexing of the medical record: categorized nomenclature (SNOP) versus classification (ICD). Med Inform. 1976;1:77–86.
  78. Graepel PH, Henson DE, Pratt AW. Comments on the use of Systematized Nomenclature of Pathology. Methods Inf Med. 1975;14:72–5.
  79. Grams RR, Jin ZM. The natural language processing of medical databases. J Med Syst. 1989;2:79–87.
  80. Hammond WE, Straube MJ, Blunden PB, Stead WW. Query: the language of databases. Proc SCAMC. 1989:419–23.
  81. Haug PJ, Warner HR. Decision-driven acquisition of qualitative data. Proc SCAMC. 1984:189–92.
  82. Haug PJ, Gardner RM, Tate KE, et al. Decision support in medicine: examples from the HELP System. Comput Biomed Res. 1994;27:396–418.
  83. Hendrix GG, Sacerdoti ED. Natural language processing; the field in perspective. Byte. 1981;6:304–52.
  84. Henkind SJ, Benis AM, Teichholz LE. Quantification as a means to increase the utility of nomenclature-classification systems. Proc MEDINFO. 1986:858–61.
  85. Hersh WR. Informatics retrieval at the millennium. Proc AMIA. 1998:38–45.
  86. Hersh WR, Donohoe LC. SAPHIRE International: a tool for cross-language information retrieval. Proc AMIA. 1998:673–7.
  87. Hersh WR, Greenes RA. SAPHIRE – An information retrieval system featuring concept matching, automatic indexing, probabilistic retrieval, and hierarchical relationships. Comput Biomed Res. 1990;23:410–25.
  88. Hersh WR, Hickam D. Information retrieval in medicine: the SAPHIRE experience. Proc MEDINFO. 1995:1433–7.
  89. Hersh WR, Leone TJ. The SAPHIRE server: a new algorithm and implementation. Proc AMIA. 1995:858–63.
  90. Hersh WR, Pattison-Gordon E, Evans DA. Adaptation of Meta-1 for SAPHIRE, A general purpose information retrieval program. Proc SCAMC. 1990b:156–60.
  91. Hersh WR, Campbell EH, Evans DA, Brownlow ND. Empirical, automated vocabulary discovery using large text corpora and advanced natural language processing tools. Proc AMIA. 1996a:159–63.
  92. Hersh WR, Brown KE, Donohoe LC, et al. CliniWeb: managing clinical information on the World Wide Web. JAMIA. 1996b;3(4):273–80.
  93. Himes BE, Kohane IS, Ramoni MF, Weiss ST. Characterization of patients who suffer asthma using data extracted from electronic medical records. Proc AMIA Ann Symp. 2008:308–12.
  94. Hogan WR, Wagner MM. Free-text fields change the meaning of coded data. Proc AMIA. 1996:517–21.
  95. Hogarth MA, Gertz M, Gorin FA. Terminology query language: a server interface for concept-oriented terminology systems. Proc AMIA. 2000:349–53.
  96. Hripcsak G, Friedman C, Alderson PO, et al. Unlocking clinical data from narrative reports: a study of natural language processing. Ann Intern Med. 1995;122:681–8.
  97. Hripcsak G, Allen B, Cimino JJ, Lee R. Access to data: comparing AccessMed with Query by Review. J Am Med Inform Assoc. 1996;3:288–99.
  98. Humphreys BL. De facto, de rigueur, and even useful: standards for the published literature and their relationship to medical informatics. Proc SCAMC. 1990:2–8.
  99. Humphreys BL, Lindberg DAB. Building the unified medical language. Proc SCAMC. 1989:475–80.
  100. Jacobs H. A natural language information retrieval system. Proc 8th IBM Med Symp; Poughkeepsie; 1967:47–56.
  101. Jacobs H. A natural language information retrieval system. Methods Inf Med. 1968;7:8–16.
  102. Johnson SB. Conceptual graph grammar – a simple formalism for sublanguage. Methods Inf Med. 1998;37:345–52.
  103. Johnson SB, Friedman C. Integrating data from natural language processing into a clinical information system. Proc AMIA. 1996:537–41.
  104. Johnson SB, Aguirre A, Peng P, Cimino J. Interpreting natural language queries using the UMLS. Proc AMIA. 1994:294–8.
  105. Johnson KB, Rosenbloom ST, et al. Computer-based documentation: past, present, and future, Chap 14. In: Lehman HP, Abbott PA, Roderer NK, editors. Aspects of electronic health record systems. 2nd ed. 2006. p. 309–28.
  106. Johnston HB, Higgins SB, Harris TR, Lacy WW. The effect of a CLINFO management and analysis system on clinical research. Proc MEDCOMP. IEEE, 1982a:517–8.
  107. Johnston HB, Higgins SB, Harris TR, Lacy WW. Five years experience with the CLINFO data base management and analysis system. Proc SCAMC. 1982b:833–6.
  108. Karpinski RHS, Bleich HL. MISAR: a miniature information storage and retrieval system. Comput Biomed Res. 1971;4:655–71.
  109. Katz B. Clinical research system. MD Comput. 1986;3:53–5, 61.
  110. Kementsietsidis A, Lipyeow L, Wang M. Profile-based retrieval of records in medical databases. Proc AMIA Annu Symp. 2009:312–6.
  111. Kent A. Computers and biomedical information storage and retrieval. JAMA. 1966;196:927–32.
  112. King C, Strong RM, Dovovan K. MEDUS/A: 1983 status of a database system for research and patient care. Proc SCAMC. 1983a:709–11.
  113. King C, Strong RM, Goldstein L. MEDUS/A: Distributing database management for research and patient data. Proc SCAMC. 1988:818–26.
  114. Kingsland LC. RDBS: Research data base system for microcomputers; coding techniques and file structures. Proc AAMSI Conf. 1982:85–9.
  115. Korein J. The computerized medical record. The variable-field-length format system and its applications. Proc IFIPS TCH Conf. 1970:259–91.
  116. Korein J, Tick L, Woodbury MA, et al. Computer processing of medical data by variable-field-length format. JAMA. 1963;186:132–8.
  117. Korein J, Goodgold AJ, Randt CT. Computer processing of medical data by variable-field-length format. II: progress and application to narrative documents. JAMA. 1966;196:950–6.
  118. Lacson R, Long W. Natural language processing of spoken diet records. Proc AMIA Annu Symp Proc. 2006:454–8.
  119. Lamson BG, Glinsky BC, Hawthorne GS, et al. Storage and retrieval of uncoded tissue pathology diagnoses in the original English free-text. Proc 7th IBM Med Symp; Poughkeepsie; 1965:411–26.
  120. Layard MW, McShane DJ. Applications of MEDLOG, A microprocessor-based system for time-oriented clinical data. Proc SCAMC. 1983:731–4.
  121. Levy AH, Lawrance DP. Information retrieval, Chap 7. In: Ball MJ, Collen MF, editors. Aspects of the computer-based patient record. New York: Springer; 1992. p. 146–52.
  122. Levy C, Rogers E. Clinician oriented access to data – C.O.A.D. A natural language interface to a VA DHCP database. Proc AMIA. 1995:933.
  123. Lincoln TL, Groner GF, Quinn JJ, Lukes RJ. The analysis of functional studies in acute lymphatic leukemia using CLINFO – A small computer information and analysis system for clinical investigators. Med Inform. 1976;1:95–103.
  124. Lindberg DAB. The computer and medical care. Springfield: Charles C. Thomas; 1968.
  125. Lindberg DAB, Rowland LR, Bush WF, et al. CONSIDER: a computer program for medical instruction. 9th IBM Med Symp. 1968:59–61.
  126. Logan JR, Britell S, Delcambre LM, et al. Representing multi-database study schemas for reusability. Proc STB. 2010:21–5.
  127. Lupovitch A, Memminger JJ, Corr RM. Manual and computerized cumulative reporting systems for the clinical microbiology laboratory. Am J Clin Pathol. 1979;72:841–7.
  128. Lussier YA, Rothwell DJ, Cote RA. The SNOMED model: a knowledge source for the controlled terminology of the computerized patient record. Methods Inf Med. 1998;37:161–4.
  129. Lussier Y, Borlawsky T, Rappaport D, et al. PHENOGO: assigning phenotypic context to gene ontology annotations with natural language processing. Pac Symp Biocomput. 2006;11:64–75.
  130. Lyman M, Sager N, Friedman C, Chi E. Computer-structured narrative in ambulatory care: its use in longitudinal review of clinical data. Proc SCAMC. 1985:82–6.
  131. Mabry JC, Thompson HK, Hopwood MD, Baker WR. A prototype data management and analysis system (CLINFO): system description and user experience. Proc MEDINFO. 1977:71–5.
  132. Mays E, Weida R, Dionne R, et al. Scalable and expressive medical terminologies. Proc AMIA. 1996:259–63.
  133. McCormick BH, Chang SK, Boroved RT, et al. Technological trends in clinical information systems. Proc MEDINFO. 1977:43–8.
  134. McCormick PJ, Elhadad N, Stetson PD, et al. Use of semantic features to classify patient smoking status. Proc AMIA. 2008:450–4.
  135. McCray AT. The nature of lexical knowledge. Methods Inf Med. 1998;37:353–60.
  136. McCray AT, Sponsler JL, Brylawski B, Browne AC. The role of lexical knowledge in biomedical text understanding. Proc SCAMC. 1987:103–7.
  137. McCray AT, Bodenreider O, Malley JD, Browne AC. Evaluating UMLS strings for natural language processing. Proc AMIA. 2001:448–52.
  138. McDonald CJ. Protocol-based computer reminders, the quality of care and the non-perfectibility of man. N Engl J Med. 1976;295:1351–5.
  139. McDonald CJ, Blevens L, Glazener T, et al. Data base management, feedback control and the Regenstrief medical record. Proc SCAMC. 1982:52–60.
  140. Melski JW, Geer DE, Bleich HL. Medical information storage and retrieval using preprocessed variables. Comput Biomed Res. 1978;11:613–21.
  141. Mendonca EA, Cimino JJ, Johnson SB, Seol YH. Accessing heterogeneous sources of evidence to answer clinical questions. J Biomed Inform. 2001;34:85–98.
  142. Meystre S, Haug PJ. Medical problem and document model for natural language understanding. Proc AMIA Ann Symp. 2003:455–9.
  143. Meystre SM, Haug PJ. Comparing natural language processing tools to extract medical problems from narrative text. Proc AMIA Annu Symp. 2005:525–9.
  144. Meystre SM, Deshmukh VG, Mitchell J. A clinical use case to evaluate the i2b2 Hive: predicting asthma exacerbations. Proc AMIA Annu Symp. 2009:442–6.
  145. Miller PB, Strong RM. Clinical care and research using MEDUS/A, a medically oriented data base management system. Proc SCAMC. 1978:288–97.
  146. Miller RA, Kapoor WN, Peterson J. The use of relational databases as a tool for conducting clinical studies. Proc SCAMC. 1983:705–8.
  147. Mirel BR, Wright DZ, Tenenbaum JD, et al. User requirements for exploring a resource inventory for clinical research. Proc AMIA CRI. 2010:31–5.
  148. Morgan MM, Beaman PD, Shusman DL, et al. Medical query language. Proc SCAMC. 1981:322–5.
  149. Mullins HC, Scanland PM, Collins D, et al. The efficacy of SNOMED, Read Codes, and UMLS in coding ambulatory family practice clinical records. Proc AMIA. 1996:135–9.
  150. Munoz F, Hersh W. MCM Generator: a Java-based tool for generating medical metadata. Proc AMIA. 1998:648–52.
  151. Murphy SN, Morgan MM, Barnett GO, Chueh HC. Optimizing healthcare research data warehouse design through a past COSTAR query analysis. Proc AMIA. 1999:892–6.
  152. Murphy SN, Mendis M, Hackett K, et al. Architecture of the open-source clinical research chart from informatics for integrating biology and the bedside. Proc AMIA. 2007:548–52.
  153. Myers J, Gelblat M, Enterline HT. Automatic encoding of pathology data. Arch Pathol. 1970;89:73–8.
  154. Nelson S, Hoffman S, Karnekal H, Varma A. Making the most of RECONSIDER; an evaluation of input strategies. Proc SCAMC. 1983:852–5.
  155. Nielson J, Wilcox A. Linking structured text to medical knowledge. Proc MEDINFO. 2004:1777.
  156. Nigrin DJ, Kohane IS. Scaling a data retrieval and mining application to the enterprise-wide level. Proc AMIA. 1999:901–5.
  157. NIH-DRR: General Clinical Research Centers, A Research Resources Directory, seventh revised edition. Bethesda: Division of Res Resources, NIH; 1988.
  158. Niland JC, Rouse L, et al. Clinical research needs, Chap 3. In: Lehman HP, Abbott PA, Roderer NK, editors. Aspects of electronic health record systems. New York: Springer; 2006. p. 31–46.
  159. Nunnery AW. A medical information storage and statistical system (MICRO-MISSY). Proc SCAMC. 1984:383–5.
  160. O’Connor MJ, Samson W, Musen MA. Representation of temporal indeterminacy in clinical databases. Proc AMIA Symp. 2000:615–9.
  161. Obermeier KK. Natural-language processing, an introductory look at some of the technology used in this area of artificial intelligence. BYTE. 1987;12:225–32.
  162. Okubo RS, Russell WS, Dimsdale B, Lamson BG. Natural language storage and retrieval of medical diagnostic information. Comput Programs Biomed. 1975;75:105–30.
  163. Oliver DE, Barnes MR, Barnett GO, et al. InterMed: an Internet-based medical collaboratory. Proc AMIA. 1995:1023.
  164. Oliver DE, Shortliffe EH, et al. Collaborative model development for vocabulary and guidelines. Proc AMIA. 1996:826.
  165. Olson NE, Sherertz, Erlbaum MS, et al. Explaining your terminology to a computer. Proc AMIA. 1995:957.
  166. Ozbolt JG, Russo M, Stultz MP. Validity and reliability of standard terms and codes for patient care data. Proc AMIA. 1995:37–41.
  167. Pendse N. Online analytical processing. Wikipedia. Retrieved in 2008. http://en.wikipedia.org/wiki/Online_analytical_processing.
  168. Porter D, Safran C. On-line searches of a hospital data base for clinical research and patient care. Proc SCAMC. 1984:277–9.
  169. Powsner SM, Barwick KW, Morrow JS, et al. Coding semantic relationships for medical bibliographic retrieval: a preliminary study. Proc SCAMC. 1987:108–12.
  170. Prather JC, Lobach DF, Hales JW, et al. Converting a legacy system database into relational format to enhance query efficiency. Proc SCAMC. 1995:372–6.
  171. Pratt AW. Automatic processing of pathology data. Journees D’Informatique Medicale. 1971:595–609.
  172. Pratt AW. Medicine, computers, and linguistics. In: Brown JHU, Dickson JF, editors. Biomedical engineering. New York: Academic; 1973. p. 97–140.
  173. Pratt AW. Medicine and linguistics. MEDINFO. 1974:5–11.
  174. Pratt AW. Representation of medical language data utilizing the Systematized Nomenclature of Pathology. In: Enlander D, editor. Computers in laboratory medicine. New York: Academic; 1975. p. 42–53.
  175. Pratt AW, Pacak M. Identification and transformation of terminal morphemes in medical English. Methods Inf Med. 1969;8:84–90.
  176. Pratt AW, Pacak M. Automatic processing of medical English. Preprint No. 11, Classification: IR 3.4. Reprinted by USHEW, NIH. 1969b.
  177. Price SL, Hersh WR, Olson DD, et al. SmartQuery: context-sensitive links to medical knowledge sources from the electronic patient record. Proc AMIA. 2002:627–31.Google Scholar
  178. Pryor DB, Stead WW, Hammond WE, et al. Features of TMR for a successful clinical and research database. Proc SCAMC. 1982:79–83.Google Scholar
  179. Ranum DL. Knowledge based understanding of radiology text. Proc SCAMC. 1988:141–5.Google Scholar
  180. Robinson RE. Acquisition and analysis of narrative medical record data. In Collen MF, editor. Proceedings of the Conference on Med Inform Systems. Rockville: NCHSR&D; 1970. p. 111–27.Google Scholar
  181. Robinson RE. Pathology subsystem. In: Collen MF, editor. Hospital computer systems. New York: Wiley; 1974. p. 194–205.Google Scholar
  182. Robinson RE. Surgical pathology information processing system. In: Coulson WF, editor. Surgical pathology. Philadelphia: JB Lippincott; 1978. p. 1–20.Google Scholar
  183. Roper WL. From the Health Care Financing Administration. JAMA. 1989;261:1550.PubMedCrossRefGoogle Scholar
  184. Roper WL, Winkenwerder W, Hackbarth GM, Krakaur H. Effectiveness in health care; an initiative to evaluate and improve medical practice. N Engl J Med. 1988;319:1197–202.PubMedCrossRefGoogle Scholar
  185. Rothwell DJ, Cote RA. Optimizing the structure of a standardized vocabulary. Proc SCAMC. 1990:181–4.Google Scholar
  186. Rothwell DJ, Cote RA. Managing information with SNOMED: Understanding the model. Proc AMIA. 1996:80–3.
  187. Safran C, Porter D. New uses of a large clinical data base, Chap 7. In: Orthner HF, Blum BI, editors. Implementing health care systems. New York: Springer; 1989. p. 123–32.
  188. Safran C, Rury C, Lightfoot J, Porter D. CLINQUERY: a program that allows physicians to search a large clinical database. Proc MEDINFO. 1989a:966–70.
  189. Safran C, Porter D, Lightfoot J, et al. ClinQuery: a system for online searching of data in a teaching hospital. Ann Int Med. 1989b;111:751–6.
  190. Sager N, Hirschman L. Computerized language processing for multiple use of narrative discharge summaries. Proc SCAMC. 1978:330–43.
  191. Sager N, Kosaka M. A database of literature organized by relations. Proc SCAMC. 1983:692–5.
  192. Sager N, Tick L, Story G, Hirschman L. A codasyl-type schema for natural language medical records. Proc SCAMC. 1980:1027–33.
  193. Sager N, Bross IDJ, Story G, et al. Automatic encoding of clinical narrative. Comput Biol Med. 1982a;12:43–55.
  194. Sager N, Chi EC, Tick LJ, Lyman M. Relational database design for computer-analyzed medical narrative. Proc SCAMC. 1982b:797–804.
  195. Sager N, Friedman C, Lyman MS, et al. The analysis and process of clinical narrative. Proc MEDINFO. 1986:1101–5.
  196. Sager N, Lyman M, Bucknall C, et al. Natural language processing and the representation of clinical data. JAMIA. 1994;1:142–60.
  197. Sager N, Nhan NT, Lyman M, Tick LJ. Medical language processing with SGML display. Proc AMIA. 1996:547–51.
  198. Schoch NA, Sewell W. The many faces of natural language searching. Proc AMIA. 1995:914.
  199. Seol YH, Johnson HB, Cimino JJ. Conceptual guidelines in information retrieval. Proc AMIA. 2001:1026.
  200. Shapiro AR. Exploratory analysis of the medical record. Proc SCAMC. 1982:781–5.
  201. Shusman DJ, Morgan MM, Zielstorff R, Barnett GO. The medical query language. Proc SCAMC. 1983:742–5.
  202. Sim I, Carini S, Tu S, Wynden R, et al. The human studies database project: Federating human studies design data using the ontology of clinical research. Proc AMIA CRI. 2010:51–5.
  203. Smith JW, Svirbely JR. Laboratory information systems. MD Comput. 1988;5:38–47.
  204. Spackman KA. Rates of change in a large clinical terminology: three years experience with SNOMED clinical terms. Proc AMIA. 2005:714–8.
  205. Spackman KA, Hersh WR. Recognizing noun phrases in medical discharge summaries: an evaluation of two natural language parsers. Proc AMIA. 1996:155–8.
  206. Stearns MQ, Price C, Spackman KA, Wang AY. SNOMED clinical terms: overview of the development process and project status. Proc AMIA. 2001:662–6.
  207. Story G, Hirschman L. Data base design for natural language medical data. J Med Syst. 1982;6:77–88.
  208. Tatch D. Automatic encoding of medical diagnoses. Proc 6th IBM Med Symp. Poughkeepsie;1964:1–7.
  209. Thompson HK, Baker WR, Christopher TG, et al. CLINFO, a research data management and analysis system acceptable to physician users. Proc SCAMC. 1977:140–2.
  210. Tuttle MS, Campbell KE, Olson NE, et al. Concept, code, term and word: preserving the distinctions. Proc AMIA. 1995:956.
  211. Wang X, Chused A, Elhadad N, et al. Automated knowledge acquisition from clinical narrative reports. Proc AMIA Symp. 2008:783–7.
  212. Wang L, Wang G, Shi X, et al. User experience evaluation of Google search for obtaining medical knowledge: a case study. Proc AMIA STB. 2010:120.
  213. Ward RE, MacWilliam CH, Ye E, et al. Development and multi-institutional implementation of coding and transmission standards for health outcomes data. Proc AMIA. 1996:438–42.
  214. Ware H, Mullett CJ, Jagannathan V. Natural language processing framework to assess clinical conditions. J Am Med Inform Assoc. 2009;16:585–9.
  215. Warner HR, Guo D, Mason C, et al. Enroute toward a computer based patient record: the ACIS project. Proc AMIA. 1995:152–6.
  216. Webster S, Morgan M, Barnett GO. Medical Query Language: improved access to MUMPS databases. Proc SCAMC. 1987:306–9.
  217. Wells AH. The conversion of SNOP to the computer languages of medicine. Pathologists. 1971;25:371–8.
  218. Weyl S, Fries J, Wiederhold G, Germano F. A modular self-describing clinical databank system. Comput Biomed Res. 1975;8:279–93.
  219. Whitehead SF, Streeter M. CLINFO – a successful technology transfer. Proc SCAMC. 1984:557–60.
  220. Whiting-O’Keefe Q, Strong PC, Simborg DW. An automated system for coding data from summary time oriented record (STOR). Proc SCAMC. 1983:735–7.
  221. Williams GZ, Williams RL. Clinical laboratory subsystem. In: Collen MF, editor. Hospital computer systems. New York: Wiley; 1974. p. 148–93.
  222. Wynden R. Providing a high security environment for the Integrated Data Repository lead institution. Proc AMIA STB. 2010:123.
  223. Wynden R, Weiner MG, Sim I, et al. Ontology mapping and data discovery for the translational investigator. Proc AMIA STB. 2010:66–70.
  224. Xu H, Friedman C. Facilitating research in pathology using natural language processing. Proc AMIA. 2003:1057.
  225. Yang H, Spasic I, Keane JA, Nenadic G. A text mining approach to the prediction of disease status from clinical discharge summaries. J Am Med Inform Assoc. 2009;16:596–600.
  226. Yianilos PN, Harbort RA, Buss SR, Tuttle EP. The application of a pattern matching algorithm to searching medical record text. Proc SCAMC. 1978:308–13.
  227. Zacks MP, Hersh WR. Developing search strategies for detecting high quality reviews in a hypertext test collection. Proc AMIA. 1998:663–7.
  228. Zeng Q, Cimino JJ. Mapping medical vocabularies to the Unified Medical Language System. Proc AMIA. 1996:105–9.
  229. Zeng Q, Cimino JJ. Evaluation of a system to identify relevant patient information and its impact on clinical information retrieval. Proc AMIA. 1999:642–6.
  230. Zhang G, Siegler T, Saxman P, et al. VISAGE: A query interface for clinical research. Proc AMIA CRI. 2010:76–80.
  231. Zhou L, Tao Y, Cimino JJ, et al. Terminology model discovery using natural language processing and visualization techniques. J Biomed Inform. 2006;39:626–36.

Copyright information

© Springer-Verlag London Limited 2012

Authors and Affiliations

  • Morris F. Collen (1)
  1. Division of Research, Oakland, USA
