Predicting Adverse Drug Events by Analyzing Electronic Patient Records

  • Isak Karlsson
  • Jing Zhao
  • Lars Asker
  • Henrik Boström
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7885)


Diagnosis codes for adverse drug events (ADEs) are sometimes missing from electronic patient records (EPRs). This may affect not only patient safety, in the worst case, but also the number of reported ADEs, resulting in incorrect risk estimates for prescribed drugs. Large EPR databases are potentially valuable sources of information to support the identification of ADEs. This study investigates the use of machine learning for predicting one specific ADE based on information extracted from EPRs, including age, gender, diagnoses and drugs. Several predictive models are developed and evaluated using different learning algorithms and feature sets. The highest observed AUC is 0.87, obtained by the random forest algorithm. The resulting model can be used for screening EPRs that have not been assigned a diagnosis code for the ADE under consideration but possibly should have been. Preliminary results from using the model are presented.
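For readers unfamiliar with the AUC figure quoted above, the following is a minimal sketch of how the area under the ROC curve can be computed from a model's scores, using its equivalence to the probability that a randomly chosen positive record is scored higher than a randomly chosen negative one. It is not the study's code: the function name `auc` and the toy labels and scores are hypothetical and chosen only for illustration.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    labels: 1 for records with the ADE code, 0 for records without it.
    scores: model outputs, higher meaning "more likely to be an ADE".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the paper.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.3, 0.6, 0.7, 0.2, 0.8]
toy_auc = auc(labels, scores)
print(round(toy_auc, 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which is why a value such as 0.87 indicates that the model ranks coded ADE records above uncoded ones most of the time. The quadratic pairwise loop is kept for clarity; a rank-based version would be preferable for large record sets.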


Keywords: machine learning, electronic patient records, adverse drug events





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Isak Karlsson (1)
  • Jing Zhao (1)
  • Lars Asker (1)
  • Henrik Boström (1)
  1. Dept. of Computer and Systems Sciences, Stockholm University, Kista, Sweden
