Data Mining in Pharmacovigilance

The Need for a Balanced Perspective

  • Current Opinion
  • Drug Safety

Abstract

Data mining is receiving considerable attention as a tool for pharmacovigilance and is generating many perspectives on its uses. This paper presents four concepts that have appeared in various professional venues and that are potential sources of misunderstanding or warrant extended discussion: (i) data mining algorithms are unvalidated; (ii) data mining algorithms allow data miners to objectively screen spontaneous report data; (iii) mathematically more complex Bayesian algorithms are superior to frequentist algorithms; and (iv) data mining algorithms are not just for hypothesis generation. Key points for a balanced perspective are that: (i) validation exercises have been done but lack a gold standard for comparison and are complicated by numerous nuances and pitfalls in the deployment of data mining algorithms. Their performance is likely to be highly situation-dependent; (ii) the subjective nature of data mining is often underappreciated; (iii) simpler data mining models can be supplemented with ‘clinical shrinkage’, preserving sensitivity; and (iv) applications of data mining beyond hypothesis generation are risky, given the limitations of the data. These extended applications tend to ‘creep’, not pounce, into the public domain, leading to potential overconfidence in their results. Most importantly, in the enthusiasm generated by the promise of data mining tools, users must keep in mind the limitations of the data and the importance of clinical judgment and context, regardless of statistical arithmetic. In conclusion, we agree that contemporary data mining algorithms are promising additions to the pharmacovigilance toolkit, but the level of verification required should be commensurate with the nature and extent of the claimed applications.

Table I

Acknowledgements

No sources of funding were used to assist in the preparation of this review. The authors have no conflicts of interest that are directly relevant to the content of this review.

Author information

Corresponding author

Correspondence to Lester Reich.

About this article

Cite this article

Hauben, M., Patadia, V., Gerrits, C. et al. Data Mining in Pharmacovigilance. Drug Safety 28, 835–842 (2005). https://doi.org/10.2165/00002018-200528100-00001

  • DOI: https://doi.org/10.2165/00002018-200528100-00001