A Literature Review of Social Media-Based Data Mining for Health Outcomes Research

Abstract

Patient-generated health outcomes data are health outcomes created, recorded, gathered, or inferred by or from patients or their caregivers to address a health concern. A critical mass of such data has accumulated on social media websites, offering a potential new data source for health outcomes research alongside electronic medical records (EMR), claims databases, the FDA Adverse Event Reporting System (FAERS), and survey data. Using the PubMed search engine, we systematically reviewed emerging research on mining patient-generated health outcomes from social media data to understand how these data and state-of-the-art text analysis techniques are used, as well as the related opportunities and challenges. We identified 19 full-text articles published since 2011 as typical examples on this topic, indicating its novelty. The most frequently analyzed health outcome was medication side effects (in 15 studies), and the most common methods for preprocessing unstructured social media data were named entity recognition, normalization, and text mining-based feature construction. For analysis, researchers adopted content analysis, hypothesis testing, and machine learning models. Compared with EMR, claims, FAERS, and survey data, social media data comprise a large volume of information contributed voluntarily by patients and are not limited to one geographic location. Despite possible limitations, patient-generated health outcomes data from social media might promote further research on treatment effectiveness, adverse drug events, perceived value of treatment, and health-related quality of life. The remaining challenge lies in further improving and customizing text mining methods.
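As a rough illustration of the three preprocessing steps named in the abstract, the sketch below runs dictionary-based entity recognition, normalization to a preferred term, and bag-of-words feature construction on a single made-up post. It is not taken from any of the reviewed studies: the lexicon, the function names, and the example post are all hypothetical, and real systems typically rely on curated vocabularies and trained models rather than a hand-written dictionary.

```python
import re
from collections import Counter

# Hypothetical toy lexicon: lay phrase -> normalized side-effect term.
# A real pipeline would draw on a curated vocabulary instead.
LEXICON = {
    "threw up": "vomiting",
    "nauseous": "nausea",
    "can't sleep": "insomnia",
    "headache": "headache",
}


def recognize_and_normalize(post):
    """Find lexicon phrases in a post and return their normalized terms."""
    text = post.lower()
    found = set()
    for phrase, concept in LEXICON.items():
        # Whole-phrase match so that substrings of longer words are ignored.
        if re.search(r"\b" + re.escape(phrase) + r"\b", text):
            found.add(concept)
    return sorted(found)


def bag_of_words(post):
    """Count lowercase word tokens as a simple text-derived feature vector."""
    return Counter(re.findall(r"[a-z']+", post.lower()))


# Made-up example post, for illustration only.
post = "Started the new med last week. Felt nauseous and threw up twice, now I can't sleep."
concepts = recognize_and_normalize(post)
features = bag_of_words(post)
print(concepts)
print(features.most_common(3))
```

The normalized terms and token counts could then feed the content analysis, hypothesis testing, or machine learning steps that the abstract mentions.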

Author information

Corresponding author

Correspondence to Lixia Yao.

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Ru, B., Yao, L. (2019). A Literature Review of Social Media-Based Data Mining for Health Outcomes Research. In: Bian, J., Guo, Y., He, Z., Hu, X. (eds) Social Web and Health Research. Springer, Cham. https://doi.org/10.1007/978-3-030-14714-3_1
