Identifying Service Gaps from Public Patient Opinions Through Text Mining

  • Min Tang
  • Yiping Liu
  • Zhiguo Li
  • Ying Liu
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 924)


Healthcare systems have become increasingly patient-centered, and unstructured, open-ended, patient-driven feedback has drawn significant attention from medical and healthcare organizations. Motivated by this, we applied several machine learning algorithms to the large volume of unstructured comments posted on public patient opinion sites. We first used sentiment analysis, trained on an already labelled set of comments, to automatically predict the concerns of patients. Then, with the help of clustering, we extracted the hot topics in a specific domain to reveal the service issues that concern patients most. Through experimental studies, we compared the performance of different algorithms and the influence of different parameters. Finally, referring to the survey and previous studies, we analyzed the results and drew conclusions.
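
The two-stage pipeline described above (supervised sentiment prediction followed by clustering to surface hot topics) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the algorithms used, so multinomial Naive Bayes and k-means stand in here, and the stopword list, sample comments, and all function names are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

STOPWORDS = {"a", "and", "in", "the", "too", "was", "were"}  # illustrative only

def tokenize(text):
    """Lowercase, keep alphabetic runs, drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def train_naive_bayes(labelled):
    """Count words per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for text, label in labelled:
        word_counts[label].update(tokenize(text))
        doc_counts[label] += 1
    vocab = sorted({w for counts in word_counts.values() for w in counts})
    return word_counts, doc_counts, vocab

def predict(model, text):
    """Return the label with the highest add-one-smoothed log probability."""
    word_counts, doc_counts, vocab = model
    known = set(vocab)
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)
        for tok in tokenize(text):
            if tok in known:  # ignore words never seen in training
                score += math.log((word_counts[label][tok] + 1)
                                  / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

def vectorize(texts, vocab):
    """Bag-of-words count vectors over a fixed vocabulary."""
    return [[tokenize(t).count(w) for w in vocab] for t in texts]

def kmeans(vectors, k, iterations=10):
    """Plain k-means; the first k points seed the centroids (deterministic)."""
    centroids = [list(v) for v in vectors[:k]]
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        for i, v in enumerate(vectors):
            dists = [sum((a - b) ** 2 for a, b in zip(v, c)) for c in centroids]
            assignment[i] = dists.index(min(dists))
        for j in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids

def top_terms(centroid, vocab, n=2):
    """Highest-weighted vocabulary terms of a centroid (ties broken alphabetically)."""
    ranked = sorted(zip(vocab, centroid), key=lambda p: (-p[1], p[0]))
    return [w for w, _ in ranked[:n]]

# Hypothetical sample data, invented for this sketch.
LABELLED = [
    ("staff were kind and helpful", "positive"),
    ("excellent care and friendly nurses", "positive"),
    ("long waiting time and rude staff", "negative"),
    ("parking was terrible and waiting was long", "negative"),
]
COMMENTS = [
    "waiting time too long",
    "car parking too expensive",
    "long waiting in clinic",
    "parking spaces too few",
]

model = train_naive_bayes(LABELLED)
print(predict(model, "rude staff and long waiting"))

vocab = sorted({w for c in COMMENTS for w in tokenize(c)})
assignment, centroids = kmeans(vectorize(COMMENTS, vocab), k=2)
for j, centroid in enumerate(centroids):
    print(j, top_terms(centroid, vocab))
```

In a setup like this, the number of clusters `k` and the choice of classifier are the kind of parameters and algorithms whose influence would be compared experimentally.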


Text mining; Sentiment analysis; Clustering analysis; Public health service



Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  1. School of Management, Chongqing Technology and Business University, Chongqing, China
  2. School of Engineering, Cardiff University, Cardiff, UK
