Learning Wellness Profiles of Users on Social Networks: The Case of Diabetes

Abstract

The increasing popularity of social media has encouraged health consumers to share, explore, and validate health and wellness information on social networks, which now provide a rich repository of Patient-Generated Wellness Data (PGWD). While data-driven healthcare has attracted considerable attention from academia and industry as a way to improve care delivery through personalization, little research has addressed harvesting and utilizing the PGWD available on social networks. This chapter focuses on wellness profiling of users: we demonstrate algorithms that harvest social media to extract individuals' wellness information and to construct latent user profiles. In particular, we study the wellness profiles of users with diabetes, with extensions to obesity and depression.

Notes

  1. http://www.pewinternet.org/2010/03/24/social-media-and-health/

  2. http://www.cdc.gov/chronicdisease/overview/

  3. https://www.un.org/en/ga/ncdmeeting2011/

  4. This is a real tweet from the dataset.

  5. Note that the numbers in Table 8.2 do not add up to 11,217 because our dataset is multi-label: some messages discuss more than one PWE.

  6. In this text, we use wellness features (e.g., blood glucose, hypertension) and wellness events (e.g., onset of asthma attack, hyperglycemia) interchangeably.

  7. We followed a bootstrapping approach similar to [68] to ensure the coverage and diversity of the patterns used; all extracted patterns were manually verified to ensure accuracy.

  8. In our dataset, there are three uncommon diabetes types: gestational diabetes, LADA (Type 1.5) diabetes, and diabetes insipidus.

  9. Due to user privacy concerns, some words or sentences may differ from the original version.

  10. We did not consider the first week of May and the last week of October because the data for those weeks were only partially crawled.

  11. http://instagram.com

  12. http://endomondo.com

References

  1. UN General Assembly. Political declaration of the high-level meeting of the General Assembly on the prevention and control of noncommunicable diseases. New York: UN; 2011.

  2. Abbar S, Mejova Y, Weber I. You tweet what you eat: studying food consumption through twitter. In: Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems: ACM; 2015. p. 3197–206.

  3. Akbari M, Chua T-S. Leveraging behavioral factorization and prior knowledge for community discovery and profiling. In: Proceedings of the Tenth ACM International Conference on Web Search and Data Mining: ACM; 2017. p. 71–9.

  4. Akbari M, Hu X, Nie L, Chua T-S. From tweets to wellness: wellness event detection from twitter streams. In: AAAI; 2016.

  5. Akbari M, Hu X, Nie L, Chua T-S. Towards organizing health knowledge on community-based health services. EURASIP J Bioinforma Syst Biol. 2016;2016(1):18.

  6. Akbari M, Hu X, Wang F, Chua T-S. Wellness representation of users in social media: towards joint modelling of heterogeneity and temporality. IEEE Trans Knowl Data Eng. 2017.

  7. Akbari M, Nie L, Chua T-S. aMM: towards adaptive ranking of multi-modal documents. Int J Multimed Inf Retr. 2015;4(4):233–45.

  8. Akbari M, Relia K, Elghafari A, Chunara R. From the user to the medium: neural profiling across web communities. In: ICWSM; 2018.

  9. Aronson AR. Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program. AMIA Symp. 2001:17–21.

  10. American Diabetes Association. Standards of medical care in diabetes. Diabetes Care. 2014;37(Suppl 1):S14–80.

  11. Attai DJ, Cowher MS, Al-Hamadani M, Schoger JM, Staley AC, Landercasper J. Twitter social media is an effective tool for breast cancer patient education and support: patient-reported outcomes by survey. J Med Internet Res. 2015;17(7):e188.

  12. Blei DM, Ng AY, Jordan MI. Latent Dirichlet allocation. J Mach Learn Res. 2003;3:993–1022.

  13. Carlson A, Gaffney S, Vasile F. Learning a named entity tagger from gazetteers with the partial perceptron. In: AAAI Spring Symposium: Learning by Reading and Learning to Read; 2009. p. 7–13.

  14. Che Z, Purushotham S, Khemani R, Liu Y. Distilling knowledge from deep networks with applications to healthcare domain. arXiv preprint arXiv:1512.03542. 2015.

  15. Chen X, Pan W, Kwok JT, Carbonell JG. Accelerated gradient method for multi-task sparse learning problem. In: ICDM '09: Proceedings of the 2009 Ninth IEEE International Conference on Data Mining. Washington, DC: IEEE Computer Society; 2009.

  16. Chen Y, Zhao J, Hu X, Zhang X, Li Z, Chua T-S. From interest to function: location estimation in social media. In: AAAI; 2013.

  17. Cohen S, Wills TA. Stress, social support, and the buffering hypothesis. Psychol Bull. 1985;98(2):310.

  18. De Choudhury M. You're happy, I'm happy: diffusion of mood expression on twitter. In: Proceedings of HCI Korea: Hanbit Media; 2014. p. 169–79.

  19. De Choudhury M, Gamon M, Counts S, Horvitz E. Predicting depression via social media. In: ICWSM; 2013.

  20. De Choudhury M, Morris MR, White RW. Seeking and sharing health information online: comparing search engines and social media. In: SIGCHI; 2014.

  21. Farseev A, Chua T-S. Tweet can be fit: integrating data from wearable sensors and multiple social networks for wellness profile learning. ACM Trans Inf Syst. 2017;35(4):42.

  22. Field AE, Coakley EH, Must A, Spadano JL, Laird N, Dietz WH, Rimm E, Colditz GA. Impact of overweight on the risk of developing common chronic diseases during a 10-year period. Arch Intern Med. 2001;161(13):1581–6.

  23. Gupta N, Singh S. Collectively embedding multi-relational data for predicting user preferences. arXiv preprint arXiv:1504.06165. 2015.

  24. Gupta S, Manning CD. SPIED: Stanford pattern-based information extraction and diagnostics. In: Proceedings of the Workshop on Interactive Language Learning, Visualization, and Interfaces. Baltimore, MD: ACL; 2014. p. 38–44.

  25. He X, Cai D, Niyogi P. Laplacian score for feature selection. In: NIPS; 2005.

  26. Hu FB. Globalization of diabetes: the role of diet, lifestyle, and genes. Diabetes Care. 2011;34(6):1249–57.

  27. Hu X, Liu H. Text analytics in social media. In: Mining text data. Springer; 2012.

  28. Hu X, Tang J, Liu H. Online social spammer detection. In: AAAI; 2014.

  29. Huang T, Elghafari A, Relia K, Chunara R. High-resolution temporal representations of alcohol and tobacco behaviors from social media data. In: Proceedings of the ACM on Human-Computer Interaction, 1(CSCW); 2017.

  30. Jalali A, Sanghavi S, Ruan C, Ravikumar PK. A dirty model for multi-task learning. In: NIPS; 2010.

  31. Jalali L, Jain R. Bringing deep causality to multimedia data streams. In: Proceedings of the 23rd ACM International Conference on Multimedia: ACM; 2015. p. 221–30.

  32. Jin X, Zhuang F, Pan SJ, Du C, Luo P, He Q. Heterogeneous multi-task semantic feature learning for classification. In: CIKM; 2015.

  33. Kim S, Xing EP. Statistical estimation of correlated genome associations to a quantitative trait network. PLoS Genet. 2009;5(8):e1000587.

  34. Koren Y, Bell R, Volinsky C. Matrix factorization techniques for recommender systems. Computer. 2009;(8):30–7.

  35. Kuh D, Shlomo YB. A life course approach to chronic disease epidemiology. Number 2. Oxford: Oxford University Press; 2004.

  36. Kumar A, Daume III H. Learning task grouping and overlap in multi-task learning. In: ICML; 2012.

  37. Lee DD, Seung HS. Algorithms for non-negative matrix factorization. In: NIPS; 2001.

  38. Lee K, Agrawal A, Choudhary A. Real-time disease surveillance using twitter data: demonstration on flu and cancer. In: Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining: ACM; 2013. p. 1474–7.

  39. Li J, Ritter A, Cardie C, Hovy E. Major life event extraction from twitter based on congratulations/condolences speech acts. In: EMNLP; 2014.

  40. Li Z, Yang Y, Liu J, Zhou X, Lu H. Unsupervised feature selection using nonnegative spectral analysis. In: AAAI; 2012.

  41. Lim SS, Vos T, Flaxman AD, et al. A comparative risk assessment of burden of disease and injury attributable to 67 risk factors and risk factor clusters in 21 regions, 1990–2010: a systematic analysis for the Global Burden of Disease Study 2010. Lancet. 2013;380(9859):2224–60.

  42. Lin H, Jia J, Guo Q, Xue Y, Li Q, Huang J, Cai L, Feng L. User-level psychological stress detection from social media using deep neural network. In: ACM MM; 2014.

  43. Lin H, Jia J, Qiu J, Zhang Y, Shen G, Xie L, Tang J, Feng L, Chua T-S. Detecting stress based on social interactions in social networks. IEEE Trans Knowl Data Eng. 2017;29(9):1820–33.

  44. Liu C, Wang F, Hu J, Xiong H. Temporal phenotyping from longitudinal electronic health records: a graph based framework. In: KDD; 2015.

  45. Liu J, Weitzman ER, Chunara R. Assessing behavioral stages from social media data. In: CSCW: Proceedings of the Conference on Computer-Supported Cooperative Work; 2017.

  46. Mintz M, Bills S, Snow R, Jurafsky D. Distant supervision for relation extraction without labeled data. In: ACL; 2009.

  47. Nesterov Y. Introductory lectures on convex optimization: a basic course. New York: Springer; 2004.

  48. Nie F, Huang H, Cai X, Ding CH. Efficient and robust feature selection via joint l2,1-norms minimization. In: NIPS; 2010.

  49. Nie L, Akbari M, Li T, Chua T-S. A joint local-global approach for medical terminology assignment. In: SIGIR; 2014.

  50. Nori N, Kashima H, Yamashita K, Ikai H, Imanaka Y. Simultaneous modeling of multiple diseases for mortality prediction in acute hospital care. In: KDD; 2015.

  51. Obozinski G, Taskar B, Jordan MI. Joint covariate selection and joint subspace selection for multiple classification problems. Stat Comput. 2010;20(2):231–52.

  52. World Health Organization. Global database on body mass index. 2011.

  53. Pan Y, Yao T, Mei T, Li H, Ngo C-W, Rui Y. Click-through-based cross-view learning for image search. In: SIGIR; 2014.

  54. Park K, Weber I, Cha M, Lee C. Persistent sharing of fitness app status on twitter. In: Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing: ACM; 2016. p. 184–94.

  55. Pastors JG, Warshaw H, Daly A, Franz M, Kulkarni K. The evidence for the effectiveness of medical nutrition therapy in diabetes management. Diabetes Care. 2002;25(3):608–13.

  56. Pennebaker JW, Francis ME, Booth RJ. Linguistic inquiry and word count: LIWC 2001. Mahwah, NJ: Lawrence Erlbaum Associates; 2001.

  57. Powers DM. Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation. J Mach Learn Technol. 2011;2(1):37–63.

  58. Relia K, Akbari M, Duncan D, Chunara R. Socio-spatial self-organizing maps: using social media to assess relevant geographies for exposure to social processes. In: Proceedings of the ACM on Human-Computer Interaction (CSCW); 2018.

  59. Ritter A, Etzioni O, Clark S, et al. Open domain event extraction from twitter. In: KDD; 2012.

  60. Robinson PN. Deep phenotyping for precision medicine. Hum Mutat. 2012;33(5):777–80.

  61. Ruvolo P, Eaton E. Online multi-task learning via sparse dictionary optimization. In: AAAI; 2014.

  62. Shelley KJ. Developing the American time use survey activity classification system. Monthly Lab Rev. 2005;128:3.

  63. Shen G, Jia J, Nie L, Feng F, Zhang C, Hu T, Chua T-S, Zhu W. Depression detection via harvesting social media: a multimodal dictionary learning solution. In: Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI-17); 2017. p. 3838–44.

  64. Sohn S, Clark C, Halgrim SR, Murphy SP, Chute CG, Liu H. MedXN: an open source medication extraction and normalization tool for clinical text. J Am Med Inform Assoc. 2014;21(5):858–65.

  65. Song X, Nie L, Zhang L, Akbari M, Chua T-S. Multiple social network learning and its application in volunteerism tendency prediction. In: SIGIR; 2015.

  66. Sun Z, Wang F, Hu J. Linkage: an approach for comprehensive risk prediction for care management. In: KDD; 2015.

  67. Teodoro R, Naaman M. Fitter with twitter: understanding personal health and fitness activity in social media. In: ICWSM; 2013.

  68. Thelen M, Riloff E. A bootstrapping method for learning semantic lexicons using extraction pattern contexts. In: EMNLP; 2002.

  69. Tibshirani R. Regression shrinkage and selection via the lasso. J R Stat Soc. 1996;58(1):267–88.

  70. Wang C, Raina R, Fong D, Zhou D, Han J, Badros G. Learning relevance from heterogeneous social network and its application in online targeting. In: SIGIR; 2011.

  71. Wang F, Lee N, Hu J, Sun J, Ebadollahi S, Laine AF. A framework for mining signatures from event sequences and its applications in healthcare data. IEEE Trans Pattern Anal Mach Intell. 2013;35(2):272–85.

  72. Wing C, Yang H. FitYou: integrating health profiles to real-time contextual suggestion. In: SIGIR; 2014.

  73. Xu L, Huang A, Chen J, Chen E. Exploiting task-feature co-clusters in multi-task learning. In: AAAI; 2015.

  74. Xu T, Sun J, Bi J. Longitudinal lasso: jointly learning features and temporal contingency for outcome prediction. In: CIKM; 2015.

  75. Zhao Z, Liu H. Spectral feature selection for supervised and unsupervised learning. In: ICML; 2007.

  76. Zhou D, Chen L, He Y. An unsupervised framework of exploring events on twitter: filtering, extraction and categorization. In: AAAI; 2015.

  77. Zhou J, Wang F, Hu J, Ye J. From micro to macro: data driven phenotyping by densification of longitudinal electronic medical records. In: KDD; 2014.

  78. Zhou L, Melton GB, Parsons S, Hripcsak G. A temporal constraint structure for extracting temporal information from clinical narrative. J Biomed Inform. 2006;39(4):424–39.

Author information

Corresponding author

Correspondence to Mohammad Akbari.

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Akbari, M., Hu, X., Chua, TS. (2019). Learning Wellness Profiles of Users on Social Networks: The Case of Diabetes. In: Bian, J., Guo, Y., He, Z., Hu, X. (eds) Social Web and Health Research. Springer, Cham. https://doi.org/10.1007/978-3-030-14714-3_8
